Analyzing data using Apache Spark™

Spark is the default mode when you start an analytics node in a packaged installation.
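As a sketch of what this looks like in practice (paths and service names may vary by platform and DSE version), a packaged installation enables Spark mode through `/etc/default/dse` before starting the service, while a tarball installation passes the `-k` flag directly:

```shell
# Packaged installation: mark the node as an analytics (Spark) node,
# then start the DSE service.
sudo sed -i 's/^SPARK_ENABLED=.*/SPARK_ENABLED=1/' /etc/default/dse
sudo service dse start

# Tarball installation: start the node in Spark mode directly.
dse cassandra -k
```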

About Spark

Information about Spark architecture and capabilities.

Using Spark with DataStax Enterprise

DataStax Enterprise integrates with Apache Spark to allow distributed analytic applications to run using database data.
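For example, the `dse spark` command opens an interactive Spark shell that is already configured to talk to the local cluster; the keyspace and table names below are placeholders:

```shell
# Launch the interactive Spark shell against the local DSE cluster.
dse spark

# Inside the shell, database tables can be queried with Spark SQL, e.g.:
#   spark.sql("SELECT * FROM my_keyspace.my_table LIMIT 10").show()
```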

Configuring Spark

Configuring Spark includes setting Spark properties for DataStax Enterprise and the database, enabling Spark apps, and setting permissions.
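As an illustration of the property side of this, per-application defaults can be set in `spark-defaults.conf` (the values below are placeholders, and the file's location under the Spark configuration directory depends on the installation type):

```
# spark-defaults.conf: standard Spark properties, read when an application starts.
spark.executor.memory     2g
spark.cores.max           4
spark.eventLog.enabled    true
```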

Using Spark modules with DataStax Enterprise

Spark Streaming, Spark SQL, and MLlib are modules that extend the capabilities of Spark.

Accessing DataStax Enterprise data from external Spark clusters

Information on accessing data in DataStax Enterprise clusters from external Spark clusters, a configuration known as Bring Your Own Spark (BYOS).
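A minimal sketch of the BYOS pattern, assuming the open-source spark-cassandra-connector and placeholder host, version, class, and jar names (DSE also ships its own BYOS jar bundling the required classes):

```shell
# Submit from an external Spark cluster, pointing the connector at the DSE nodes.
spark-submit \
  --packages com.datastax.spark:spark-cassandra-connector_2.12:3.4.1 \
  --conf spark.cassandra.connection.host=10.0.0.1 \
  --class com.example.MyAnalyticsJob \
  my-analytics-job.jar
```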

Using the Spark Jobserver

DataStax Enterprise includes the Spark Jobserver, a REST interface for submitting and managing Spark jobs.
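The Jobserver's REST API takes a jar upload followed by a job-start request; as a hedged sketch (the host, default port 8090, and the app and class names are placeholders):

```shell
# Upload the application jar under a name of your choosing.
curl --data-binary @my-job.jar http://localhost:8090/jars/my-app

# Start a job from a class in that jar; the response includes a job ID.
curl -d "" "http://localhost:8090/jobs?appName=my-app&classPath=com.example.MyJob"

# Poll the job's status and result by ID.
curl http://localhost:8090/jobs/<job-id>
```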


© 2024 DataStax
