Configuring Apache Spark™

Configuring Spark for DataStax Enterprise includes:

Configuring Spark nodes

Modify the security, performance, and logging settings for Spark nodes.

Automatic Spark Master election

Spark Master elections are automatically managed.

Configuring Spark logging options

Configure Spark logging options.
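
As an application-side illustration only, the Scala sketch below overrides a job's own log verbosity at runtime with SparkContext.setLogLevel; node-level logging is controlled by the server's logging configuration files, which this sketch does not touch, and the application name is a placeholder.

    import org.apache.spark.sql.SparkSession

    object LogLevelSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("logging-sketch") // placeholder name
          .getOrCreate()

        // Override the log level for this SparkContext; valid levels are
        // ALL, DEBUG, ERROR, FATAL, INFO, OFF, TRACE, WARN.
        spark.sparkContext.setLogLevel("WARN")

        spark.range(10).count()
        spark.stop()
      }
    }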

Running Spark processes as separate users

Spark processes can be configured to run as separate operating system users.

Configuring the Spark history server

Load the event logs from Spark jobs that were run with event logging enabled.
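
As a hedged sketch of the prerequisite, the Scala example below turns on event logging for a job so that the history server can later load its logs; the log directory is a placeholder and should match the directory the history server is configured to read (spark.history.fs.logDirectory).

    import org.apache.spark.sql.SparkSession

    object EventLogSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("event-log-sketch")
          // Write an event log that the history server can replay later.
          .config("spark.eventLog.enabled", "true")
          .config("spark.eventLog.dir", "file:///tmp/spark-events") // placeholder path
          .getOrCreate()

        spark.range(1000L).selectExpr("sum(id)").show()
        spark.stop()
      }
    }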

Enabling Spark apps in cluster mode when authentication is enabled

Configuration steps to enable Spark applications in cluster mode when JAR files are on the Cassandra file system (CFS) and authentication is enabled.

Setting Spark Cassandra Connector-specific properties

Use the Spark Cassandra Connector options to configure DataStax Enterprise Spark.
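
A minimal Scala sketch of passing connector properties when building a session follows; the host, credentials, keyspace, and table names are placeholders, the property names follow the open-source Spark Cassandra Connector, and in a DSE deployment the connection details are typically filled in automatically when a job is launched through DSE.

    import org.apache.spark.sql.SparkSession

    object ConnectorPropertiesSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("connector-properties-sketch")
          .config("spark.cassandra.connection.host", "10.0.0.1")     // placeholder contact point
          .config("spark.cassandra.auth.username", "analytics_user") // placeholder credentials
          .config("spark.cassandra.auth.password", "secret")
          .config("spark.cassandra.input.split.sizeInMB", "128")     // size of Spark partitions read from the table
          .getOrCreate()

        // Read a table through the connector's DataFrame source.
        val df = spark.read
          .format("org.apache.spark.sql.cassandra")
          .options(Map("keyspace" -> "my_keyspace", "table" -> "my_table")) // placeholder names
          .load()

        df.show(10)
        spark.stop()
      }
    }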

Creating a DSE Analytics Solo datacenter

DSE Analytics Solo datacenters store no database or search data and are used strictly for analytics processing, in conjunction with one or more datacenters that do contain database data.

Spark JVMs and memory management

Spark jobs running on DataStax Enterprise are divided among several different JVM processes.
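
To make the per-JVM settings concrete, here is a hedged Scala sketch naming the standard Spark properties that size the executor JVMs; the values are placeholders, and spark.driver.memory normally has to be supplied before the driver JVM starts (for example in spark-defaults.conf or on the submit command line), so it appears only in a comment.

    import org.apache.spark.sql.SparkSession

    object MemorySettingsSketch {
      def main(args: Array[String]): Unit = {
        // Placeholder sizes; tune them to the node's available RAM.
        // spark.driver.memory sizes the driver JVM heap but must usually be set
        // before the driver starts, so it is not set programmatically here.
        val spark = SparkSession.builder()
          .appName("memory-settings-sketch")
          .config("spark.executor.memory", "4g") // heap for each executor JVM
          .config("spark.executor.cores", "2")   // concurrent task slots per executor
          .getOrCreate()

        spark.range(1000000L).selectExpr("avg(id)").show()
        spark.stop()
      }
    }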
