Monitoring Spark with Spark Performance Objects

The Performance Service can collect data associated with the Spark cluster and Spark applications and save it to tables. This allows you to monitor the metrics of DSE Analytics applications to tune performance and identify bottlenecks.

If authorization is enabled in your cluster, you must grant the user who runs the Spark application SELECT permission on the dse_system.spark_metrics_config table and MODIFY permission on the dse_perf.spark_apps_snapshot table.
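For example, the grants can be issued in CQL. A minimal sketch, where app_runner is a placeholder for the role that actually runs the application:

    -- app_runner is a hypothetical role name; substitute your own.
    GRANT SELECT ON dse_system.spark_metrics_config TO app_runner;
    GRANT MODIFY ON dse_perf.spark_apps_snapshot TO app_runner;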

Monitoring Spark cluster information

The Performance Service stores information about DSE Analytics clusters in the dse_perf.spark_cluster_snapshot table. The cluster performance objects record the available and used resources in the cluster (cores, memory, and workers), as well as overall information about all registered Spark applications, drivers, and executors, such as the number of applications, the state of each application, and the host on which each application runs.

To enable collecting Spark cluster information, configure the options in the spark_cluster_info_options section of dse.yaml, as shown in the example after the following table.

Spark cluster info options

Option           Default  Description
enabled          false    Enables or disables Spark cluster information collection.
refresh_rate_ms  10000    The interval in milliseconds at which the data is collected and stored.
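For example, a minimal dse.yaml sketch that enables cluster information collection at the default refresh rate; verify the exact section layout against the dse.yaml shipped with your DSE version:

    # Sketch of the spark_cluster_info_options section of dse.yaml.
    spark_cluster_info_options:
        enabled: true          # collection is off (false) by default
        refresh_rate_ms: 10000 # snapshot the cluster every 10 seconds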

The dse_perf.spark_cluster_snapshot table has the following columns:

Column                 Description
name                   The cluster name.
active_apps            The number of applications active in the cluster.
active_drivers         The number of active drivers in the cluster.
completed_apps         The number of completed applications in the cluster.
completed_drivers      The number of completed drivers in the cluster.
executors              The number of Spark executors in the cluster.
master_address         The host name and port number of the Spark Master node.
master_recovery_state  The state of the master node.
nodes                  The number of nodes in the cluster.
total_cores            The total number of cores available on all the nodes in the cluster.
total_memory_mb        The total amount of memory in megabytes (MB) available to the cluster.
used_cores             The total number of cores currently used by the cluster.
used_memory_mb         The total amount of memory in megabytes (MB) used by the cluster.
workers                The total number of Spark Workers in the cluster.
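Once collection is enabled, the snapshot can be inspected with CQL. The column selection below is illustrative; any of the columns above can be queried:

    SELECT name, workers, used_cores, total_cores, used_memory_mb, total_memory_mb
    FROM dse_perf.spark_cluster_snapshot;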

Monitoring Spark application information

Spark application performance information is stored per application in the dse_perf.spark_apps_snapshot table and is updated whenever a task finishes.

To enable collecting Spark application information, configure the options in the spark_application_info_options section of dse.yaml, as shown in the combined example after the executor options table below.

Spark application information options

Option           Default  Description
enabled          false    Enables or disables collecting Spark application information.
refresh_rate_ms  10000    The interval in milliseconds at which the data is collected and stored.

The driver subsection of spark_application_info_options controls the metrics that are collected by the Spark Driver.

Spark Driver information options

Option           Default  Description
sink             false    Enables or disables collecting metrics from the Spark Driver.
connectorSource  false    Enables or disables collecting Spark Cassandra Connector metrics.
jvmSource        false    Enables or disables collecting JVM heap and garbage collection metrics from the Spark Driver.
stateSource      false    Enables or disables collecting application state metrics.

The executor subsection of spark_application_info_options controls the metrics collected by the Spark executors.

Spark executor information options

Option           Default  Description
sink             false    Enables or disables collecting Spark executor metrics.
connectorSource  false    Enables or disables collecting Spark Cassandra Connector metrics from the Spark executors.
jvmSource        false    Enables or disables collecting JVM heap and garbage collection metrics from the Spark executors.
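Putting the three preceding tables together, a dse.yaml sketch that enables application, driver, and executor metrics might look like the following; the nesting reflects the subsections described above, but verify it against the dse.yaml shipped with your DSE version:

    spark_application_info_options:
        enabled: true
        refresh_rate_ms: 10000
        driver:
            sink: true
            connectorSource: true
            jvmSource: true
            stateSource: true
        executor:
            sink: true
            connectorSource: true
            jvmSource: true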

The dse_perf.spark_apps_snapshot table has the following columns:

application_id
component_id
metric_id
count
metric_type
rate_15_min
rate_1_min
rate_5_min
rate_mean
snapshot_75th_percentile
snapshot_95th_percentile
snapshot_98th_percentile
snapshot_999th_percentile
snapshot_99th_percentile
snapshot_max
snapshot_mean
snapshot_median
snapshot_min
snapshot_stddev
value
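For example, assuming application_id is the partition key of this table, the metrics recorded for a single application can be listed with a query like the following; the application ID shown is a placeholder:

    SELECT metric_id, metric_type, count, rate_1_min, snapshot_99th_percentile
    FROM dse_perf.spark_apps_snapshot
    WHERE application_id = 'app-20240101000000-0000';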
