Alerts

Each alert contains information about the captured event that triggered it.

Optionally, you can configure Mission Control to send alerts only for selected severity levels or for specific clusters.

Embedded alert plugins

Mission Control supports routing alerts to Slack channels.

Default alerts

A severity label indicates how critical an alert is.

It takes one of three values:

  • critical - requires immediate action

  • warning - requires eventual but not urgent action

  • info - marks something out of the ordinary that doesn’t necessarily require action
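
The severity values above, combined with optional filtering by severity level or cluster, can be illustrated with a minimal sketch. This is not Mission Control's implementation or configuration format: the `Alert` record, its field names, and the `should_send` function are hypothetical and exist only to show the filtering logic.

```python
from dataclasses import dataclass
from typing import Optional, Set

# Severity values taken from the list above.
SEVERITIES = ("critical", "warning", "info")


@dataclass(frozen=True)
class Alert:
    """Hypothetical alert record; the field names are illustrative only."""
    description: str
    severity: str
    cluster: str


def should_send(alert: Alert,
                severities: Optional[Set[str]] = None,
                clusters: Optional[Set[str]] = None) -> bool:
    """Return True if the alert passes the optional filters.

    A filter set to None means "no restriction" for that dimension.
    """
    if alert.severity not in SEVERITIES:
        raise ValueError(f"unknown severity: {alert.severity!r}")
    if severities is not None and alert.severity not in severities:
        return False
    if clusters is not None and alert.cluster not in clusters:
        return False
    return True
```

For example, `should_send(alert, severities={"critical"})` would pass only critical alerts, while omitting both filters passes every alert.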

Each entry below lists the alert description followed by its severity, database type [1], details, and source metric.

Node down for more than 10 minutes
  Severity: Sev 2 - Warning
  Database type: Both
  Source metric: org_apache_cassandra_metrics_thread_pools_completed_tasks.

Node down for 30 minutes
  Severity: Sev 1 - Error
  Database type: Both
  Source metric: org_apache_cassandra_metrics_thread_pools_completed_tasks.

Nodes down in different racks of same datacenter
  Database type: Both
  Details: Two nodes down across rack boundaries can lead to LOCAL_QUORUM CL errors in applications. For example, with a replication factor of 3, LOCAL_QUORUM requires 2 replicas, so two down nodes in different racks can leave some data with only one available replica.
  Source metric: org_apache_cassandra_metrics_thread_pools_completed_tasks.

CPU above 80% for 5 minutes
  Database type: Both
  Details: An error that, if triggered too often, indicates low disk space and that the cluster should be scaled.
  Source metric: host_cpu_seconds_total.

Used disk space above 75% for one minute
  Severity: Sev 1 - Error
  Database type: Both
  Details: A signal to expand the cluster before it gets into a state where cleanups are impossible due to insufficient disk space.
  Source metric: host_filesystem_used_ratio.

Used disk space above 50% for one minute
  Severity: Sev 2 - Warning
  Database type: Both
  Details: A signal to expand the cluster before it gets into a state where cleanups are impossible due to insufficient disk space.
  Source metric: host_filesystem_used_ratio.

Load average above 20 for 5 minutes
  Severity: Sev 2 - Warning
  Database type: Both
  Details: A good indicator of performance issues, the root cause of which can vary.
  Source metric: host_load5.

Load average above 32 for 5 minutes
  Severity: Sev 1 - Error
  Database type: Both
  Details: A good indicator of performance issues, the root cause of which can vary.
  Source metric: host_load5.

Dropped messages over 5 minutes
  Severity: Sev 1 - Error for 10,000 or more dropped messages; Sev 2 - Warning for fewer than 10,000
  Database type: Both
  Details: Thread pools cannot keep up with the pace of queries entering and being processed within the cluster. This leads to errors within the application stack and potentially incorrect replicas.
  Source metric: org_apache_cassandra_metrics_dropped_message_dropped_total.


1. Apache Cassandra® or DataStax Enterprise (DSE) database.
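
The disk space, load average, and dropped message thresholds listed above can be read as simple rules. The following is a minimal sketch of that logic, not the actual alert rules shipped with Mission Control. It assumes that the samples passed in cover the whole evaluation window (one minute for disk space, 5 minutes for load average), that host_filesystem_used_ratio is a value between 0 and 1, and that a dropped message warning requires at least one dropped message. All function names are hypothetical.

```python
from typing import Optional, Sequence

SEV1 = "Sev 1 - Error"
SEV2 = "Sev 2 - Warning"


def sustained_above(samples: Sequence[float], threshold: float) -> bool:
    """True if the window is non-empty and every sample exceeds the threshold."""
    return len(samples) > 0 and all(s > threshold for s in samples)


def disk_severity(used_ratio_samples: Sequence[float]) -> Optional[str]:
    """Classify disk usage samples (assumed 0..1 ratios over a one-minute window)."""
    if sustained_above(used_ratio_samples, 0.75):
        return SEV1
    if sustained_above(used_ratio_samples, 0.50):
        return SEV2
    return None


def load_severity(load5_samples: Sequence[float]) -> Optional[str]:
    """Classify load average samples (assumed to span a 5-minute window)."""
    if sustained_above(load5_samples, 32):
        return SEV1
    if sustained_above(load5_samples, 20):
        return SEV2
    return None


def dropped_messages_severity(dropped_in_window: int) -> Optional[str]:
    """Classify the number of messages dropped over 5 minutes.

    Assumption: zero dropped messages raises no alert.
    """
    if dropped_in_window >= 10_000:
        return SEV1
    if dropped_in_window > 0:
        return SEV2
    return None
```

Checking the higher threshold first means a value that exceeds both thresholds is reported once, at the more severe level.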


© 2024 DataStax