Best Practice Rules Reference

Reference of the rules available in the Best Practice Service, grouped by Advisor section, with the sections in alphabetical order.

Backup Advisor


Rule	Description/Recommendation	Importance	Scope	Interval (default)	Alert Level
Auto Snapshot not enabled	Checks to make sure auto snapshot isn't turned off in production.	High	Node	Daily	Info
Auto Snapshot not enabled	Auto snapshot is not enabled and can lead to data loss on truncation or drop. Please update your cassandra.yaml to enable `auto_snapshot` and prevent data loss. Tip: Use LCM Config Profiles to enable auto_snapshot in the Snapshots section of cassandra.yaml. The auto_snapshot setting is enabled by default in LCM config profiles.	High	Node	Daily	Info
Commit Log Archiving Setting Enabled Consistency Note: This rule is available in OpsCenter versions 6.1 and later.	Commit Log Archiving has been turned off because the setting is inconsistent across the nodes in the cluster.	High	Node, Cluster	Hourly	Alert
Commit Log Archiving Setting Enabled Consistency	Commit Log Archiving is not enabled for all nodes within the cluster, which can result in data loss when performing a Point-in-Time restore. Turn Commit Log Archiving on again so that the enabled setting is consistent across all nodes in the cluster.	High	Node, Cluster	Hourly	Alert

Config Advisor


Rule	Description/Recommendation	Importance	Scope	Interval (default)	Alert Level
NodeSync Not Running Note: This rule is available in OpsCenter versions 6.5 and later.	The NodeSync service is intended to be running on every node in the cluster. If any nodes are not running NodeSync, the data segments for which those nodes are replicas will not be validated and synchronized.	High	Node	Daily	Alert
NodeSync Not Running	Ensure NodeSync is running on every node. The NodeSync service is enabled by default; if it has been disabled, enable it again manually using `nodetool nodesyncservice enable`. See Enabling keyspaces and tables for monitoring NodeSync in OpsCenter.	High	Node	Daily	Alert
Repair service not enabled	Verifies that the repair service is enabled.	High	Cluster	Daily	Info
Repair service not enabled	Running regular repair ensures data consistency across a cluster. Enable the repair service.	High	Cluster	Daily	Info
Repair service not configured correctly	Verifies that the repair service is configured correctly for your cluster. For more information, see basic, advanced, and expert repair configuration.	High	Cluster	Daily	Info
Repair service not configured correctly	It is recommended to enable the OpsCenter repair service to run within the smallest `gc_grace` window configured on your cluster.	High	Cluster	Daily	Info
Security not enabled for DataStax agents	Checks that OpsCenter authentication is enabled in conjunction with SSL between daemon and agent.	High	Cluster	Daily	Alert
Security not enabled for DataStax agents	Please enable SSL for communicating with agents.	High	Cluster	Daily	Alert
Swap space is enabled	Checks that you do not have swap space enabled on any node. Swap space should not be used in a production environment.	Medium	Node	Daily	Alert
Swap space is enabled	Please disable swap space.	Medium	Node	Daily	Alert
Seed node configuration	In each DC, there should be at least two seed nodes present, if there are at least two nodes present in the DC. IPs should be used rather than hostnames. All nodes should have the same seed list.	Low	Node, Cluster	Daily	Alert
Seed node configuration	To correct this, please use the same seed list of IPs on all nodes. Tip: If using LCM, adjust the seed nodes in the appropriate LCM Config Profiles.	Low	Node, Cluster	Daily	Alert
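
The three conditions in the Seed node configuration rule can be sketched as a small check. This is an illustration only, not the OpsCenter implementation; the function names and the input shape (a mapping from node address to its datacenter and configured seed list) are assumptions made for the example.

```python
import ipaddress
from collections import Counter


def is_ip(value):
    """Return True if value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def check_seed_config(nodes):
    """Return a list of problems found in the seed configuration.

    nodes maps a node address to {"dc": <name>, "seeds": [<seed>, ...]}.
    This input shape is hypothetical and chosen for illustration.
    """
    problems = []

    # All nodes should have the same seed list.
    seed_lists = {tuple(sorted(info["seeds"])) for info in nodes.values()}
    if len(seed_lists) > 1:
        problems.append("seed lists differ between nodes")

    # Seeds should be IP addresses rather than hostnames.
    all_seeds = {seed for info in nodes.values() for seed in info["seeds"]}
    for seed in sorted(all_seeds):
        if not is_ip(seed):
            problems.append(f"seed {seed} is not an IP address")

    # Each DC with at least two nodes should contain at least two seeds.
    nodes_per_dc = Counter(info["dc"] for info in nodes.values())
    seeds_per_dc = Counter(
        info["dc"] for addr, info in nodes.items() if addr in all_seeds
    )
    for dc, count in sorted(nodes_per_dc.items()):
        if count >= 2 and seeds_per_dc[dc] < 2:
            problems.append(f"datacenter {dc} has fewer than two seed nodes")

    return problems
```

An empty result means the configuration satisfies all three conditions. A hostname in the seed list is reported as a non-IP seed and is also not counted toward the two seeds expected in its datacenter.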

Network Advisor


Rule	Description/Recommendation	Importance	Scope	Interval (default)	Alert Level
Different Listen and RPC Addresses	When a node has multiple network interfaces, checks that Cassandra is configured to use separate networks for the listen address and the RPC address. Note: When the `listen_address` field in the cassandra.yaml file is left blank, OpsCenter agents default to the same listen address as DSE in OpsCenter versions 6.1.2 and later.	Medium	Node	Daily	Info
Different Listen and RPC Addresses	Multiple networks have been detected but you are using the same network for client and internal cluster communication.	Medium	Node	Daily	Info

OpsCenter Config Advisor


Rule	Description/Recommendation	Importance	Scope	Interval (default)	Alert Level
OpsCenter Failover Enabled	DataStax recommends configuring OpsCenter failover for high availability.	Low	OpsC	Daily	Alert
OpsCenter Failover Enabled	There is no backup OpsCenter configured. Please enable failover for OpsCenter.	Low	OpsC	Daily	Alert

OS Advisor


Rule	Description/Recommendation	Importance	Scope	Interval (default)	Alert Level
Clocks in cluster out of sync	Checks that clocks across the cluster are in sync within a 2 second tolerance.	High	Node, Cluster	Daily	Alert
Clocks in cluster out of sync	The total drift across cluster exceeds the tolerance of 2 seconds; please sync clocks on your nodes. Warning: Clock drift can cause issues when LCM attempts to generate SSL certificates. Keeping clocks synchronized is critical to ensure accurate timestamps for database operations and logging.	High	Node, Cluster	Daily	Alert
Cassandra-user and agent-user match	Checks that cassandra and agent are run as the same user.	High	Node	Daily	Alert
Cassandra-user and agent-user match	Cassandra and agent are not run as the same user. Please ensure that Cassandra and agent are run as the same user.	High	Node	Daily	Alert
Clocks in UTC	Checks that clocks across the nodes are in Coordinated Universal Time (UTC).	Low	Node	Daily	Alert
Clocks in UTC	Not all nodes are in Coordinated Universal Time (UTC). Please ensure that all nodes are in UTC.	Low	Node	Daily	Alert
Require Oracle Java	Checks to make sure that Oracle Java is being used on the node.	Medium	Node	Daily	Alert
Require Oracle Java	An unsupported JDK is in use on the node. The Oracle/Sun HotSpot JDK is the preferred JDK and is well tested with DataStax Enterprise. Switch to the Oracle HotSpot JDK if you are currently using OpenJDK (the default Java environment on many Linux distributions). Tip: Use LCM Config Profiles to manage Java installations.	Medium	Node	Daily	Alert
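
The Clocks in cluster out of sync rule compares total drift across the cluster against a 2 second tolerance. The following is a minimal sketch of that comparison, assuming total drift means the spread between the fastest and slowest clock; the function names and the input (a mapping from node name to a clock reading in seconds) are hypothetical, and OpsCenter may measure drift differently.

```python
TOLERANCE_SECONDS = 2.0  # tolerance stated by the rule


def total_drift(clock_readings):
    """Return the spread between the fastest and slowest clock, in seconds.

    clock_readings maps a node name to its clock reading in seconds,
    all sampled at (approximately) the same instant.
    """
    values = list(clock_readings.values())
    return max(values) - min(values)


def clocks_in_sync(clock_readings, tolerance=TOLERANCE_SECONDS):
    """Return True when the total drift is within the tolerance."""
    return total_drift(clock_readings) <= tolerance
```

Because the check uses the spread across all nodes, two nodes that are each only slightly off from a reference clock can still fail the rule if they are off in opposite directions.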

Performance Advisor

Rules for node read and write performance. The Performance Advisor is not to be confused with the Performance Service.

Tip: Use LCM Config Profiles to adjust the request timeout settings in cassandra.yaml, and then run a configure job.


Rule	Description/Recommendation	Importance	Scope	Interval (default)	Alert Level
Read request timeout not optimal	Checks that the read request timeout on your nodes is not set above recommended values.	Medium	Node	Daily	Alert
Read request timeout not optimal	Significantly increasing the read request timeout on your nodes is not recommended. Please update cassandra.yaml on your nodes and lower the value of read_request_timeout_in_ms. Tip: Set the value in the Timeouts pane of an LCM Config Profile and run a configure job.	Medium	Node	Daily	Alert
Write request timeout not optimal	Checks that the write request timeout on your nodes is not set above recommended values.	Medium	Node	Daily	Alert
Write request timeout not optimal	Significantly increasing the write request timeout on your nodes is not recommended. Please update cassandra.yaml on your nodes and lower the value of write_request_timeout_in_ms. Tip: Set the value in the Timeouts pane of an LCM Config Profile and run a configure job.	Medium	Node	Daily	Alert
Range request timeout not optimal	Checks that the range request timeout on your nodes is not set above recommended values.	Medium	Node	Daily	Alert
Range request timeout not optimal	Significantly increasing the range request timeout on your nodes is not recommended. Please update cassandra.yaml on your nodes and lower the value of range_request_timeout_in_ms. Tip: Set the value in the Timeouts pane of an LCM Config Profile and run a configure job.	Medium	Node	Daily	Alert

Performance Service - Slow Queries Advisor

For more information, see Slow Queries in the Performance Service.


Rule	Description/Recommendation	Importance	Scope	Interval (default)	Alert Level
Use prepared statements	Prepared statements reduce the workload on the coordinator by removing the overhead of parsing the query.	Medium	Cluster	Hourly	Info
Use prepared statements	Use prepared statements for your queries.	Medium	Cluster	Hourly	Info
Avoid ALLOW FILTERING	Checks that ALLOW FILTERING is not used in queries.	Medium	Cluster	Hourly	Info
Avoid ALLOW FILTERING	ALLOW FILTERING causes a query to scan all data within a token range, which might be desired with analytic workloads but is not recommended for non-analytic workloads. ALLOW FILTERING can cause long running queries and consume excessive system resources. If using ALLOW FILTERING outside of an analytics workload, please consider a new data model based on the query pattern instead.	Medium	Cluster	Hourly	Info
Avoid using large batches	Using large batches seems like an optimization but doing so puts extra load on the coordinator, which can cause hotspots in the cluster. Queries run faster after breaking large batches into individual queries and distributing them to different nodes.	Medium	Cluster	Hourly	Info
Avoid using large batches	Break the batches into individual queries and distribute them to different nodes.	Medium	Cluster	Hourly	Info
Use counter instead of count	A count(*) query can be expensive, even with smaller limits.	Medium	Cluster	Hourly	Info
Use counter instead of count	Replace the logic with a counter you maintain.	Medium	Cluster	Hourly	Info
Minimize keys in IN clause	Huge IN clauses give the impression of a singular query but the clauses actually execute as multiple queries.	Medium	Cluster	Hourly	Info
Minimize keys in IN clause	Make individual async queries distributed amongst more coordinators.	Medium	Cluster	Hourly	Info
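
Some of the query patterns in this table can be spotted with simple text matching. The sketch below flags three of them in a CQL string. It is a rough illustration, not how the Performance Service detects slow queries: the function name is hypothetical, the threshold of 10 keys for an IN clause is an arbitrary assumption (the rule states no number), and plain regular expressions do not account for text inside string literals.

```python
import re

# Matches "IN ( ... )" and captures the comma-separated contents.
IN_CLAUSE = re.compile(r"\bIN\s*\(([^)]*)\)", re.IGNORECASE)
ALLOW_FILTERING = re.compile(r"\bALLOW\s+FILTERING\b", re.IGNORECASE)
COUNT_STAR = re.compile(r"\bcount\s*\(\s*\*\s*\)", re.IGNORECASE)

MAX_IN_KEYS = 10  # hypothetical threshold; the rule does not state one


def flag_query(cql, max_in_keys=MAX_IN_KEYS):
    """Return the names of the rules that a CQL string appears to violate."""
    findings = []
    if ALLOW_FILTERING.search(cql):
        findings.append("Avoid ALLOW FILTERING")
    if COUNT_STAR.search(cql):
        findings.append("Use counter instead of count")
    for match in IN_CLAUSE.finditer(cql):
        keys = [key for key in match.group(1).split(",") if key.strip()]
        if len(keys) > max_in_keys:
            findings.append("Minimize keys in IN clause")
            break  # report the rule once per query
    return findings
```

A helper like this could be run over application query templates during review, as a complement to the advisor's findings on live traffic.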

Performance Service - Table Metrics Advisor

For more information, see Table Metrics in the Performance Service.


Rule	Description/Recommendation	Importance	Scope	Interval (default)	Alert Level
Wide partitions	Checks for excessively wide partitions. Excessively wide partitions have a negative impact on performance and are not recommended. A partition is considered to be wide when the size is greater than 100 MB.	Low	Node, Cluster	Hourly	Alert
Wide partitions	Excessively wide partitions have a negative impact on performance and are not recommended. Consider remodeling your data to break up wide partitions.	Low	Node, Cluster	Hourly	Alert
Secondary indexes cardinality	Checks for secondary indexes with too many distinct values.	Low	Node, Cluster	Hourly	Alert
Secondary indexes cardinality	High-cardinality secondary indexes can have a negative impact on system performance. Consider denormalizing the indexed data.	Low	Node, Cluster	Hourly	Alert
Tombstone count	Number of tombstones processed during reads.	Low	Node, Cluster	Hourly	Alert
Tombstone count	Too many tombstones can cause a degradation of performance. This can even lead to query failures.	Low	Node, Cluster	Hourly	Alert
Compaction Strategy	The compaction strategy you use should be based on your data and environment. This Best Practice rule is set to run so that you are aware of the importance of choosing a compaction strategy. If you have already chosen the correct compaction strategy based on your environment, please disable this rule if you do not want to see a reminder about compaction strategy again.	Low	Cluster	Hourly	Alert
Compaction Strategy	Choose the compaction strategy that best fits your data and environment. See Compaction strategies.	Low	Cluster	Hourly	Alert

Performance Service - Thread Pools Advisor

For more information, see Thread Pool Statistics in the Performance Service.


Rule	Description/Recommendation	Importance	Scope	Interval (default)	Alert Level
Read Stage	Number of pending reads.	Low	Node	Hourly	Alert
Read Stage	Too many pending reads, which could be related to disk problems, poor tuning, or cluster overload. Consider adding new nodes, tuning the system, and revisiting your data model. If not CPU or IO bound, try increasing `concurrent_reads`.	Low	Node	Hourly	Alert
Mutation Stage	Number of pending mutations.	Low	Node	Hourly	Alert
Mutation Stage	Too many pending mutations, which could be related to disk problems, poor tuning, or cluster overload. Please consider adding new nodes, tuning the system, and revisiting your data model. If not CPU or IO bound, try increasing `concurrent_writes`.	Low	Node	Hourly	Alert
ReplicateOnWriteStage Stress	Be careful when using CL.ONE counter increments because each one kicks off an async task, which involves a read, to run after the increment is completed. Too many tasks in this pool will begin to block writes.	Medium	Node	Hourly	Info
ReplicateOnWriteStage Stress	Reduce the use of CL.ONE counter increments or upgrade to Cassandra 2.1 or higher.	Medium	Node	Hourly	Info

Replication Advisor


Rule	Description/Recommendation	Importance	Scope	Interval (default)	Alert Level
Replication factor out of bounds	Checks that your cluster does not have a replication factor higher than it can support.	Info	Cluster	Daily	Info
Replication factor out of bounds	Lists keyspaces that have a total RF higher than the number of nodes. Please update the replication factor for the appropriate keyspaces, or add additional nodes to your cluster.	Info	Cluster	Daily	Info
SimpleSnitch usage found	Checks to make sure SimpleSnitch isn't used in production.	Medium	Node	Daily	Info
SimpleSnitch usage found	SimpleSnitch is not recommended for production clusters because it does not recognize datacenter or rack information. Please update the snitch to a topology-enabled snitch.	Medium	Node	Daily	Info
SimpleStrategy keyspace usage found	Checks that you are not using SimpleStrategy for any keyspaces in a multi-datacenter environment.	Medium	Cluster	Daily	Alert
SimpleStrategy keyspace usage found	Please update the replication strategies of the relevant keyspace(s) to use NetworkTopologyStrategy.	Medium	Cluster	Daily	Alert
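
The Replication factor out of bounds rule lists keyspaces whose total RF is higher than the number of nodes. A minimal sketch of that comparison follows; the function name and the input shape (keyspace name mapped to per-datacenter replication factors) are assumptions for illustration, not OpsCenter's own data model.

```python
def keyspaces_over_replicated(keyspaces, node_count):
    """Return the keyspaces whose total RF exceeds the number of nodes.

    keyspaces maps a keyspace name to {datacenter: replication_factor}.
    The total RF of a keyspace is the sum over its datacenters.
    """
    return sorted(
        name
        for name, rf_by_dc in keyspaces.items()
        if sum(rf_by_dc.values()) > node_count
    )
```

For example, a keyspace with RF 3 in each of two datacenters has a total RF of 6 and would be listed on a five-node cluster.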

Search Advisor

Advice for Solr search nodes. For more information, see DSE Search.


Rule	Description/Recommendation	Importance	Scope	Interval (default)	Alert Level
Vnodes enabled on Search nodes	Checks that there are either 16 or 32 vnodes on DataStax Enterprise search nodes.	High	Node	Daily	Alert
Vnodes enabled on Search nodes	Replace the current search nodes that have vnodes enabled with nodes with the correct number of vnodes.	High	Node	Daily	Alert
Search nodes enabled with bad autocommit	Checks to see if a running Solr node has autocommit within 5-10 seconds.	Medium	Cluster	Daily	Alert
Search nodes enabled with bad autocommit	Please modify your autocommit threshold to within 5-10 seconds.	Medium	Cluster	Daily	Alert
Search nodes enabled with query result cache	Checks to see if a running Solr node has query result cache disabled.	Medium	Cluster	Daily	Alert
Search nodes enabled with query result cache	Please modify the query section of your Solr config to disable the queryResultCache.	Medium	Cluster	Daily	Alert
Search nodes with bad filter cache size	Checks to see if filter cache size is optimized for a running Solr node.	Medium	Cluster	Daily	Alert
Search nodes with bad filter cache size	Please modify your filter cache `size` attribute to 128 if using solr.LRUCache. Otherwise, if using solr.search.SolrFilterCache, modify the `highWaterMarkMB` attribute to 256.	Medium	Cluster	Daily	Alert
Search nodes enabled with row cache	Checks to see if a Solr node has row cache enabled.	Medium	Node	Daily	Alert
Search nodes enabled with row cache	For optimizing memory use for DSE search with Solr, the row cache should be disabled. Edit the cassandra.yaml file and disable the row cache. Tip: If using LCM, adjust the value in the Caches pane of cassandra.yaml in the appropriate LCM Config Profiles and run a configure job.	Medium	Node	Daily	Alert
Search nodes have default key cache size	Checks to see if a Solr node has key cache set to default size.	Medium	Node	Daily	Alert
Search nodes have default key cache size	For optimizing memory use for DSE search with Solr, the key cache size should be set to its default size. Edit the cassandra.yaml file and ensure the key cache size is set to the recommended default size. Tip: If using LCM, adjust the value in the Caches pane of cassandra.yaml in the appropriate LCM Config Profiles and run a configure job.	Medium	Node	Daily	Alert
Search nodes have improper heap size	Checks to see if a Solr node has enough heap space.	Medium	Node	Daily	Alert
Search nodes have improper heap size	For optimizing memory use for DSE search with Solr, the heap should be set to at least 14GB. Set the Solr node max heap to at least 14GB.	Medium	Node	Daily	Alert
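
Three of the Search Advisor rules reduce to numeric thresholds given in the table: 16 or 32 vnodes, autocommit within 5-10 seconds, and a heap of at least 14GB. The sketch below applies those thresholds to values you have already collected. The function and parameter names are hypothetical, and how the values are gathered from a node is outside the scope of the example.

```python
def check_search_node(num_tokens, autocommit_seconds, max_heap_gb):
    """Return a list of Search Advisor thresholds that the values miss.

    num_tokens: number of vnodes configured on the search node.
    autocommit_seconds: Solr autocommit threshold, in seconds.
    max_heap_gb: maximum JVM heap of the search node, in GB.
    """
    problems = []
    if num_tokens not in (16, 32):
        problems.append("vnodes should be 16 or 32")
    if not 5 <= autocommit_seconds <= 10:
        problems.append("autocommit should be within 5-10 seconds")
    if max_heap_gb < 14:
        problems.append("heap should be at least 14GB")
    return problems
```

Note that the autocommit rule has cluster scope while the vnode and heap rules have node scope, so in practice the autocommit value would be checked once per cluster rather than per node.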

Security Advisor


Rule	Description/Recommendation	Importance	Scope	Interval (default)	Alert Level
Security keyspace not properly replicated	Checks that the auth keyspace is replicated correctly when using PasswordAuthenticator.	High	Node, Cluster	Daily	Alert
Security keyspace not properly replicated	Please increase the replication of the `system_auth` keyspace.	High	Node, Cluster	Daily	Alert
Security superuser has default setting	Checks that the default cassandra superuser and password have been changed from the defaults.	High	Cluster	Daily	Alert
Security superuser has default setting	Security superuser has default setting. Please update the password for the user 'cassandra'. Tip: Change the default password for the cassandra user in the Edit Cluster dialog of LCM for OpsCenter versions 6.5 and later.	High	Cluster	Daily	Alert
Improper Security authentication setting	Checks that the cassandra authentication is enabled and not set to AllowAllAuthenticator.	Medium	Node	Daily	Alert
Improper Security authentication setting	AllowAllAuthenticator performs no security checks and is not recommended. Please update cassandra.yaml on your nodes and change authenticator from org.apache.cassandra.auth.AllowAllAuthenticator to org.apache.cassandra.auth.PasswordAuthenticator. Tip: Change the authenticator in the Security pane of cassandra.yaml in the appropriate LCM Config Profiles.	Medium	Node	Daily	Alert
Incorrect OpsCenter authentication setting	Checks that the OpsCenter authentication is not set to the default if you are using DatastaxEnterpriseAuth.	High	Cluster	Daily	Alert
Incorrect OpsCenter authentication setting	Please change the default password of the admin user for OpsCenter authentication.	High	Cluster	Daily	Alert
Sensitive Config Value Encryption	It is recommended to enable encryption of sensitive config values in cassandra.yaml.	Medium	Node	Daily	Info
Sensitive Config Value Encryption	Config value encryption is not enabled. The rule failed on the following nodes: `listed failed nodes`. In dse.yaml, set `config_encryption_active` to true and use dsetool encryptconfigvalue to create encrypted config values for the sensitive fields. For more information, see config_encryption_active and Transparent data encryption. Tip: If using LCM, adjust the dse.yaml in the Encryption settings pane of the appropriate LCM Config Profiles.	Medium	Node	Daily	Info