Tests to ensure that your cluster's memory, CPU, disks, number of nodes, and network settings meet your business needs.
To ensure that your selections of memory, CPU, disks, number of nodes, and network settings meet your business needs, test your cluster prior to production. By simulating your production workload, you can avoid unpleasant surprises and ensure successful production deployment.
Your testing workload should match the production workload as closely as possible. Variances from production in testing must fully account for any differences between workloads.
- Considerations if you're coming from a relational database world
- Service level agreements targets
- Establish TPS targets
- Monitor the important metrics
- Simulate real outages
- Simulate production operations
- Run a real replication factor on at least 5 nodes
- Simulate the production datacenter setup
- Simulate the network topology
- Simulate expected per node density
- Use the cassandra-stress tool for testing
Cassandra-based applications and clusters are much different than relational databases and use a data model based on the types of queries, not on modeling entities and relationships. Besides learning a new database technology and API, writing an application that scales on a distributed database instead of a single machine requires a new way of thinking.
A common strategy new users employ for making the transition from single machine to distributed node is to make heavy use of materialized views, secondary indexes, and UDF (user defined functions). It’s important to realize that in most applications at large scale these techniques are not heavily used, and if they are, allowances must be made for the performance penalty in node sizing and SLAs.
Before going into production, decide what constitutes fast and slow Service level agreements (SLA), and how often slow SLAs are allowed to occur.
|100||Not recommended: The worst measurement is often irrelevant to testing and based on factors external to Cassandra.|
|99.9||Aggressive: Use only for latency sensitive use cases. Be aware that hardware expenses will be high if you need to keep this close to average.|
|99||Sweet spot and the most common.|
|95||Loose: Use when where latency is not easily noticed this is a good metric to measure.|
The following metrics are based on clusters with more data on disk than RAM. Measurement is in the 99th percentile from nodetool cfhistograms (Cassandra 2.1) or nodetool tablehistograms (Cassandra 3.0, 3.x).
|Top-of-the-line high CPU nodes with NVM flash storage||< 10ms|
|Common SATA based SSDs||< 150ms
Factors, such as testing, controller, and contention, can shift this measurement significantly. This can result in some hardware approaching NVM specifications.
|Common SATA based-systems||500ms is common.
This metric varies significantly due to controller, contention, data model, and so on. Tuning can shift this figure an order of magnitude.
Determine how many transactions per second (TPS) you expect your cluster to handle. Ideally, match this number with the same node count in your test cluster. Alternately, test with ratio of TPS to nodes to evaluate how the cluster behaves.
- Does the cluster meet SLAs below that level?
- At what TPS level do your SLAs fail?
OpsCenter to monitor cluster metrics. You can also use JMX MBeans or Linux
dstat commands. Monitor the following metrics:
- System hints
- Generated system hints should not be continually high. If a node is down or a datacenter link is down, expect this metric to accumulate (by design).
- Pending compactions
- The number of acceptable pending compactions depends on the type of compaction
Compaction strategy Pending compactions SizedTieredCompactionStrategy (STCS) Should not exceed 100 for more than short periods. LeveledCompactionStrategy (LCS) Not an accurate measure for pending compactions. Generally, if it stays over 100 is indication of stress. Over 1000 is pathological. TimeWindowCompactionsStrategy (TWCS) Contact the DataStax Services team.
- Pending writes
- Should not go over 10,000 writes for more than short periods. Otherwise, a tuning or sizing problem exists.
- heap usage
- If huge spikes with long pauses exist, adjust the heap.
- I/O wait
- Metrics greater than 5 indicates that the CPU is waiting too long on I/O and is a sign of being I/O bound.
- CPU Usage
- For CPUs using hyper-threading, it is helpful to remember that for each processor core that is physically present, the operating system addresses two virtual (logical) cores and shares the workload between them. This means that if half your threads are at 100% CPU usage, the CPU capacity is at maximum even if the tool reports 50% usage. To see the actual thread usage, use the ttop command.
- Monitor your SLAs
- Monitor according to the targets you set above.
- SSTable counts
- If SSTable counts go over 10,000, see CASSANDRA-11831.
Compaction strategy SSTable counts SizedTieredCompactionStrategy (STCS) Should not get too far over 100. LeveledCompactionStrategy (LCS) Make sure that Level 0 does not get too far over 100. TimeWindowCompactionsStrategy (TWCS) Contact the DataStax Services team.
- Log warnings and errors
- Look for the following:
- GCInspector (Cassandra 2.1–3.0 versions using the G1GC are verbose.)
Simulating outages is key to ensuring that your cluster functions well in production. To simulate outages:
- Take one or two nodes offline (more is preferable, depending on the cluster size).
- Do your queries start failing?
- Do the nodes start working without any change when they come back online?
- Take a datacenter offline while taking load.
How long can it be offline before system.hints build up to unmanageable levels?
- Run an expensive task that eats up CPU on a couple of nodes, such as find "a /
This scenario helps you prepare for the effect on your nodes in case this happens.
- Do expensive range queries in CQLSH while taking load to see how the cluster behaves. For
PAGING OFF; SELECT * FROM FOO LIMIT 100000000;
This test ensures your cluster is ready for naive or unintended actions by users.
Do the following during load testing and repeat under outages as described above:
- Run repair to ensure that you have enough overhead to complete the repair.
- Bootstrap a new node into the cluster.
- Add a new datacenter. This exercise gives you an idea of the process when you need to do it in production.
- Change the schema by adding new tables, removing tables, and altering existing tables.
- Replace a failed node. Note how long it takes to bootstrap a new node.
Ideally run the same number of nodes that you expect to run in production. Moreover, if you’re planning on running only 3 nodes in production, test with 5. You may find that the queries that were fast with 3 nodes are much slower with 5 nodes. In this scenario, secondary indexes as a common query path are exposed as being slower at scale.
Be sure to use the same replication factor as in production.
Each datacenter adds additional write costs. If your production cluster has more datacenters than your test cluster, your writes might be greater than you anticipate. For example, if you use RF 3 and test your cluster with 2 datacenters, each write goes to 4 nodes in 2 data centers. If your production cluster uses the same RF and has 5 datacenters, each write goes to 7 nodes with 5 data centers. This means writes are now almost twice as expensive on the coordinator node because in each remote datacenter, the forwarding coordinator sends the data to the other replicas in it’s datacenter.
Simulate your network topology with the same latency and pipe. Test with the expected TPS. Be aware that remote links can be a limiting factor for many use cases and expected bandwidth.
- What happens to your SLAs as the target density is approached?
- What happens if you go over your the density?
- How long does bootstrap take?
- How long does with repair take?