Capacity planning and hardware selection for HCD deployments
General guidelines
Use the following guidelines when choosing hardware for Hyper-Converged Database (HCD) deployments:
-
Hardware choices depend on your particular use case. For example, the optimal balance of CPUs, memory, disks, network resources, and number of nodes for an environment with static data that are accessed infrequently will be vastly different than an environment with volatile data that is accessed frequently.
-
The suggested guidelines are the minimum required. You may need to increase CPU capacity, memory, and disk space beyond the recommended minimums.
-
Thoroughly test your configuration before deployment.
In addition, for optimal performance in Linux environments, DataStax recommends using the most recent version of the Linux operating system. Newer versions of Linux handle highly concurrent workloads more efficiently. |
CPUs and cores
In HCD, insert-heavy workloads are CPU-bound before becoming memory-bound. All writes go to the commit log, but the database is so efficient in writing that the CPU is the limiting factor. HCD is highly concurrent and uses as many vCPUs as available.
Recommended minimum:
Minimum cores | Remarks |
---|---|
Production |
|
8 cores (vCPUs) |
Use cases vary. More may be needed for higher request throughput, indexing, or higher node density. |
Development |
|
2 cores (vCPUs) |
Sufficient for non-load testing environments. |
HCD is horizontally scalable. In some scenarios, adding more database nodes maybe better for scaling production clusters. |
Memory and heap
The more memory an HCD node has, the better read performance. More RAM also allows memory tables (memtables) to hold more recently written data. Larger memtables lead to a fewer number of SSTables being flushed to disk, more data held in the operating system page cache, and fewer files to scan from disk during a read. The ideal amount of RAM depends on the anticipated size of your hot data.
Recommended minimum:
Node type | System memory | Heap |
---|---|---|
Production |
||
Transactional Apache Cassandra® |
32 GB |
8 GB |
32 GB to 512 GB |
24 GB (RAM < 64 GB) |
|
Development |
||
Non-load testing environments |
8 GB to 16 GB |
4 GB to 8 GB |
Storage subsystem
The storage subsystem is critical to the performance of your database. The database writes data to disk when appending data to the commit log for durability and when flushing memtables to SSTable data files for persistent storage.
Maximum capacity per node (node density)
DataStax recommends up to 2 TB of data per node with appropriate free disk space for the compaction strategy being used. |
Exceeding the recommended node density may have the following effects:
-
Compactions may get behind depending on write throughput, hardware, and compaction strategy.
-
Streaming operations such as bootstrap, repair, and replacing nodes will take longer to complete.
Higher capacity nodes work best with low to moderate write throughput and no indexing.
One special case is for time-series data.
Use the TimeWindowCompactionStrategy
to scale larger than these limits if:
-
The time-series data is written once and never updated.
-
The data has a clustering column that is time based.
-
Read queries cover specific time-bounded ranges of data rather than its full history.
-
The TWCS windows are configured appropriately.
Vector search capacity
Vector search is a way to do semantic associations among data as an extension to storage attached indexes (SAI).
From an operational standpoint, vector search behaves like any other database index. Writing data requires additional CPU resources to index it. When reading data, vector search and SAI will require additional work to consult the indexes, gather results, and send them to the application client. Therefore, capacity planning needs to consider this overhead when it comes to CPU usage, available memory, speed of storage, and per-node data density. Specifically, we recommend nodes that use Vector Search have at least 16 vCPUs, at least 64 GB memory, and fast storage—for example, SSD or NVMe-based.
Thoroughly test before deploying to production. DataStax highly recommends testing with tools such as NoSQLBench with your desired configuration. Be sure to test common administrative operations, such as bootstrap, repair, and failure, to make certain your hardware selections meet your business needs. For more information, see Testing Your Cluster Before Production. |
In addition to the database cluster, DataStax optionally provides a Data API that can abstract Vector Search data and indexes behind an easy to use JSON collection oriented interface. The Data API lives in a separate stateless service that is deployed as containers. The Data API can be scaled separately from the cluster and will depend on request throughput.
Per-node data density factors
Node density is highly dependent on the environment. Determining node density depends on many factors, including:
-
Data frequency change and access frequency.
-
Service-Level Agreements (SLAs) and ability to handle outages.
-
Data compression.
-
Compaction strategy: These strategies control which SSTables are chosen for compaction and how the compacted rows are sorted into new SSTables. Each strategy has its own strengths. To determine your compaction strategy, see compaction strategy.
-
Network performance: remote links likely limit storage bandwidth and increase latency.
-
Replication factor: See About data distribution and replication.
Free disk space
Disk space depends on usage, so it is important to understand the mechanism.
The database writes data to disk when appending data to the commit log for durability and when flushing memtables to SSTable data files for persistent storage. The commit log has a different access pattern (read/writes ratio) than the pattern for reading data from SSTables. This is more important for spinning disks than for SSDs.
SSTables are periodically compacted. Compaction improves performance by merging and rewriting data and discarding old data. However, depending on the type of compaction and size of the compactions, disk utilization and data directory volume temporarily increase during compaction. For this reason, be sure to leave an adequate amount of free disk space available on a node.
The following table provides guidelines for the minimum disk space requirements based on the compaction strategy:
Compaction strategy | Minimum requirements |
---|---|
|
Worst case: 20% of free disk space at the default configuration. |
|
Sum of all the SSTables compacting must be smaller than the remaining disk space. Worst case: 50% of free disk space. This scenario can occur in a manual compaction where all SSTables are merged into one giant SSTable. |
|
Generally 10%. Worse case: 50% when the Level 0 backlog exceeds 32 SSTables (LCS uses STCS for Level 0). |
|
TWCS requirements are similar to STCS. TWCS requires a maximum disk space overhead of approximately 50% of the total size of SSTables in the last created bucket. To ensure adequate disk space, determine the size of the largest bucket or window ever generated and calculate 50% of that size. |
Cassandra periodically merges SSTables and discards old data to keep the database healthy through a process called compaction. For more information, see How is data maintained?.
Estimate usable disk capacity
To estimate how much data your nodes can hold, calculate the usable disk capacity per node and then multiply that by the number of nodes in your cluster.
-
Start with the raw capacity of the physical disks:
raw_capacity = disk_size * number_of_data_disks
-
Calculate the usable disk space accounting for file system formatting overhead (roughly 10 percent):
formatted_disk_space = (raw_capacity * 0.9)
-
Calculate the recommended working disk capacity:
usable_disk_space = formatted_disk_space * (0.5 to 0.8)
During normal operations, the database routinely requires disk capacity for compaction and repair operations. For optimal performance and cluster health, DataStax recommends not filling your disks, but running at 50% to 80% capacity. |
Capacity and I/O
When choosing disks for your nodes, consider both capacity (how much data you plan to store) and I/O (the write/read throughput rate).
Some workloads are best served by using less expensive SATA disks and scaling disk capacity and I/O by adding more nodes (with more RAM).
Number of disks - HDD
DataStax recommends using at least two disks per node: one for the commit log and the other for the data directories.
At a minimum, the commit log should be on its own partition.
Commit log disk - HDD
The disk need not be large, but it should be fast enough to receive all of your writes as appends (sequential I/O).
Commit log disk - SSD
Unlike spinning disks, there is less of a penalty for sharing commit logs and data directories on SSD than there is on HDD.
DataStax recommends separating commit logs and data for highest performance and resiliency.
Data disks
Use one or more disks per node and make sure they are large enough for the data volume and fast enough to satisfy both (1) reads that are not cached in memory, and (2) to keep up with compaction.
RAID on data disks
It is generally not necessary to use a Redundant Array of Independent Disks (RAID) for the following reasons:
-
Data is replicated across the cluster based on the replication factor you have chosen.
-
HCD includes a JBOD (just-a-bunch-of-disks) feature for disk management. Because the database responds according to your availability/consistency requirements to a disk failure either by stopping the affected node or by denylisting the failed drive, you can deploy nodes with large disk arrays without the overhead of RAID 10. You can configure the database to stop the affected node or denylist the drive according to your availability/consistency requirements. Also see Recovering from a single disk failure using JBOD.
RAID on the commit log disk
Generally RAID
is not needed for the commit log disk.
Replication adequately prevents data loss.
If you need extra redundancy, use RAID 1
.
Extended file systems
DataStax recommends deploying on XFS
or ext4
.
On ext2
or ext3
, the maximum file size is 2TB
even using a 64-bit kernel.
Because the database can use almost half your disk space for a single file when using SizeTieredCompactionStrategy
(STCS), use XFS when using large disks, particularly if using a 32-bit kernel.
XFS file size limits are 16TB
max on a 32-bit kernel, and essentially unlimited on 64-bit.
Partition size and count
For efficient operation, partitions must be sized within certain limits.
Two measures of partition size are the number of values in a partition and the partition size on disk. The practical limit of cells per partition is 2 billion.
Sizing the disk space is more complex, and involves the number of rows and the number of columns, primary key columns and static columns in each table. Each application has different efficiency parameters, but a good rule-of-thumb is to keep the number of rows per partition below 100,000 items and the partition size under 100 MB.
Network bandwidth
Minimum recommended bandwidth: 1000 Mb/s
(gigabit/s).
A distributed data store puts load on the network to handle read/write requests and replication of data across nodes. Be sure that your network can handle inter-node traffic without bottlenecks.
DataStax recommends binding your interfaces to separate Network Interface Cards (NIC). You can use public or private NICs depending on your requirements.
The database efficiently routes requests to replicas that are geographically closest to the coordinator node and chooses a replica in the same rack when possible. The database always chooses replicas located in the same datacenter over replicas in a remote datacenter.
Firewall configuration for clusters
If using a firewall, make sure that nodes within a cluster can communicate with each other.
The following ports must be open to allow bi-directional communication between nodes. Configure the firewall running on nodes in your cluster accordingly. Without the following open ports, nodes act as a standalone database server and do not join the cluster.
Port number | Description |
---|---|
Public Port |
|
22 |
SSH port |
Inter-node Ports |
|
7000 |
Inter-node cluster communication |
7001 |
SSL inter-node cluster communication |
7199 |
JMX monitoring port |
Client Ports |
|
9042 |
Client port |
9142 |
Default for |