Estimating usable disk capacity 

Determining how much data your DataStax or Cassandra nodes can hold.

To estimate how much data your DataStax or Apache Cassandra™ nodes can hold, calculate the usable disk capacity per node and then multiply that by the number of nodes in your cluster. In a production cluster, the commit log and data directories are typically on different disks, so count only the data disks in this calculation.

Procedure

  1. Start with the raw capacity of the physical disks:
    raw_capacity = disk_size * number_of_data_disks
  2. Calculate the formatted disk space, accounting for file system formatting overhead (roughly 10 percent):
    formatted_disk_space = raw_capacity * 0.9
  3. Calculate the recommended working disk capacity:

    During normal operations, Cassandra routinely requires free disk space for compaction and repair operations. For optimal performance and cluster health, DataStax recommends not filling your disks to capacity, but running at 50 to 80 percent of capacity, depending on the compaction strategy and the size of the compactions.

    usable_disk_space = formatted_disk_space * (0.5 to 0.8)
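
The three steps above, plus the per-cluster multiplication described in the introduction, can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not a DataStax tool: the function names, the gigabyte unit, and the sample figures (four 1000 GB data disks per node, six nodes) are assumptions chosen for the example. The fill fraction defaults to the conservative end of the 50 to 80 percent range.

```python
def usable_disk_space(disk_size_gb, number_of_data_disks,
                      fill_fraction=0.5, format_overhead=0.10):
    """Estimate usable data capacity per node, in the same unit as disk_size_gb.

    fill_fraction is the share of formatted space you plan to fill
    (0.5 to 0.8, depending on compaction strategy and compaction size).
    format_overhead is the assumed file system formatting overhead (roughly 10 percent).
    """
    if not 0.5 <= fill_fraction <= 0.8:
        raise ValueError("fill_fraction should be between 0.5 and 0.8")

    # Step 1: raw capacity of the physical data disks
    raw_capacity = disk_size_gb * number_of_data_disks
    # Step 2: subtract file system formatting overhead
    formatted_disk_space = raw_capacity * (1 - format_overhead)
    # Step 3: keep headroom for compaction and repair
    return formatted_disk_space * fill_fraction


def cluster_capacity(per_node_usable, number_of_nodes):
    """Multiply the per-node estimate by the number of nodes in the cluster."""
    return per_node_usable * number_of_nodes


# Hypothetical node: four 1000 GB data disks, filled to 50 percent
per_node = usable_disk_space(1000, 4, fill_fraction=0.5)
total = cluster_capacity(per_node, 6)
print(f"per node: {per_node:.0f} GB, cluster: {total:.0f} GB")
```

With these sample figures, the raw capacity is 4000 GB, the formatted disk space is about 3600 GB, and the usable disk space is about 1800 GB per node, or roughly 10800 GB across six nodes. Raising fill_fraction toward 0.8 increases the estimate but leaves less headroom for compaction and repair.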