Changing heap size parameters

By default, DataStax Enterprise (DSE) sets the Java Virtual Machine (JVM) heap size from 1 to 32 GB, depending on the amount of RAM and the type of Java installed. The cassandra-env.sh script automatically sets the minimum and maximum heap size to the same value using the following formula:

max(min(1/2 ram, 1024 megabytes), min(1/4 ram, 32765 megabytes))
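As a rough worked example of the formula above, the following Python sketch evaluates it for a few RAM sizes. The function name default_heap_mb is hypothetical, and integer division is assumed for the 1/2 and 1/4 terms. It illustrates the arithmetic only and is not the code that cassandra-env.sh runs.

```python
def default_heap_mb(system_ram_mb: int) -> int:
    """Evaluate max(min(1/2 ram, 1024 MB), min(1/4 ram, 32765 MB)).

    Assumption: fractions of RAM are rounded down to whole megabytes.
    """
    half_ram_capped = min(system_ram_mb // 2, 1024)
    quarter_ram_capped = min(system_ram_mb // 4, 32765)
    return max(half_ram_capped, quarter_ram_capped)


if __name__ == "__main__":
    for ram_gb in (4, 16, 64, 256):
        heap_mb = default_heap_mb(ram_gb * 1024)
        print(f"{ram_gb} GB RAM -> {heap_mb} MB heap")
```

With 16 GB of RAM the 1/4 term dominates and the result is 4096 MB. With 256 GB of RAM the 32765 MB cap applies.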

To adjust the JVM heap size, uncomment and set the following parameters in the jvm.options file:

  • Minimum (-Xms)

  • Maximum (-Xmx)

  • New generation (-Xmn)

  • Parallel processing for GC (-XX:+UseParallelGC)

Restriction: When overriding the default setting, both the min and max heap sizes must be defined in the jvm.options file.
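The restriction above can be checked mechanically before a restart. The sketch below is a minimal illustration, not a DataStax tool: the function names are hypothetical, it assumes that lines starting with # in jvm.options are commented out, and it only understands the k, m, and g size suffixes.

```python
import re

# Multipliers for the size suffixes handled by this sketch (case-insensitive).
UNIT_BYTES = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_size(value: str) -> int:
    """Convert a size such as '16G' or '800M' to bytes."""
    match = re.fullmatch(r"(\d+)([kKmMgG]?)", value)
    if match is None:
        raise ValueError(f"unrecognized size: {value!r}")
    number, unit = match.groups()
    return int(number) * UNIT_BYTES[unit.lower()]


def heap_settings(jvm_options_text: str) -> dict:
    """Collect active (uncommented) -Xms and -Xmx values, in bytes."""
    settings = {}
    for raw_line in jvm_options_text.splitlines():
        line = raw_line.strip()
        # Assumption: blank lines and lines starting with '#' are inactive.
        if not line or line.startswith("#"):
            continue
        for flag in ("-Xms", "-Xmx"):
            if line.startswith(flag):
                settings[flag] = parse_size(line[len(flag):])
    return settings


def heap_problems(jvm_options_text: str) -> list:
    """Return human-readable problems; an empty list means none were found."""
    settings = heap_settings(jvm_options_text)
    problems = []
    if len(settings) == 1:
        missing = "-Xmx" if "-Xms" in settings else "-Xms"
        problems.append(f"{missing} is not set; define both -Xms and -Xmx")
    elif len(settings) == 2 and settings["-Xms"] != settings["-Xmx"]:
        problems.append("-Xms and -Xmx differ; set them to the same value")
    return problems
```

For example, heap_problems("-Xms16G\n") reports that -Xmx is missing, while a file with both flags still commented out produces no findings because the defaults apply.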

Additionally, for larger machines, increase the max direct memory (-XX:MaxDirectMemorySize), but leave around 15-20% of memory for the OS and other in-memory structures.
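One way to reason about the direct memory limit is as a budget: total RAM, minus the heap, minus the share left for the OS. The sketch below computes that upper bound. Subtracting the heap and the 20% default are assumptions made for illustration, and the function names are hypothetical. Treat the result as a ceiling to test against, not a recommended value.

```python
def max_direct_memory_mb(system_ram_mb: int, heap_mb: int,
                         os_reserve_percent: int = 20) -> int:
    """Upper bound for direct memory after reserving RAM for the heap and OS.

    Assumption: the reserve is a whole percentage of total RAM (15 to 20
    in the text above), rounded down to whole megabytes.
    """
    if not 0 <= os_reserve_percent < 100:
        raise ValueError("os_reserve_percent must be between 0 and 99")
    os_reserve_mb = system_ram_mb * os_reserve_percent // 100
    return max(system_ram_mb - heap_mb - os_reserve_mb, 0)


def direct_memory_flag(size_mb: int) -> str:
    """Format the value as a jvm.options line."""
    return f"-XX:MaxDirectMemorySize={size_mb}M"


if __name__ == "__main__":
    # Hypothetical node: 128 GB RAM with a 32 GB heap.
    limit_mb = max_direct_memory_mb(128 * 1024, 32 * 1024)
    print(direct_memory_flag(limit_mb))
```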

Where is the cassandra-env.sh file?

The location of the cassandra-env.sh file depends on the type of installation:

Installation Type | Location
Package installations + Installer-Services installations | /etc/dse/cassandra/cassandra-env.sh
Tarball installations + Installer-No Services installations | <installation_location>/resources/cassandra/conf/cassandra-env.sh

Where is the jvm.options file?

The location of the jvm.options file depends on the type of installation:

Installation Type | Location
Package installations + Installer-Services installations | /etc/dse/cassandra/jvm.options
Tarball installations + Installer-No Services installations | <installation_location>/resources/cassandra/conf/jvm.options
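The two locations listed for each file can be probed in order. The following sketch is a small helper with hypothetical names. It checks the package path first and then the tarball path under an installation directory that you supply. The /opt/dse value in the usage lines is only a placeholder for <installation_location>.

```python
from pathlib import Path
from typing import List, Optional

PACKAGE_CONF_DIR = Path("/etc/dse/cassandra")
TARBALL_CONF_SUBDIR = Path("resources/cassandra/conf")


def candidate_paths(filename: str,
                    installation_location: Optional[str] = None) -> List[Path]:
    """List the documented locations for a configuration file, package path first."""
    candidates = [PACKAGE_CONF_DIR / filename]
    if installation_location is not None:
        candidates.append(
            Path(installation_location) / TARBALL_CONF_SUBDIR / filename
        )
    return candidates


def find_config(filename: str,
                installation_location: Optional[str] = None) -> Optional[Path]:
    """Return the first candidate that exists on this machine, or None."""
    for candidate in candidate_paths(filename, installation_location):
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    # "/opt/dse" is a placeholder; substitute your own installation directory.
    for name in ("jvm.options", "cassandra-env.sh"):
        print(name, "->", find_config(name, "/opt/dse"))
```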

Guidelines and recommendations

Setting the Java heap higher than 32 GB may interfere with the OS page cache. Operating systems that maintain the OS page cache for frequently accessed data are very good at keeping this data in memory. Properly tuning the OS page cache usually results in better performance than increasing the row cache.

For production use, follow these guidelines to adjust heap size for your environment:

  • Heap size is usually between ¼ and ½ of system memory but not larger than 32 GB.

  • Reserve enough memory for the offheap cache and file system cache.

  • Enable GC logging when adjusting GC.

  • Gradually increase or decrease the parameters. Test each incremental change.

  • Enable parallel processing for GC, particularly when using DSE Search.

  • The GCInspector class logs information about any garbage collection that takes longer than 200 ms. Garbage collections that occur frequently and take a moderate length of time (seconds) to complete indicate excessive garbage collection pressure on the JVM. In addition to adjusting the garbage collection options, other remedies include adding nodes and lowering cache sizes.

  • For a node using G1, DataStax recommends a MAX_HEAP_SIZE as large as possible, up to 64 GB.

Maximum and minimum heap size

The recommended maximum heap size depends on which GC is used:

Hardware setup | Recommended MAX_HEAP_SIZE
G1 for newer computers (8+ cores) with up to 256 GB RAM | 16 GB to 32 GB. See Java performance tuning.
CMS for newer computers (8+ cores) with up to 256 GB RAM | No more than 16 GB
Older computers | Typically 8 GB

New heap size

For CMS, you may also need to adjust new (young) generation heap size. This setting determines the amount of heap memory allocated to newer objects. The database calculates the default value for this property in megabytes (MB) as the lesser of:

  • 100 times the number of cores

  • ¼ of MAX_HEAP_SIZE
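The rule above reduces to taking the smaller of two numbers. A minimal sketch follows. The function name default_new_gen_mb is hypothetical, and the 1/4 term is assumed to be rounded down to whole megabytes.

```python
def default_new_gen_mb(cores: int, max_heap_mb: int) -> int:
    """Lesser of 100 MB per core and one quarter of the maximum heap, in MB."""
    per_core_mb = 100 * cores
    quarter_heap_mb = max_heap_mb // 4  # assumption: rounded down
    return min(per_core_mb, quarter_heap_mb)


if __name__ == "__main__":
    # Eight cores with a 16 GB heap: the per-core term is the smaller one.
    print(default_new_gen_mb(cores=8, max_heap_mb=16 * 1024))
    # Eight cores with a 2 GB heap: the quarter-heap term is the smaller one.
    print(default_new_gen_mb(cores=8, max_heap_mb=2 * 1024))
```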

Procedure

  1. To enable GC logging, uncomment the loggc parameter in the jvm.options file.

    -Xloggc:/var/log/cassandra/gc.log

    After you restart Cassandra, the log is created and GC events are recorded.

  2. Set the heap sizes in the jvm.options file:

    1. Uncomment and set both the min and max heap size. For example, to set both the min and max heap size to 16 GB:

      -Xms16G
      -Xmx16G

      Setting the min (-Xms) and max (-Xmx) heap sizes to the same value avoids stop-the-world GC pauses while the heap is resized, and locks the heap in memory at startup, which prevents any of it from being swapped out.

    2. If using CMS, uncomment and set the new generation heap size (-Xmn) to tune the heap for CMS. As a starting point, set it to 100 MB per physical CPU core. For example, for a modern eight-core or greater system:

      -Xmn800M

      A larger new generation leads to longer GC pause times. With a smaller new generation, GC pauses are shorter, but garbage collection is usually more expensive.

  3. On larger machines, increase the max direct memory (-XX:MaxDirectMemorySize), but leave around 15-20% of memory for the OS and other in-memory structures. For example, to set the max direct memory to 1 MB:

    -XX:MaxDirectMemorySize=1M

    By default, the size is zero, so the JVM selects the size of the NIO direct-buffer allocations automatically.

    Alternatively, you can set an environment variable called MAX_DIRECT_MEM, instead of setting a size for -XX:MaxDirectMemorySize in the jvm.options file.

  4. Save and close the jvm.options file.

  5. Restart Cassandra and run some read-heavy or write-heavy operations.

  6. Check the GC logs.

    This method decreases performance for the test node, but generally does not significantly reduce cluster performance.

    If performance does not improve, contact the DataStax Services team for additional help.
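The heap settings from step 2 can be generated rather than typed by hand. The sketch below is a minimal illustration with a hypothetical function name. It emits matching -Xms and -Xmx lines and, when a physical core count is passed for a CMS node, an -Xmn line at the 100 MB per core starting point.

```python
from typing import List, Optional


def heap_flags(heap_gb: int,
               cms_physical_cores: Optional[int] = None) -> List[str]:
    """Build jvm.options lines with equal min and max heap sizes.

    Pass cms_physical_cores only for CMS; it adds a new generation size
    of 100 MB per physical core as a starting point.
    """
    if heap_gb <= 0:
        raise ValueError("heap_gb must be positive")
    flags = [f"-Xms{heap_gb}G", f"-Xmx{heap_gb}G"]
    if cms_physical_cores is not None:
        flags.append(f"-Xmn{100 * cms_physical_cores}M")
    return flags


if __name__ == "__main__":
    # 16 GB heap on an eight-core CMS node, matching the examples in step 2.
    print("\n".join(heap_flags(16, cms_physical_cores=8)))
```

Paste the printed lines into jvm.options in place of the commented-out defaults, then continue with the restart and GC log check.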


© 2024 DataStax | Privacy policy | Terms of use
