Upgrade Apache Cassandra® to DataStax Enterprise

The upgrade process from open-source Apache Cassandra® to DataStax Enterprise (DSE) requires that you upgrade and restart one node at a time while other nodes continue to operate online. With a few exceptions, the cluster continues to work as though it were on the earlier platform until all of the nodes in the cluster are upgraded.

DataStax strongly recommends using the Zero-Downtime Migration (ZDM) tools for the lowest risk and least possible downtime when migrating from Cassandra to DSE. This approach permits a wider range of upgrade paths without the need for interim upgrades, and it provides a seamless rollback strategy.

If you intend to perform an in-place upgrade, carefully review the upgrade planning guide and all upgrade instructions before you begin the upgrade to reduce the chance of errors and data loss.

For assistance with migrations from Cassandra to DSE, contact DataStax Support.

Upgrade paths

The upgrade path depends on your current Cassandra version and your target DSE version. The greater the gap between the current version and the target version, the more complex the upgrade. Upgrades from earlier versions can require one or more interim upgrades.

DSE upgrade path by current Cassandra version

Cassandra versions after 3.11, including 4.x and 5.x

In-place upgrades are riskier or more complex due to differences between open-source Cassandra and the version of Cassandra in DSE. Instead, DataStax recommends using the ZDM tools to migrate your data to a new, separate DSE cluster. This approach provides a seamless rollback strategy in case of data loss or corruption. For other options and more information, see Migrate to DataStax Enterprise.

Cassandra 3.0 or 3.11

If you want to perform an in-place upgrade on your existing clusters, you must upgrade to DSE 5.1 first, and then you can upgrade to DSE 6.8 or 6.9. Use the instructions in this guide to upgrade to 5.1, and then upgrade from 5.1 to 6.8 or 5.1 to 6.9.

Alternatively, you can use the ZDM tools to migrate your data to a new, separate DSE cluster without the need for an interim upgrade.

Cassandra 2.1 to the end of the 2.x series

For in-place upgrades, you cannot upgrade directly to DSE 5.1, 6.8, or 6.9 from Cassandra versions earlier than 3.0. Upgrade to Cassandra 3.0 (minimum), and then follow the upgrade path for Cassandra 3.0 or 3.11.

Alternatively, if you are running Cassandra version 2.1.6 or later, you can use the ZDM tools to migrate your data to a new, separate DSE cluster without the need for an interim upgrade.

Cassandra 2.0 and earlier

You cannot upgrade directly to DSE 5.1, 6.8, or 6.9 from Cassandra version 2.0 or earlier. If you are on Cassandra 2.0 or earlier, you must upgrade to Cassandra 2.1, then upgrade to Cassandra 3.0 (minimum), and then you can follow the upgrade path for Cassandra 3.0 or 3.11.

If you want to avoid multiple interim upgrades, upgrade to Cassandra 2.1.6, and then use the ZDM tools to migrate your data to a new, separate DSE cluster without the need for additional interim upgrades.

Back up your existing installation

DataStax recommends backing up your data prior to any version upgrade.

A backup provides the ability to revert and restore all the data used in the previous version if necessary.

You can use the same process to back up Cassandra as you would for DSE, changing directory names and DSE-specific commands as needed. For instructions, see Backing up a tarball installation or Backing up a package installation.

Upgrade restrictions and limitations

Restrictions and limitations apply while a cluster is in a partially upgraded state. This means that some, but not all, nodes in the cluster have been upgraded. The cluster continues to work as though it were on the earlier platform until all of the nodes in the cluster are upgraded. For this reason, you must avoid certain operations until the upgrade is complete on all nodes.

Nodes on different versions might show a schema disagreement during an upgrade. This is normal.

General restrictions

  • Don’t enable new features.

  • Don’t run nodetool repair.

  • Disable all automated repair processes.

  • During the upgrade, don’t bootstrap new nodes or decommission existing nodes.

  • Don’t enable Change Data Capture (CDC) on a mixed-version cluster. Upgrade all nodes to DSE 5.1 or later before enabling CDC.

  • Don’t issue TRUNCATE or DDL related queries during the upgrade process.

  • Don’t alter schemas for any workloads. Propagation of schema changes between mixed-version nodes can have unexpected results. Take action to prevent schema changes from occurring during the upgrade process.

Upgrade time limit

Once you upgrade one node in a cluster, you must complete the cluster-wide upgrade before the expiration of gc_grace_seconds (default 10 days) to ensure any repairs complete successfully.
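
As a rough illustration of the time limit, the following sketch computes the latest completion time from the moment the first node is upgraded. The function name upgrade_deadline_epoch is hypothetical, and the 864000-second default assumes the 10-day gc_grace_seconds default mentioned above. Tables can override gc_grace_seconds individually, so the lowest value in use is the one to plan around.

```shell
# Print the Unix epoch second by which the cluster-wide upgrade should finish.
#   $1: epoch second at which the first node was upgraded
#   $2: gc_grace_seconds to plan around (assumed default: 864000 = 10 days)
upgrade_deadline_epoch() {
  echo $(( $1 + ${2:-864000} ))
}

# Example usage (not run here):
#   List per-table values on Cassandra 3.0 or later:
#     cqlsh --execute "SELECT keyspace_name, table_name, gc_grace_seconds FROM system_schema.tables;"
#   Print a readable deadline (the -d @EPOCH form assumes GNU date):
#     date -d "@$(upgrade_deadline_epoch "$(date +%s)")"
```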

Use storage port 7000 for online upgrades

Online upgrades require the default storage port 7000. A cluster that uses non-default storage_port values must use the ZDM tools to upgrade to DSE.

Verify your storage port configuration before you begin the upgrade process.
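
One way to check the setting is to read storage_port from the node's cassandra.yaml file. The following is a minimal sketch: the function names are hypothetical, the file path is passed in because it depends on your installation, and the fallback to 7000 assumes that an unset storage_port means the default.

```shell
# Print the storage_port value from the given cassandra.yaml file.
# Prints 7000 when the file has no uncommented storage_port line.
storage_port_of() {
  awk -F: '
    /^storage_port:/ {
      sub(/#.*/, "", $2)        # drop a trailing comment, if any
      gsub(/[ \t]/, "", $2)     # drop surrounding whitespace
      print $2; found = 1; exit
    }
    END { if (!found) print 7000 }
  ' "$1"
}

# Succeed only when the file uses the default storage port.
uses_default_storage_port() {
  [ "$(storage_port_of "$1")" = "7000" ]
}

# Example usage (not run here; the path is an assumption):
#   uses_default_storage_port /etc/cassandra/cassandra.yaml || echo "non-default storage_port"
```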

Restrictions for nodes using security

  • Don’t change security credentials or permissions until the upgrade is complete on all nodes.

  • If you aren’t already using Kerberos, don’t set up Kerberos authentication before upgrading. First upgrade the cluster, and then set up Kerberos.

If you plan to upgrade to DSE 6.9.7 or later, you must modify the upgrade process if your cluster uses legacy internode encryption (deprecated in Cassandra 4.0), including the transitional mode that permits an internode encryption-based cluster to interact with unencrypted nodes. In Cassandra 4.0 and in DSE 6.9.7 and later, ssl_storage_port is deprecated, and storage_port permits encrypted, unencrypted, and mixed-encryption node-to-node communication.

To enable a legacy-encrypted cluster to continue to function during an upgrade to DSE 6.9.7 or later, do the following after upgrading your nodes to DSE 5.1:

  1. Edit the cassandra.yaml file.

  2. In the server_encryption_options section, set the legacy_ssl_storage_port_enabled option to true. This configuration enables listening on the deprecated ssl_storage_port.

  3. When you complete the upgrade for the cluster, set the legacy_ssl_storage_port_enabled option to false. This configuration disables listening on the deprecated ssl_storage_port.
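
Because the option must be switched on during the upgrade and off again afterward, it can help to confirm its current value on each node. The sketch below reads the option from a cassandra.yaml file; the function name is hypothetical, and the approach assumes the option appears on its own line, as it does in a typical server_encryption_options section.

```shell
# Print the value of legacy_ssl_storage_port_enabled from the given
# cassandra.yaml file, or "unset" when no uncommented line sets it.
legacy_ssl_port_setting() {
  awk '
    /^[ \t]*legacy_ssl_storage_port_enabled:/ {
      sub(/#.*/, "")            # drop a trailing comment, if any
      print $2; found = 1; exit
    }
    END { if (!found) print "unset" }
  ' "$1"
}

# Example usage (not run here; the path is an assumption):
#   legacy_ssl_port_setting /etc/dse/cassandra/cassandra.yaml
```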

Application code and driver compatibility

Check driver compatibility to ensure that your driver version supports both your Cassandra version and DSE 5.1 (minimum).

If your target DSE version is later than 5.1, select a driver version that supports your Cassandra version, DSE 5.1, and your target DSE version. If no such driver version exists, you must upgrade your driver again after you upgrade your clusters to DSE 5.1.

If you need to upgrade your driver, be sure to check the driver documentation for any code changes that might be required between your original and new driver versions. Depending on the driver version, you might need to recompile your client application code.

During upgrades, you might experience driver-specific issues when clusters have mixed versions of drivers. If your cluster has mixed versions, the protocol version is negotiated with the first host to which the driver connects, although certain drivers automatically select a protocol version that works across nodes. To avoid driver version incompatibility during upgrades, use one of the following workarounds:

  • Set the protocol version explicitly in your application at startup. Switch the driver to the new protocol version only after fully upgrading all nodes in the cluster.

  • Ensure that the list of initial contact points contains only hosts with the oldest database version or protocol version. For example, the initial contact points contain only hosts that use protocol version 2.

For details on protocol version negotiation, see the documentation for your driver.

Prepare to upgrade

Follow these steps to prepare each Cassandra node for the upgrade:

  1. Familiarize yourself with the changes and features in your target version of DSE.

  2. Review the general upgrade advice and Cassandra features in NEWS.txt. If you are upgrading from an earlier version, read NEWS.txt from the latest version back to your current version.

  3. Ensure that your version of Cassandra is compatible with the version of Cassandra that is in DSE. See the Cassandra changes in CHANGES.txt and the upgrade paths.

  4. Before upgrading, be sure that each node has adequate free disk space.

    Determine current data disk space usage:

    sudo du -sh /var/lib/cassandra/data/
    Result
    3.9G	/var/lib/cassandra/data/

    Determine available disk space:

    sudo df -hT /
    Result
    Filesystem     Type  Size  Used Avail Use% Mounted on
    /dev/sda1      ext4   59G   16G   41G  28% /

    The required space depends on the compaction strategy. See Disk space.

  5. Upgrade the SSTables on each node to ensure that all SSTables are on the current version.

    You must upgrade SSTables on your nodes before and after upgrading. Failure to upgrade SSTables will result in severe performance degradation, increased disk usage, and possible data loss.

    nodetool upgradesstables

    You can use the --jobs option to set the number of SSTables that upgrade simultaneously. The default setting is 2, which minimizes impact on the cluster. Set to 0 to use all available compaction threads. DataStax recommends running the upgradesstables command on one node at a time or, when using racks, one rack at a time.

    If the SSTables are already on the current version, the command returns immediately and no action is taken.

  6. Verify the Java runtime version and upgrade to a supported version if needed:

    java -version
    Result
    openjdk version "1.8.0_222"
    OpenJDK Runtime Environment (build 1.8.0_222-8u222-b10-1ubuntu1~18.04.1-b10)
    OpenJDK 64-Bit Server VM (build 25.222-b10, mixed mode)

    For DSE 5.1 and 6.8, OpenJDK 8 (1.8.0_151 minimum) and Oracle Java SE 8 (JRE or JDK) (1.8.0_151 minimum) are supported. OpenJDK is recommended because DataStax does more extensive testing on OpenJDK than Oracle Java.

    If you plan to continue the upgrade to DSE 6.9, be aware that DSE 6.9 requires Java 11. After you upgrade to DSE 5.1, you upgrade to Java 11 as part of your upgrade to 6.9.

  7. Run nodetool repair to ensure that data on each replica is consistent with data on other nodes:

    nodetool repair -pr
  8. Install the libaio package for optimal performance.

    • RHEL

    sudo yum install libaio

    • Debian

    sudo apt-get install libaio1
  9. Back up any customized configuration files since they can be overwritten with default values during installation of the new version.

    If you backed up your installation using the instructions in Backing up a tarball installation or Backing up a package installation, your original configuration files are included in the archive.
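
The disk space and Java checks in the preceding steps can be scripted so that they run the same way on every node. The following is a minimal sketch, not a complete preflight tool. The function names are hypothetical, and the headroom percentage is an assumption that you choose yourself, because the required space depends on the compaction strategy.

```shell
# Succeed when available space covers a chosen percentage of the data size.
#   $1: used space in KB, $2: available space in KB,
#   $3: required headroom as a percentage of used space (your assumption)
has_disk_headroom() {
  [ "$2" -ge $(( $1 * $3 / 100 )) ]
}

# Read `java -version` output from stdin and print the major version,
# for example 8 for "1.8.0_222" and 11 for "11.0.2".
java_major_version() {
  awk -F'"' 'NR == 1 {
    split($2, v, ".")
    if (v[1] == "1") print v[2]; else print v[1]
  }'
}

# Example usage (not run here; the data path matches the example above):
#   used_kb=$(sudo du -sk /var/lib/cassandra/data/ | awk '{ print $1 }')
#   avail_kb=$(df -Pk /var/lib/cassandra/data/ | awk 'NR == 2 { print $4 }')
#   has_disk_headroom "$used_kb" "$avail_kb" 100 || echo "low disk space"
#   java -version 2>&1 | java_major_version   # java -version prints to stderr
```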

Upgrade steps

The upgrade process requires upgrading and restarting one node at a time in the following order:

  1. If using racks, upgrade node-by-node within one rack.

  2. Upgrade rack-by-rack within one datacenter, and upgrade seed nodes in a datacenter before non-seed nodes.

  3. Upgrade datacenter-by-datacenter within one cluster.

  4. Repeat to upgrade the next cluster until you have upgraded all nodes (by rack and datacenter) in all clusters.
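
If you keep an inventory of your nodes, you can sort it into a candidate upgrade order before you begin. The sketch below shows one possible reading of the ordering above: datacenter by datacenter, with seed nodes ahead of non-seed nodes in each datacenter, and racks grouped within each of those sets. The inventory format and the function name are hypothetical.

```shell
# Sort a node inventory into one possible upgrade order.
# Input (stdin): one node per line as "datacenter rack role host",
# separated by single spaces, where role is "seed" or "node".
# Output: the same lines ordered by datacenter, then seeds before
# non-seeds, then rack, then host (hosts compare as plain text).
upgrade_order() {
  LC_ALL=C sort -t ' ' -k1,1 -k3,3r -k2,2 -k4,4
}

# Example usage (not run here; nodes.txt is a hypothetical file):
#   upgrade_order < nodes.txt
```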

Follow these steps for each node’s upgrade to DSE 5.1. The configuration changes in these steps are performed in the upgraded version, and they use DSE 5.1 documentation if version-specific documentation is necessary.

  1. Flush the commit log of the current installation:

    nodetool drain
  2. Stop the node.

  3. Uninstall Cassandra.

    If you installed Cassandra from packages in APT or RPM repositories, you must remove the packages before setting up and installing DSE.

    • APT package installations

    For packages installed from APT repositories, run the following command:

    sudo apt-get autoremove "dsc*" "cassandra*" "apache-cassandra*"

    This action shuts down Cassandra if it is still running before uninstalling it.

    • RPM package installations

    For packages installed from Yum repositories, run the following command:

    sudo yum remove "dsc*" "cassandra*" "apache-cassandra*"

    It is normal for the old Cassandra configuration file to be renamed to cassandra.yaml.rpmsave. For example:

    warning: /etc/cassandra/default.conf/cassandra.yaml
    saved as /etc/cassandra/default.conf/cassandra.yaml.rpmsave

    • Tarball installations

    If you installed Cassandra from a binary tarball, find the Cassandra process ID, stop the process, and then remove the Cassandra installation directory. In the following commands, replace cassandra_pid with the process ID that ps reports:

    ps auwx | grep cassandra
    sudo kill cassandra_pid
  4. Install DSE 5.1 using the same installation method (package or tarball) that you used for Cassandra.

  5. After upgrading but before restarting a node, compare changes in the new configuration files with your backup configuration files, remove deprecated settings, and update any new settings if required.

    You must use the new configuration files that are generated from the upgrade installation. Copy individual parameters from your old configuration files into the new files. Don’t replace the newly-generated configuration files with the old files.

    You can use the DSE yaml_diff tool to compare backup YAML files with the upgraded YAML files:

    cd /usr/share/dse/tools/yamls
    ./yaml_diff path/to/yaml-file-old path/to/yaml-file-new
    Result
    ...
    CHANGES
    =========
    authenticator:
    - AllowAllAuthenticator
    + com.datastax.bdp.cassandra.auth.DseAuthenticator
    
    authorizer:
    - AllowAllAuthorizer
    + com.datastax.bdp.cassandra.auth.DseAuthorizer
    
    roles_validity_in_ms:
    - 2000
    + 120000
    ...
  6. If upgrading from Cassandra 3.11.2 or later, comment out the enable_materialized_views and enable_sasi_indexes parameters in cassandra.yaml if they exist.

    Where is the cassandra.yaml file?

    The location of the cassandra.yaml file depends on the type of installation:

    • Package installations and Installer-Services installations: /etc/dse/cassandra/cassandra.yaml

    • Tarball installations and Installer-No Services installations: <installation_location>/resources/cassandra/conf/cassandra.yaml

  7. Start the node.

  8. Verify that the upgraded datacenter names match the datacenter names in the keyspace schema definition:

    1. Get the node’s datacenter name:

      nodetool status | grep "Datacenter"
      Result
      Datacenter: datacenter-name
    2. Verify that the node’s datacenter name matches the datacenter name for a keyspace:

      cqlsh --execute "DESCRIBE KEYSPACE keyspace-name;" | grep "replication"
      Result
      CREATE KEYSPACE keyspace-name WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter-name': '3'};
  9. Review the logs for warnings, errors, and exceptions:

    grep -E 'WARN|ERROR|[Ee]xception' /var/log/cassandra/*.log

    Warnings, errors, and exceptions are frequently found in the logs when starting an upgraded node. Some of these log entries are informational to help you execute specific upgrade-related steps. If you find unexpected warnings, errors, or exceptions, contact DataStax Support.

    Non-standard log locations are configured in dse-env.sh.

  10. Run nodetool repair:

    nodetool repair -pr

    Throughout the upgrade process, make sure that you eventually run nodetool repair on each node in each upgraded datacenter.

  11. Repeat the upgrade process on each node in the cluster following the recommended upgrade order.

  12. After the entire cluster upgrade is complete, upgrade the SSTables on one node at a time or, when using racks, one rack at a time.

    You must upgrade SSTables on your nodes before and after upgrading. Failure to upgrade SSTables will result in severe performance degradation, increased disk usage, and possible data loss.

    The upgrade isn’t complete until the SSTables are upgraded.

    nodetool upgradesstables

    You can use the --jobs option to set the number of SSTables that upgrade simultaneously. The default setting is 2, which minimizes impact on the cluster. Set to 0 to use all available compaction threads. DataStax recommends running the upgradesstables command on one node at a time or, when using racks, one rack at a time.

    You can run the upgradesstables command before all the nodes are upgraded as long as you run the command on only one node at a time, or, when using racks, one rack at a time. Running upgradesstables on too many nodes at once degrades performance.
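
The datacenter name check in step 8 can be repeated for every datacenter and keyspace with two small helpers. This is a sketch that assumes the output formats shown in that step. The function names are hypothetical, and keyspace-name remains a placeholder for your own keyspace.

```shell
# Read `nodetool status` output from stdin and print each datacenter name.
datacenters_from_status() {
  awk '/^Datacenter:/ { print $2 }'
}

# Read DESCRIBE KEYSPACE output from stdin and succeed when the given
# datacenter name appears as a quoted key on the replication line.
#   $1: datacenter name to look for
keyspace_uses_datacenter() {
  grep "replication" | grep -qF "'$1':"
}

# Example usage (not run here):
#   for dc in $(nodetool status | datacenters_from_status); do
#     cqlsh --execute "DESCRIBE KEYSPACE keyspace-name;" \
#       | keyspace_uses_datacenter "$dc" || echo "keyspace does not reference $dc"
#   done
```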

Post-upgrade steps

Your clusters are now upgraded to DSE 5.1. To continue your upgrade to DSE 6.8 or 6.9, follow the upgrade guide for your target version.

© Copyright IBM Corporation 2025

Apache, Apache Cassandra, Cassandra, Apache Tomcat, Tomcat, Apache Lucene, Apache Solr, Apache Hadoop, Hadoop, Apache Pulsar, Pulsar, Apache Spark, Spark, Apache TinkerPop, TinkerPop, Apache Kafka and Kafka are either registered trademarks or trademarks of the Apache Software Foundation or its subsidiaries in Canada, the United States and/or other countries. Kubernetes is the registered trademark of the Linux Foundation.
