Upgrading from Apache Cassandra to DataStax Enterprise

Instructions for upgrading from Apache Cassandra to DataStax Enterprise.

Attention: Read and understand these instructions before upgrading. Carefully reviewing the planning and upgrade instructions can prevent errors and data loss.

DataStax driver changes

DataStax drivers come in two types:

  • DataStax drivers for DataStax Enterprise — for use by DSE 4.8 and later
  • DataStax drivers for Apache Cassandra™ — for use by Apache Cassandra™ and DSE 4.7 and earlier
Note: Although the DataStax drivers for Apache Cassandra can connect to DSE 5.0 and later clusters, DataStax strongly recommends upgrading to the DSE drivers, which provide functionality for all DataStax Enterprise features.

Upgrade order

Upgrade order matters. Upgrade nodes in this order:

  • In multiple datacenter clusters, upgrade every node in one datacenter before moving on to another datacenter.
  • Upgrade the seed nodes within a datacenter first.

Upgrade paths

Upgrades are impacted by the version you are upgrading from and the version you are upgrading to. The greater the gap between the current version and the target version, the more complex the upgrade. Upgrades from earlier versions may require an interim upgrade to a required version:

Upgrade from Apache Cassandra™   Upgrade to DataStax Enterprise   Required interim version
Cassandra 3.0 and 3.11           DSE 6.7                          Not required
Cassandra 3.0 and 3.11           DSE 6.0                          Not required
Cassandra 3.0 and 3.11           DSE 5.1                          Not required
Cassandra 3.0                    DSE 5.0                          Not required
Cassandra 2.1                    DSE 5.0                          DSE 4.8
Cassandra 2.0 and earlier                                         Cassandra 2.1
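
The upgrade paths listed above can be captured as a small lookup, which may be handy in a pre-upgrade checklist script. This is a minimal sketch: the function name required_interim and its argument format (major.minor version strings) are hypothetical, and the row for Cassandra 2.0 and earlier is assumed to apply to any target version because the source lists no target for it.

```shell
#!/bin/sh
# Hypothetical helper: print the required interim version for an upgrade path.
#   $1 = current Apache Cassandra version (major.minor, for example 3.11)
#   $2 = target DataStax Enterprise version (major.minor, for example 6.7)
required_interim() {
    from="$1"
    to="$2"
    case "$from:$to" in
        3.0:6.7|3.11:6.7|3.0:6.0|3.11:6.0|3.0:5.1|3.11:5.1|3.0:5.0)
            echo "Not required" ;;
        2.1:5.0)
            echo "DSE 4.8" ;;
        2.0:*|1.*:*|0.*:*)
            # Assumption: Cassandra 2.0 and earlier goes through Cassandra 2.1
            # regardless of the target version.
            echo "Cassandra 2.1" ;;
        *)
            # Path not listed in the table.
            echo "Not listed; contact DataStax Support"
            return 1 ;;
    esac
}
```

For example, required_interim 2.1 5.0 prints DSE 4.8. Paths that are not in the table return a non-zero status so that a calling script can stop.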

Questions? Contact DataStax Support.

General restrictions during an upgrade
  • Do not enable new features.
  • Do not run nodetool repair. If you have the OpsCenter Repair Service configured, turn off the Repair Service.
  • During the upgrade, do not bootstrap or decommission nodes.
  • Do not issue these types of CQL queries during a rolling restart: DDL and TRUNCATE.
  • Do not enable Change Data Capture (CDC) on a mixed-version cluster. Upgrade all nodes to DSE 5.1 or later before enabling CDC.
  • Failure to upgrade SSTables results in a significant performance impact and increased disk usage. Upgrading is not complete until the SSTables are upgraded.
Note: Nodes on different versions might show a schema disagreement during an upgrade.
  • NodeSync waits to start until all nodes are upgraded.
  • Ensure OpsCenter compatibility. OpsCenter 6.7 is required for managing DSE 6.7 clusters. See DSE OpsCenter compatibility with DataStax Enterprise.
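
Because nodes on different versions might show a schema disagreement during the upgrade, it can help to track how many schema versions the cluster currently reports. The sketch below counts them from the output of nodetool describecluster. The function name count_schema_versions is hypothetical, and the parsing assumes each schema version appears on its own line as a UUID followed by a bracketed list of node addresses.

```shell
#!/bin/sh
# Hypothetical helper: count the distinct schema versions reported in
# `nodetool describecluster` output, read from standard input.
# Assumption: each version is printed as "<uuid>: [<node addresses>]".
count_schema_versions() {
    grep -cE '^[[:space:]]*[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}: \['
}
```

Usage: nodetool describecluster | count_schema_versions. A result greater than 1 indicates that nodes currently disagree on the schema, which can be expected while the cluster is on mixed versions.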
Restrictions for nodes using any kind of security
  • Do not change security credentials or permissions until after the upgrade is complete.
Upgrading drivers and possible impact when driver versions are incompatible
Be sure to check driver compatibility. Depending on the driver version, you might need to recompile your client application code. See DataStax driver changes.
During upgrades, you might experience driver-specific impact when clusters have mixed versions of drivers. If your cluster has mixed versions, the protocol version is negotiated with the first host that the driver connects to. To avoid driver version incompatibility during upgrades, use one of these workarounds:
  • Protocol version: Because some drivers can use different protocol versions, force the protocol version at startup. For example, keep the Java driver at its current protocol version while the upgrade is in progress. Switch the Java driver to the new protocol version only after the upgrade is complete on all nodes in the cluster.
  • Initial contact points: Ensure that the list of initial contact points contains only hosts with the oldest driver version. For example, the initial contact points contain only Java driver v2.
For details on protocol version negotiation, see the topic on protocol versions with mixed clusters in the documentation for the driver version you are using, for example, the Java driver.
Upgrade order
Upgrade order matters. In multiple datacenter clusters, upgrade every node in one datacenter before upgrading another datacenter.
  1. Analytics datacenters: seed nodes first, and then the rest of the analytics nodes.
  2. Cassandra (transactional) nodes or datacenters
  3. Search nodes or datacenters
With a few exceptions, the cluster continues to work as though it were on the earlier version of DataStax Enterprise until all of the nodes in the cluster are upgraded. Upgrade and restart the nodes one at a time; nodes that have not yet been upgraded continue to operate at the earlier version.
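
The ordering rules above can be sketched as a sort over a node inventory. The input format here is an assumption made for illustration: one node per line with four whitespace-separated fields (host, datacenter, workload, and yes or no for seed), where the workload labels analytics, cassandra, and search are hypothetical names for the three node types. The function name upgrade_order is also hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print hosts in the recommended upgrade order.
# Input (stdin): "<host> <datacenter> <workload> <seed: yes|no>" per line.
# Order: analytics, then cassandra (transactional), then search;
# one datacenter at a time; seed nodes first within each datacenter.
upgrade_order() {
    awk '{
        rank = 9                                  # unknown workloads go last
        if ($3 == "analytics") rank = 1
        else if ($3 == "cassandra") rank = 2
        else if ($3 == "search") rank = 3
        seed = ($4 == "yes") ? 0 : 1              # seeds sort before non-seeds
        print rank, $2, seed, $1
    }' | sort -k1,1n -k2,2 -k3,3n -k4,4 | awk '{ print $4 }'
}
```

Usage: upgrade_order < nodes.txt, where nodes.txt is an inventory file you maintain. The sketch only orders the hosts; each node is still upgraded and restarted one at a time.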

Procedure

Follow these steps on each node:

  1. Back up your data, including logs and custom configurations, before upgrading to DataStax Enterprise from any Apache Cassandra™ version. A backup lets you revert and restore all the data used in the previous version if necessary. See Backing up and restoring Cassandra data.
  2. Familiarize yourself with the changes and features in the release:
    • DataStax Enterprise release notes for 4.7, 4.8, 5.0, and 5.1.
    • General upgrade advice and Cassandra features in NEWS.txt/DSE CHANGES.txt. If you are upgrading from an earlier release, read NEWS.txt all the way back to your current version.
    • Ensure that your version of Cassandra can be upgraded directly to the version of Cassandra that is used by DataStax Enterprise. See the Cassandra changes in CHANGES.txt/DSE CHANGES.txt.
  3. Upgrade the SSTables on each node to ensure that all SSTables are on the current version.
    nodetool upgradesstables
    Warning: Failure to upgrade SSTables results in a significant performance impact and increased disk usage.
  4. Run nodetool drain to flush the commit log of the old installation:
    nodetool -h hostname drain
    This step saves time when nodes start up after the upgrade, and prevents DSE Search nodes from having to reindex data.
  5. Stop the node. See the instructions for Cassandra 2.1, 2.2, or 3.0.
  6. Back up your configuration files.

    Keeping a copy preserves your settings if the configuration files are overwritten with the default values during the uninstall and install steps.

  7. Uninstall Cassandra.
    If you installed Cassandra from packages in APT or RPM repositories, you must remove the packages before setting up and installing DataStax Enterprise from the appropriate repository.
    • For packages installed from APT repositories:
      sudo apt-get autoremove "dsc*" "cassandra*" "apache-cassandra*"

      This action shuts down Cassandra if it is still running.

    • For packages installed from Yum repositories:
      sudo yum remove "dsc*" "cassandra*" "apache-cassandra*"

      The old Cassandra configuration file might be renamed to cassandra.yaml.rpmsave, for example:

      warning: /etc/cassandra/default.conf/cassandra.yaml
      saved as /etc/cassandra/default.conf/cassandra.yaml.rpmsave
    • For installations from a binary tarball, find the Cassandra process ID and stop the process:
      ps auwx | grep cassandra
      sudo kill cassandra_pid

      Then remove the Cassandra installation directory.

  8. Install DataStax Enterprise using one of the following:
  9. To configure the product, use your backup configuration files to merge any necessary modifications into the new configuration files.
  10. Start the node:
    • Packages and Installer-Services installations: See Starting DataStax Enterprise as a service (4.8, 5.0, 5.1).
    • Installer-No Services and Tarball installations: See Starting DataStax Enterprise as a stand-alone process (4.8, 5.0, 5.1).
  11. Optional: To ensure optimal performance, upgrade the SSTables on each node now that the upgrade is complete.
    nodetool upgradesstables
    If the SSTables are already on the current version, the command returns immediately and no action is taken.
  12. Review the logs for warnings, errors, and exceptions.
    Warnings, errors, and exceptions are frequently found in the logs when starting up an upgraded node. Some of these log entries are informational to help you execute specific upgrade-related steps. If you find unexpected warnings, errors, or exceptions, contact DataStax Support.
  13. Verify that the upgraded datacenter names still match the datacenter names that are used in the keyspace schema definition:
    nodetool status
  14. Repeat the upgrade on each node in the cluster following the recommended order.
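
Steps 3 and 4 of the procedure (upgrading SSTables and draining the node) can be wrapped in a small script so that they run the same way on every node. This is a minimal sketch, not a complete upgrade tool: the function names run and prepare_node and the DRY_RUN variable are hypothetical, and the script defaults to printing the commands rather than running them. It uses only the nodetool commands shown in the procedure.

```shell
#!/bin/sh
# Hypothetical helper: print the command when DRY_RUN is 1 (the default here),
# otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: pre-upgrade preparation for one node.
#   $1 = hostname of the node
prepare_node() {
    host="$1"
    # Step 3: bring all SSTables to the current version.
    run nodetool -h "$host" upgradesstables || return 1
    # Step 4: flush the commit log before stopping the node.
    run nodetool -h "$host" drain || return 1
}
```

With DRY_RUN left at its default, prepare_node node1 only prints the two nodetool commands for review. Setting DRY_RUN=0 runs them and stops at the first failure.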