Upgrading from Apache Cassandra to DataStax Enterprise 

Instructions for upgrading from Apache Cassandra to DataStax Enterprise.

Attention: You have probably seen the recommendation to read all the instructions. This is a time when careful review of the upgrade instructions will make a difference. By understanding what to do beforehand, you can ensure a smooth upgrade and avoid pitfalls and frustrations.

Read and understand these instructions before upgrading.

Upgrade order 

Upgrade order matters. Upgrade nodes in this order:

  • In multiple datacenter clusters, upgrade every node in one datacenter before moving on to another datacenter.
  • Upgrade the seed nodes within a datacenter first.
  • Upgrade analytics nodes or datacenters first, then Cassandra nodes or datacenters, and finally search nodes or datacenters.
Upgrade paths
The upgrade path depends on the version you are upgrading from and the version you are upgrading to. The greater the gap between the current version and the target version, the more complex the upgrade. Upgrades from earlier versions might first require an interim upgrade to a required version:
Current version             Required interim version    Target version
Apache Cassandra™ 3.x       none                        DSE 5.1
Cassandra 3.0               DataStax Enterprise 5.0     DSE 5.1
Cassandra 2.1               DataStax Enterprise 4.8     DSE 5.0
Cassandra 2.0 and earlier   Contact DataStax Support.
General upgrade restrictions
  • Do not enable new features.
  • Do not run nodetool repair.
  • During the upgrade, do not bootstrap or decommission nodes.
  • Do not issue DDL or TRUNCATE statements in CQL during a rolling restart.
  • During the upgrade, the nodes on different versions might show a schema disagreement.
  • Failure to upgrade SSTables when required results in a significant performance impact and increased disk usage. Upgrading is not complete until the SSTables are upgraded.
  • Do not enable CDC on a mixed-version cluster. Upgrade all nodes to DataStax Enterprise 5.1 before enabling CDC.
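
One way to keep an eye on the schema disagreement mentioned above is to count the distinct schema versions that nodetool describecluster reports. The following is only a sketch: it assumes the usual output layout, in which each schema version appears as a UUID followed by a bracketed list of node addresses, and the function name schema_version_count is made up for this example.

```shell
# Reads `nodetool describecluster` output on stdin and prints how many
# distinct schema versions it lists.
# Assumption: each version line looks like "<uuid>: [address, address, ...]".
schema_version_count() {
    # grep -c exits non-zero when the count is 0; "|| true" keeps the
    # function usable in scripts that run with "set -e".
    grep -E -c '^[[:space:]]*[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}: \[' || true
}
```

Typical use would be nodetool describecluster | schema_version_count. A result of 1 suggests that the reachable nodes agree on the schema; a larger number while nodes are on different versions is not by itself a sign of a problem.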
Restrictions for nodes using any kind of security
  • Do not change security credentials or permissions until after the upgrade is complete.
Upgrading drivers and possible impact when driver versions are incompatible
Be sure to check driver compatibility. Your driver might not be compatible with the upgrade version, or it might require recompiling.
During upgrades, you might experience driver-specific impact when clusters have mixed versions of drivers. If your cluster has mixed versions, the protocol version is negotiated with the first host that the driver connects to. To avoid driver version incompatibility during upgrades, use one of these workarounds:
  • Force a protocol version at startup. For example, keep the Java driver at v2 while the upgrade is happening. Switch to the Java driver v3 only after the entire cluster is upgraded.
  • Ensure that the list of initial contact points contains only hosts with the oldest driver version. For example, the initial contact points contain only Java driver v2.
For details on protocol version negotiation, see the section on protocol versions with mixed clusters in the documentation for the Java driver version you are using.
Upgrade order 
Upgrade order matters. In multiple datacenter clusters, upgrade every node in one datacenter before upgrading another datacenter.
  1. Analytics datacenters: seed nodes first, and then the rest of the analytics nodes.
  2. Cassandra (transactional) nodes or datacenters
  3. Search nodes or datacenters
With a few exceptions, the cluster continues to work as though it were on the earlier version until all of the nodes in the cluster are upgraded. Upgrade and restart the nodes one at a time. Nodes that have not yet been upgraded continue to operate at the earlier version.
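
As a rough illustration of the ordering above, the following shell sketch sorts a node inventory into upgrade order. The inventory format (datacenter, workload, seed flag, and host on each line) and the function name upgrade_order are assumptions made for this example; DataStax Enterprise does not provide such a file or command.

```shell
# Hypothetical inventory format on stdin, one node per line:
#   <datacenter> <workload: analytics|cassandra|search> <seed: yes|no> <host>
# Prints hosts in the recommended order: analytics first, then Cassandra
# (transactional), then search. Nodes of one datacenter stay together,
# and seed nodes come first within each datacenter.
upgrade_order() {
    awk '
        BEGIN { rank["analytics"] = 1; rank["cassandra"] = 2; rank["search"] = 3 }
        NF == 4 && ($2 in rank) {
            seed = ($3 == "yes") ? 0 : 1
            print rank[$2], $1, seed, $4
        }
    ' | sort -k1,1n -k2,2 -k3,3n -k4,4 | awk '{ print $4 }'
}
```

Lines with an unknown workload or the wrong number of fields are skipped rather than guessed at, so review the output against your actual topology before relying on it.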

dse.yaml 

The location of the dse.yaml file depends on the type of installation:

Package installations and Installer-Services installations:

/etc/dse/dse.yaml

Tarball installations and Installer-No Services installations:

install_location/resources/dse/conf/dse.yaml

cassandra.yaml 

The location of the cassandra.yaml file depends on the type of installation:

Package installations and Installer-Services installations:

/etc/cassandra/cassandra.yaml

Tarball installations and Installer-No Services installations:

install_location/conf/cassandra.yaml
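
Because the paths differ by installation type, a script that reads or backs up either file may prefer to probe the known locations rather than hard-code one. The following is a minimal sketch; the function name first_existing and the INSTALL_LOCATION variable are placeholders invented for this example, with INSTALL_LOCATION standing in for install_location above.

```shell
# Prints the first argument that exists as a regular file.
# Returns non-zero (and prints nothing) if none of them exist.
first_existing() {
    for candidate in "$@"; do
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Example use (INSTALL_LOCATION is a placeholder you must set yourself):
#   first_existing /etc/dse/dse.yaml \
#       "$INSTALL_LOCATION/resources/dse/conf/dse.yaml"
```

The same helper works for cassandra.yaml by passing its two candidate paths instead.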

Procedure

Follow these steps on each node:

  1. Back up your data. DataStax recommends a backup before any version upgrade, including an upgrade from any Apache Cassandra™ version to DataStax Enterprise. A backup provides the ability to revert and restore all the data used in the previous version if necessary. See Backing up and restoring Cassandra data.
  2. Familiarize yourself with the changes and features in the release:
    • DataStax Enterprise release notes for 4.7, 4.8, 5.0, and 5.1.
    • General upgrade advice and Cassandra features in NEWS.txt/DSE CHANGES.txt. If you are upgrading from an earlier release, read NEWS.txt all the way back to your current version.
    • Ensure that your version of Cassandra can be upgraded directly to the version of Cassandra that is used by DataStax Enterprise. See the Cassandra changes in CHANGES.txt/DSE CHANGES.txt.
  3. Upgrade the SSTables on each node to ensure that all SSTables are on the current version.
    $ nodetool upgradesstables
    Warning: Failure to upgrade SSTables when required results in a significant performance impact and increased disk usage.
  4. Run nodetool drain to flush the commit log of the old installation:
    $ nodetool drain -h hostname
    This step saves time when nodes start up after the upgrade, and prevents DSE Search nodes from having to reindex data.
  5. Stop the node. See the instructions for your Cassandra version (2.1, 2.2, or 3.0).
  6. Back up your configuration files.

    Keeping a backup ensures that your settings are not lost if the configuration files are overwritten with the default values.

  7. Uninstall Cassandra.
    If you installed Cassandra from packages in APT or RPM repositories, you must remove the packages before setting up and installing DataStax Enterprise from the appropriate repository.
    • For packages installed from APT repositories:
      $ sudo apt-get autoremove "dsc*" "cassandra*" "apache-cassandra*"

      This action shuts down Cassandra if it is still running.

    • For packages installed from Yum repositories:
      $ sudo yum remove "dsc*" "cassandra*" "apache-cassandra*"

      The old Cassandra configuration file might be renamed to cassandra.yaml.rpmsave, for example:

      warning: /etc/cassandra/default.conf/cassandra.yaml
      saved as /etc/cassandra/default.conf/cassandra.yaml.rpmsave
    • For installations from a binary tarball, find the Cassandra process ID and stop the process:
      $ ps auwx | grep cassandra
      $ sudo kill cassandra_pid

      Then remove the Cassandra installation directory.

  8. Install DataStax Enterprise using one of the supported installation methods for your target version.
  9. To configure the product, use your backup configuration files to merge any necessary modifications into the new configuration files.
  10. Start the node:
    • Packages and Installer-Services installations: See Starting DataStax Enterprise as a service (4.8, 5.0, 5.1).
    • Installer-No Services and Tarball installations: See Starting DataStax Enterprise as a stand-alone process (4.8, 5.0, 5.1).
  11. Optional: To ensure optimal performance, upgrade the SSTables on each node now that the upgrade is complete.
    $ nodetool upgradesstables
    If the SSTables are already on the current version, the command returns immediately and no action is taken.
  12. Review the logs for warnings, errors, and exceptions.
    Warnings, errors, and exceptions are frequently found in the logs when starting up an upgraded node. Some of these log entries are informational to help you execute specific upgrade-related steps. If you find unexpected warnings, errors, or exceptions, contact DataStax Support.
  13. Verify that the upgraded datacenter names still match the datacenter names that are used in the keyspace schema definition:
    $ nodetool status
  14. Repeat the upgrade on each node in the cluster following the recommended upgrade order.
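
For the datacenter-name verification step, it can help to reduce the nodetool status output to just the datacenter names before comparing them with the names in your keyspace replication settings. The sketch below relies on the "Datacenter: name" header line that nodetool status prints for each datacenter; the function name list_datacenters is an assumption of this example.

```shell
# Reads `nodetool status` output on stdin and prints each datacenter
# name once, in sorted order.
list_datacenters() {
    sed -n 's/^Datacenter:[[:space:]]*//p' | sort -u
}
```

Typical use would be nodetool status | list_datacenters. Compare the resulting names by hand against the datacenter names in each keyspace definition, since a mismatch in spelling or case is easy to miss in the full status output.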