Upgrading DataStax Enterprise 6.8 to 6.9

The upgrade process for DataStax Enterprise (DSE) provides minimal downtime (ideally zero). During this process, upgrade and restart one node at a time while other nodes continue to operate online. With a few exceptions, the cluster continues to work as though it were on the earlier version of DSE until all of the nodes in the cluster are upgraded.

For databases with the tuple data type, upgrading to DSE 6.8.36 or later from a version earlier than 6.8.35 requires a two-step process to prevent data loss.

Upgrade databases with the tuple data type to 6.8.35 first. After you upgrade to 6.8.35, update the SSTables, then proceed with the upgrade to 6.8.36 or later. Upgrading directly to 6.8.36 or later without upgrading to 6.8.35 first will result in data loss.

Read and understand these instructions before upgrading. Carefully reviewing the planning and upgrade instructions can prevent errors and data loss. In addition, review the DSE 6.9 release notes for all changes before upgrading.

DataStax Enterprise and Apache Cassandra® configuration files

DataStax Enterprise (DSE) configuration files
Configuration file | Installer-Services and package installations | Installer-No Services and tarball installations
dse | /etc/default/dse (systemd) or /etc/init.d/ (SystemV) | N/A. Node type is set via command-line flags.
dse-env.sh | /etc/dse/dse-env.sh | <installation_location>/bin/dse-env.sh
byoh-env.sh | /etc/dse/byoh-env.sh | <installation_location>/bin/byoh-env.sh
dse.yaml | /etc/dse/dse.yaml | <installation_location>/resources/dse/conf/dse.yaml
logback.xml | /etc/dse/cassandra/logback.xml | <installation_location>/resources/logback.xml
spark-env.sh | /etc/dse/spark/spark-env.sh | <installation_location>/resources/spark/conf/spark-env.sh
spark-defaults.conf | /etc/dse/spark/spark-defaults.conf | <installation_location>/resources/spark/conf/spark-defaults.conf

Cassandra configuration files

Configuration file | Installer-Services and package installations | Installer-No Services and tarball installations
cassandra.yaml | /etc/dse/cassandra/cassandra.yaml | <installation_location>/conf/cassandra.yaml
cassandra.in.sh | /usr/share/cassandra/cassandra.in.sh | <installation_location>/bin/cassandra.in.sh
cassandra-env.sh | /etc/dse/cassandra/cassandra-env.sh | <installation_location>/conf/cassandra-env.sh
cassandra-rackdc.properties | /etc/dse/cassandra/cassandra-rackdc.properties | <installation_location>/conf/cassandra-rackdc.properties
cassandra-topology.properties | /etc/dse/cassandra/cassandra-topology.properties | <installation_location>/conf/cassandra-topology.properties
jmxremote.password | /etc/cassandra/jmxremote.password | <installation_location>/conf/jmxremote.password

Tomcat server configuration file
Configuration file | Installer-Services and package installations | Installer-No Services and tarball installations
server.xml | /etc/dse/resources/tomcat/conf/server.xml | <installation_location>/resources/tomcat/conf/server.xml

Upgrade order

Upgrade nodes in this order:

  1. In multiple datacenter clusters, upgrade every node in one datacenter before upgrading another datacenter.

  2. Upgrade the seed nodes within a datacenter first.

  3. DSE Analytics datacenters

    1. For DSE Analytics nodes using DSE Hadoop, upgrade the Job Tracker node first. Then upgrade Hadoop nodes, followed by Spark nodes.

  4. Transactional/DSE Graph datacenters

  5. DSE Search nodes or datacenters
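The ordering rules above can be expressed as a sort key. The following is a minimal sketch, not a DataStax tool: the `Node` record, its field names, and the workload labels are hypothetical names chosen for illustration. It also assumes that a datacenter with mixed workloads takes the position of its earliest-ranked workload.

```python
from dataclasses import dataclass

# Hypothetical workload labels, ranked in the order listed above:
# Analytics first, then Transactional/Graph, then Search.
WORKLOAD_RANK = {"analytics": 0, "transactional": 1, "graph": 1, "search": 2}


@dataclass(frozen=True)
class Node:
    """Illustrative node record; the field names are assumptions."""
    address: str
    datacenter: str
    workload: str
    is_seed: bool = False


def upgrade_order(nodes):
    """Sort nodes so that each datacenter is finished before the next one
    starts, datacenters are visited by workload type, and seed nodes come
    first within each datacenter."""
    # Rank each datacenter by the earliest-ranked workload it contains.
    dc_rank = {}
    for node in nodes:
        rank = WORKLOAD_RANK[node.workload]
        dc_rank[node.datacenter] = min(rank, dc_rank.get(node.datacenter, rank))

    return sorted(
        nodes,
        key=lambda n: (
            dc_rank[n.datacenter],  # datacenter type order
            n.datacenter,           # keep each datacenter contiguous
            not n.is_seed,          # seeds first (False sorts before True)
            n.address,              # stable tie-breaker
        ),
    )
```

Sorting by datacenter name after the workload rank keeps all nodes of one datacenter together, which is what rule 1 requires.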

Back up your existing installation

DataStax recommends backing up your data prior to any version upgrade.

A backup provides the ability to revert and restore all the data used in the previous version if necessary. For manual backup instructions, see Backing up a tarball installation or Backing up a package installation.

Instead of manual processes, automate the management of enterprise-wide backup and restore cluster operations using Mission Control or OpsCenter. Ensure you use a compatible version of OpsCenter for your DSE version.

Upgrade SSTables

Ensure you upgrade SSTables on your nodes both before and after upgrading the software binaries. Failure to upgrade SSTables will result in severe performance penalties and possible data loss.

Upgrade restrictions and limitations

Restrictions and limitations apply while a cluster is in a partially upgraded state. The cluster operates as if it’s still running the previous version of DataStax Enterprise until all nodes in the cluster have been upgraded.

General restrictions

  • Do not enable new features.

  • Ensure Mission Control or OpsCenter compatibility.

    Compatibility

    OpsCenter version | DSE version
    6.8 | 6.8, 6.7, 6.0, 5.1
    6.7 | 6.0
    6.5 | 6.0, 5.1, 5.0 (EOL)
    6.1 | 5.1, 5.0 (EOL)
    6.0 | 5.0 (EOL), 4.8 (EOSL), 4.7 (EOSL)

  • Do not run nodetool repair.

  • Stop the OpsCenter Repair Service if enabled.

  • During the upgrade, do not bootstrap new nodes or decommission existing nodes.

  • Do not issue TRUNCATE or DDL related queries during the upgrade process.

  • Do not alter schemas for any workloads.

  • Complete the cluster-wide upgrade before the expiration of gc_grace_seconds (default 10 days) to ensure any repairs complete successfully.

  • If you disabled the DSE Performance Service before the upgrade, do not enable it during the upgrade.

Nodes on different versions might show a schema disagreement during an upgrade.
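The gc_grace_seconds restriction above amounts to a simple deadline calculation. The sketch below assumes the default of 864000 seconds (10 days). Because gc_grace_seconds is a per-table option, use the smallest value configured in your schema. The function names are illustrative, and the start time is whatever reference point you choose (for example, when the pre-upgrade repair finished), which is an assumption to confirm for your cluster.

```python
from datetime import datetime, timedelta, timezone

# Default gc_grace_seconds: 864000 seconds, which is 10 days.
DEFAULT_GC_GRACE_SECONDS = 864_000


def upgrade_deadline(started_at, gc_grace_seconds=DEFAULT_GC_GRACE_SECONDS):
    """Return the time by which the cluster-wide upgrade should be complete."""
    return started_at + timedelta(seconds=gc_grace_seconds)


def time_remaining(started_at, now, gc_grace_seconds=DEFAULT_GC_GRACE_SECONDS):
    """Return the time left before the deadline (negative if it has passed)."""
    return upgrade_deadline(started_at, gc_grace_seconds) - now
```

For example, with a start time of `datetime(2024, 1, 1, tzinfo=timezone.utc)` and the default setting, `upgrade_deadline` returns January 11, 2024 at 00:00 UTC.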

Restrictions for nodes using security

  • Do not change security credentials or permissions until the upgrade is complete on all nodes.

  • If you are not already using Kerberos, do not set up Kerberos authentication before upgrading. First upgrade the cluster, and then set up Kerberos.

Restrictions for DSE Analytics nodes

Spark versions change between major DSE versions. The DSE release notes (6.8.x | 6.9.x) indicate which version of Spark is used.

When upgrading to a major version of DSE, all nodes in a DSE datacenter that run Spark must be on the same version of Spark and the Spark jobs must be compiled for that version. Each datacenter acting as a Spark cluster must be on the same upgraded DSE version before reinitiating Spark jobs.

If Spark jobs run against Graph keyspaces, upgrade all of the nodes in the cluster first to avoid Spark job failures.

Restrictions for DSE Advanced Replication nodes

Upgrades are supported only for DSE Advanced Replication V2.

Driver version impacts

Be sure to check driver compatibility. Depending on the driver version, you might need to recompile your client application code.

During upgrades, you might experience driver-specific impact when clusters have mixed versions of drivers. If your cluster has mixed versions, the protocol version is negotiated with the first host to which the driver connects, although certain drivers, such as Java 4.x/2.x, automatically select a protocol version that works across nodes. To avoid driver version incompatibility during upgrades, use one of these workarounds:

  • Protocol version: Set the protocol version explicitly in your application at startup. Switch the Java driver to the new protocol version only after the upgrade is complete on all nodes in the cluster.

  • Initial contact points: Ensure that the list of initial contact points contains only hosts with the oldest DSE version or protocol version. For example, the initial contact points contain only hosts that use protocol version 2.

For details on protocol version negotiation, check the documentation for the driver you’re using. For example, Protocol version with mixed clusters in the Java driver.
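The initial contact points workaround can be sketched as a filter over a host-to-version mapping. This is an illustrative helper, not part of any DataStax driver: the function names and the shape of the input mapping are assumptions, and how you collect each host's DSE version is outside the sketch.

```python
def parse_version(version):
    """Turn a dotted version such as '6.8.35' into a comparable tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def oldest_version_hosts(host_versions):
    """Return the hosts (sorted) that run the oldest DSE version.

    host_versions maps a host address to its DSE version string.
    """
    if not host_versions:
        return []
    oldest = min(parse_version(v) for v in host_versions.values())
    return sorted(
        host for host, v in host_versions.items() if parse_version(v) == oldest
    )
```

Comparing versions as integer tuples rather than strings matters here: as strings, "6.8.9" would sort after "6.8.35". Pass the returned list to your driver as its contact points until every node is upgraded.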

Starting in January 2020, you can use the same DataStax driver for Apache Cassandra (OSS) and DataStax Enterprise. DataStax provides unified drivers to avoid user confusion and enhance the OSS drivers with some of the features in the DSE drivers. For more information, see the Better Drivers for Cassandra blog.

Preparing to upgrade

Follow these steps to prepare each node for the upgrade:

Perform these steps in your DSE 6.8 version using the latest documentation.

  1. Upgrade to the latest patch release on your current version. The latest patch release includes fixes that can simplify the upgrade process.

    Get the current DSE version:

    bin/dse -v
    current_dse_version
  2. Familiarize yourself with the changes and features in the new release.

  3. Before upgrading, be sure that each node has adequate free disk space.

    Determine current DSE data disk space usage:

    $ sudo du -sh /var/lib/cassandra/data/
    Results
    3.9G	/var/lib/cassandra/data/

    Determine available disk space:

    $ sudo df -hT /
    Results
    Filesystem     Type  Size  Used Avail Use% Mounted on
    /dev/sda1      ext4   59G   16G   41G  28% /

    The required space depends on the compaction strategy. See Disk space.

  4. Upgrade the SSTables on each node to ensure that all SSTables are on the current version:

    $ nodetool upgradesstables

    Failure to upgrade SSTables when required results in a significant performance impact and increased disk usage.

    Use the --jobs option to set the number of SSTables that upgrade simultaneously. The default setting is 2, which minimizes impact on the cluster. Set to 0 to use all available compaction threads. DataStax recommends running the upgradesstables command on one node at a time or when using racks, one rack at a time.

    If the SSTables are already on the current version, the command returns immediately and no action is taken.

  5. Ensure that keyspace replication settings are correct for your environment:

    $ cqlsh --execute "DESCRIBE KEYSPACE keyspace-name;" | grep "replication"
    CREATE KEYSPACE keyspace-name WITH replication = {'class': 'NetworkTopologyStrategy', 'replication_factor': '3'}  AND durable_writes = true;
  6. Verify the Java runtime version and upgrade to the recommended version.

    $ java -version
    Results
    java version "11.0.18" YYYY-MM-DD LTS
    Java(TM) SE Runtime Environment 18.9 (build 11.0.18+xx-LTS-219)
    Java HotSpot(TM) 64-Bit Server VM 18.9 (build 11.0.18+xx-LTS-219, mixed mode)
  7. Run nodetool repair to ensure that data on each replica is consistent with data on other nodes:

    nodetool repair -pr
  8. Back up any customized configuration files since they may be overwritten with default values during installation of the new version.

    If you backed up your installation using the instructions in Backing up a tarball installation or Backing up a package installation, the archive includes your original configuration files.
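The disk space check in step 3 can also be scripted. The sketch below is a minimal illustration with hypothetical function names. The `headroom_factor` parameter is a placeholder, not a DataStax recommendation: the space actually required depends on the compaction strategy, so choose the value from the Disk space guidance.

```python
import shutil


def has_upgrade_headroom(data_bytes, free_bytes, headroom_factor=1.0):
    """Return True if free space is at least headroom_factor times the data size.

    headroom_factor is a placeholder; the real requirement depends on the
    compaction strategy in use.
    """
    if headroom_factor < 0:
        raise ValueError("headroom_factor must be non-negative")
    return free_bytes >= data_bytes * headroom_factor


def free_bytes_on(path):
    """Return the free bytes on the filesystem that holds path."""
    return shutil.disk_usage(path).free
```

With the figures shown in step 3 (3.9G of data and 41G available), `has_upgrade_headroom` returns True for a factor of 1.0.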

Upgrade steps

Follow these steps on each node in the recommended Upgrade order. The upgrade process requires upgrading and restarting one node at a time.

Perform these steps in your upgraded version, DSE 6.9.x.

For package and tarball installations, move the parameters from the jvm.options file to the jvm-server.options file because jvm.options is deprecated. You must also remove the jvm.options file from the installation directory.

  1. Flush the commit log of the current installation:

    nodetool drain
  2. Stop the node:

    • Package installations:

      sudo service dse stop
    • Tarball installations:

      installation_dir/bin/dse cassandra-stop
  3. Use the appropriate method to install the new product version on a supported platform.

  4. To configure the new version:

    1. After the upgrade but before restarting the node, compare the new configuration files with the backup configuration files. Remove deprecated settings and update any new settings if required.

      You must use the new configuration files that are generated from the upgrade installation. Copy any parameters needed from your old configuration files into these new files.

      Do not replace the newly-generated configuration files with the old files.

      Use the DSE yaml_diff tool to compare backup YAML files with the upgraded YAML files:

      cd /usr/share/dse/tools/yamls
      ./yaml_diff path/to/yaml-file-old path/to/yaml-file-new
      Results
      ...
       CHANGES
      =========
      authenticator:
      - AllowAllAuthenticator
      + com.datastax.bdp.cassandra.auth.DseAuthenticator
      
      authorizer:
      - AllowAllAuthorizer
      + com.datastax.bdp.cassandra.auth.DseAuthorizer
      
      roles_validity_in_ms:
      - 2000
      + 120000
      ...
  5. Start the node.

  6. Verify that the upgraded datacenter names match the datacenter names in the keyspace schema definition:

    • Get the node’s datacenter name:

      nodetool status | grep "Datacenter"
      Datacenter: datacenter-name
    • Verify that the node’s datacenter name matches the datacenter name for a keyspace:

      cqlsh --execute "DESCRIBE KEYSPACE keyspace-name;" | grep "replication"
      CREATE KEYSPACE keyspace-name WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter-name': '3'};
  7. Review the logs for warnings, errors, and exceptions:

    grep -w 'WARNING\|ERROR\|exception' /var/log/cassandra/*.log

    Warnings, errors, and exceptions are frequently found in the logs when starting an upgraded node. Some of these log entries are informational to help you execute specific upgrade-related steps. If you find unexpected warnings, errors, or exceptions, contact DataStax Support.

    Non-standard log locations are configured in dse-env.sh.

    The default location of this file depends on the type of installation:

    Package installations

    /etc/dse/dse-env.sh

    Tarball installations

    <installation_location>/bin/dse-env.sh

  8. Repeat the upgrade process on each node in the cluster following the recommended Upgrade order.

  9. After the entire cluster upgrade is complete, upgrade the SSTables on one node at a time or, when using racks, one rack at a time.

    Failure to upgrade SSTables when required results in a significant performance impact, increased disk usage, and possible data loss. Upgrading is not complete until the SSTables are upgraded.

    nodetool upgradesstables

    Use the --jobs option to set the number of SSTables that upgrade simultaneously. The default setting is 2, which minimizes impact on the cluster. Set to 0 to use all available compaction threads. DataStax recommends running the upgradesstables command on one node at a time or, when using racks, one rack at a time.

    You can run the upgradesstables command before all the nodes are upgraded as long as you run the command on only one node at a time, or, when using racks, one rack at a time. Running upgradesstables on too many nodes at once degrades performance.
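The datacenter name check in step 6 can be sketched as a comparison between the two command outputs. The helper names below are hypothetical. The sketch assumes `nodetool status` prints lines of the form `Datacenter: <name>`, that the replication map is a simple single-quoted literal as shown in step 6, and that names contain no escaped quotes.

```python
import ast
import re


def datacenters_from_status(status_output):
    """Collect datacenter names from 'Datacenter: <name>' lines."""
    return set(
        re.findall(r"^Datacenter:[ \t]*(\S+)", status_output, flags=re.MULTILINE)
    )


def replication_from_describe(describe_output):
    """Extract the replication map from a CREATE KEYSPACE statement."""
    match = re.search(
        r"replication\s*=\s*(\{.*?\})", describe_output,
        flags=re.IGNORECASE | re.DOTALL,
    )
    if match is None:
        raise ValueError("no replication map found")
    # The map uses single-quoted keys and values, which ast.literal_eval
    # can read as a Python dict.
    return ast.literal_eval(match.group(1))


def missing_datacenters(status_output, describe_output):
    """Return datacenter names the keyspace references but the cluster lacks."""
    replication = replication_from_describe(describe_output)
    keyspace_dcs = {
        key for key in replication if key not in ("class", "replication_factor")
    }
    return sorted(keyspace_dcs - datacenters_from_status(status_output))
```

An empty result means every datacenter named in the keyspace definition exists in the cluster. Any name in the result points to a mismatch to resolve before continuing.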

General post-upgrade steps

After all nodes are upgraded:

  1. If you use the OpsCenter Repair Service, turn it on.

  2. Starting with DSE 6.7, the DSE Metrics Collector is enabled by default. This is a diagnostics information aggregator used to help facilitate DSE problem resolution. For more information on the DSE Metrics Collector, or to disable metrics collection, see DataStax Enterprise Metrics Collector.

  3. Spark Jobserver uses DSE custom version 8.0.4.45. Ensure that applications use the compatible Spark Jobserver API from the DataStax repository.

Locking DSE package versions

If you have upgraded a DSE package installation, you can prevent future unintended upgrades.

RHEL yum installations

To hold a package at the current version:

  1. Install yum-versionlock (one-time operation):

    sudo yum install yum-versionlock
  2. Lock the current DSE version:

    sudo yum versionlock dse-*

To clear the version lock and enable upgrades:

sudo yum versionlock clear

For details, see versionlock command.

Debian apt-get installations

To hold a package at the current version:

sudo apt-mark hold dse-*

To remove the version hold:

sudo apt-mark unhold dse-*

For details, see the apt-mark command.

© 2024 DataStax | Privacy policy | Terms of use