Upgrading to DataStax Enterprise 3.2

Follow these instructions to upgrade to DataStax Enterprise 3.2.x.

DataStax Enterprise and Apache Cassandra™ configuration files

Configuration file             Installer-Services and package installations  Installer-No Services and tarball installations

DataStax Enterprise configuration files
byoh-env.sh                    /etc/dse/byoh-env.sh                          install_location/bin/byoh-env.sh
dse.yaml                       /etc/dse/dse.yaml                             install_location/resources/dse/conf/dse.yaml
logback.xml                    /etc/dse/cassandra/logback.xml                install_location/resources/logback.xml
spark-env.sh                   /etc/dse/spark/spark-env.sh                   install_location/resources/spark/conf/spark-env.sh
spark-defaults.conf            /etc/dse/spark/spark-defaults.conf            install_location/resources/spark/conf/spark-defaults.conf

Cassandra configuration files
cassandra.yaml                 /etc/cassandra/cassandra.yaml                 install_location/conf/cassandra.yaml
cassandra.in.sh                /usr/share/cassandra/cassandra.in.sh          install_location/bin/cassandra.in.sh
cassandra-env.sh               /etc/cassandra/cassandra-env.sh               install_location/conf/cassandra-env.sh
cassandra-rackdc.properties    /etc/cassandra/cassandra-rackdc.properties    install_location/conf/cassandra-rackdc.properties
cassandra-topology.properties  /etc/cassandra/cassandra-topology.properties  install_location/conf/cassandra-topology.properties
jmxremote.password             /etc/cassandra/jmxremote.password             install_location/conf/jmxremote.password

Tomcat server configuration file
server.xml                     /etc/dse/resources/tomcat/conf/server.xml     install_location/resources/tomcat/conf/server.xml
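
Because installing a new version overwrites these files with default values, it can help to copy them aside first. The following is a minimal sketch, not a DataStax tool: the function name backup_configs and the backup folder are hypothetical, and the function assumes absolute file paths such as the package-installation paths listed above.

```shell
#!/bin/sh
# Sketch: copy configuration files into a backup folder before upgrading.
# backup_configs is a hypothetical helper, not part of DataStax Enterprise.
# Usage: backup_configs DEST_DIR FILE...   (FILE must be an absolute path)
backup_configs() {
    dest="$1"; shift
    mkdir -p "$dest" || return 1
    for f in "$@"; do
        if [ -f "$f" ]; then
            # Recreate the original directory layout under $dest so that
            # files with the same name in different folders do not collide.
            mkdir -p "$dest$(dirname "$f")" && cp -p "$f" "$dest$f" || return 1
            echo "saved $f"
        else
            # Not every file exists on every node; report and continue.
            echo "skipped $f (not found)"
        fi
    done
}
```

For example, on a package installation you might run backup_configs "$HOME/dse-conf-backup" /etc/dse/dse.yaml /etc/cassandra/cassandra.yaml /etc/cassandra/cassandra-env.sh and then compare the saved copies against the new defaults after installation.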

DataStax driver changes

DataStax drivers come in two types:

  • DataStax drivers for DataStax Enterprise — for use by DSE 4.8 and later
  • DataStax drivers for Apache Cassandra™ — for use by Apache Cassandra™ and DSE 4.7 and earlier
Note: Although the DataStax drivers for Apache Cassandra can connect to DSE 5.0 and later clusters, DataStax strongly recommends upgrading to the DSE drivers. The DSE drivers provide functionality for all DataStax Enterprise features.

Upgrading major Cassandra version

Upgrading SSTables is required for upgrades that contain major Apache Cassandra releases:
  • DataStax Enterprise 6.7 is compatible with Cassandra 3.11.
  • DataStax Enterprise 6.0 is compatible with Cassandra 3.11.
  • DataStax Enterprise 5.1 uses Cassandra 3.11.
  • DataStax Enterprise 5.0 uses Cassandra 3.0.
  • DataStax Enterprise 4.7 to 4.8 use Cassandra 2.1.
  • DataStax Enterprise 4.0 to 4.6 use Cassandra 2.0.
Attention: Read and understand these instructions before upgrading. Carefully reviewing the planning and upgrading instructions can ensure a smooth upgrade and avoid pitfalls and frustrations.

Cassandra version change

Upgrading to DataStax Enterprise 3.2 includes a major Cassandra version change. Be sure to follow the recommendations for upgrading the SSTables.

General recommendations

DataStax recommends backing up your data, including logs and custom configurations, before any version upgrade. A backup lets you revert to the previous version and restore all of its data if necessary.

Upgrade limitations

Limitations apply while a cluster is in a partially upgraded state.

With these exceptions, the cluster continues to work as though it were on the earlier version of DataStax Enterprise until all of the nodes in the cluster are upgraded.

General restrictions during an upgrade
  • Do not enable new features.
  • Do not run nodetool repair. If you have the OpsCenter Repair Service configured, turn off the Repair Service.
  • Ensure OpsCenter compatibility. See DataStax OpsCenter compatibility with DataStax Enterprise.
  • During the upgrade, do not bootstrap or decommission nodes.
  • Do not issue these types of CQL queries during a rolling restart: DDL and TRUNCATE.
  • Do not enable Change Data Capture (CDC) on a mixed-version cluster. Upgrade all nodes to DSE 5.1 or later before enabling CDC.
  • Failure to upgrade SSTables when required results in a significant performance impact and increased disk usage. Upgrading is not complete until the SSTables are upgraded.
Note: Nodes on different versions might show a schema disagreement during an upgrade.
  • NodeSync waits to start until all nodes are upgraded.
  • Ensure OpsCenter compatibility. OpsCenter 6.7 is required for managing DSE 6.7 clusters. See DSE OpsCenter compatibility with DataStax Enterprise.
Security upgrade limitations
  • Do not change security credentials or permissions until after the upgrade is complete.
Upgrading drivers and possible impact when driver versions are incompatible
Be sure to check driver compatibility. Depending on the driver version, you might need to recompile your client application code. See DataStax driver changes.
During upgrades, you might experience driver-specific impact when clusters have mixed versions of drivers. If your cluster has mixed versions, the protocol version is negotiated with the first host that the driver connects to. To avoid driver version incompatibility during upgrades, use one of these workarounds:
  • Protocol version: Because some drivers can use different protocol versions, force the protocol version at startup. For example, keep the Java driver at its current protocol version while the upgrade is in progress. Switch the Java driver to the new protocol version only after the upgrade is complete on all nodes in the cluster.
  • Initial contact points: Ensure that the list of initial contact points contains only hosts with the oldest driver version. For example, the initial contact points contain only Java driver v2.
For details on protocol version negotiation, see the topic on protocol versions with mixed clusters in the documentation for the driver version you are using, for example, the Java driver.

Preparing to upgrade from DataStax Enterprise 2.2.2 and later to DataStax Enterprise 3.2

Tip: The DataStax installer upgrades DataStax Enterprise and automatically performs many upgrade tasks.
If you do not use the DataStax installer, follow these steps to prepare to upgrade from DataStax Enterprise 2.2.2 and later to DataStax Enterprise 3.2.
  1. Before upgrading, be sure that each node has ample free disk space.

    The required space depends on the compaction strategy. See Disk space.

  2. Verify your current product version. If necessary, upgrade to one of these required interim versions before upgrading to 3.2:
    • DataStax Enterprise 2.2.2 and later
    • DataStax Community or open source Apache Cassandra™ 1.1.9
    • DataStax Community or open source Apache Cassandra 1.2.9 to 1.2.15
  3. For upgrades from DataStax Enterprise 3.0.x and 2.2.x, review and observe the specific actions in:
  4. Verify the Java runtime version and upgrade to the recommended version.
    java -version
    The latest version of Oracle Java SE Runtime Environment 7 or 8 or OpenJDK 7 is recommended. The JDK is recommended for development and production systems. The JDK provides useful troubleshooting tools that are not in the JRE, such as jstack, jmap, jps, and jstat.
    Note: If using Oracle Java 7, you must use at least 1.7.0_25. If using Oracle Java 8, you must use at least 1.8.0_40.
  5. Familiarize yourself with the changes and features in this release:
    • DataStax Enterprise release notes for 3.2.
    • General upgrade advice and Apache Cassandra features in NEWS.txt. If you are upgrading from an earlier release, read NEWS.txt all the way back to your current version.
    • Apache Cassandra changes in CHANGES.txt.
  6. For upgrades from DataStax Enterprise 2.1.x with search nodes, see Solr restrictions.
  7. Upgrade the SSTables on each node to ensure that all SSTables are on the current version.
    nodetool upgradesstables
    This step is required for DataStax Enterprise upgrades that include a major Cassandra version change.
    Warning: Failure to upgrade SSTables when required results in a significant performance impact and increased disk usage.

    If the SSTables are already on the current version, the command returns immediately and no action is taken.

  8. Back up the configuration files you use to a folder that is not in the directory where you normally run commands.

    The configuration files are overwritten with default values during installation of the new version.

  9. Upgrade order matters. Using the following guidelines, upgrade nodes in the recommended order:
    • In multiple datacenter clusters, upgrade all the nodes within one datacenter before moving on to another datacenter.
    • Upgrade the seed nodes within a datacenter first.
    • Upgrade analytics nodes or datacenters first, then transactional nodes or datacenters, and finally search nodes or datacenters.
    • For analytics nodes, upgrade the Job Tracker node first. Then upgrade Hadoop nodes.
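
The ordering guidelines above can be sketched as a small sort. This is an illustration only: the function name order_nodes and the four-column input format (datacenter, workload, seed flag, host) are assumptions made for the example, each datacenter is assumed to hold a single workload type, and the Job Tracker rule is not modeled.

```shell
#!/bin/sh
# Sketch: print hosts in a suggested upgrade order.
# order_nodes is a hypothetical helper. It reads lines of the form
#   <datacenter> <analytics|transactional|search> <yes|no for seed> <host>
# on standard input and writes one host per line.
order_nodes() {
    awk '{
        # Workload rank: analytics first, then transactional, then search.
        rank = 9
        if ($2 == "analytics") rank = 1
        else if ($2 == "transactional") rank = 2
        else if ($2 == "search") rank = 3
        # Seed nodes sort ahead of non-seed nodes in the same datacenter.
        seed = 1
        if ($3 == "yes") seed = 0
        print rank, $1, seed, $4
    }' | LC_ALL=C sort -k1,1n -k2,2 -k3,3n -k4,4 | awk '{ print $4 }'
}
```

Sorting by workload rank, then datacenter, then seed flag keeps all the nodes of one datacenter together and places its seed nodes first, which matches the guidelines as long as the single-workload assumption holds.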

Upgrading to DataStax Enterprise 3.2

Follow these steps to upgrade to DataStax Enterprise 3.2.
  1. Run nodetool drain to flush the commit log of the old installation:
    nodetool -h hostname drain 
    This step saves time when nodes start up after the upgrade.
  2. Stop the node.
  3. Use the appropriate method to install the new product version on a supported platform:
    Note: Install the new product version using the same installation method that was used for the existing installation. The upgrade proceeds regardless of the installation method, and using a different method might result in issues.
  4. To configure the new version, use your backup configuration files to merge modifications into the configuration files for the new version.
  5. Only for upgrades from 2.2.x and 3.0.x to 3.2.x, edit the cassandra.yaml file to change the partitioner setting to match the previous partitioner. The RandomPartitioner (org.apache.cassandra.dht.RandomPartitioner) was the default partitioner in DataStax Enterprise 2.2.x and 3.0.x, which are based on Apache Cassandra releases earlier than 1.2.
  6. Only for upgrades from 3.1.x to 3.2.0, temporarily enable the old Gossip protocol in a cluster.
    After installing the new version, but before the first restart of each node, enable the old protocol so that each upgraded node can connect to the nodes awaiting the upgrade. Add the following line to /etc/cassandra/cassandra-env.sh for packaged installs or install_location/conf/cassandra-env.sh for tarball installs:
    JVM_OPTS="$JVM_OPTS -Denable-old-dse-state=true"
    After upgrading the entire cluster, remove this line from cassandra-env.sh on each node so it uses the new protocol, and then perform a second rolling restart.
  7. Start the node.
  8. Verify that the upgraded datacenter names match the datacenter names in the keyspace schema definition:
    nodetool status
  9. Review the logs for warnings, errors, and exceptions.

    Warnings, errors, and exceptions are frequently found in the logs when starting an upgraded node. Some of these log entries are informational to help you execute specific upgrade-related steps. If you find unexpected warnings, errors, or exceptions, contact DataStax Support.

    For upgrades from DataStax Enterprise 3.0.x, ignore these expected error messages:
    • An exception that looks something like this might appear in logs during a rolling upgrade.
      ERROR 15:36:54,908 Exception in thread Thread[GossipStage:1,5,main ]
       java.lang.NumberFormatException: For input string:  "127605887595351923798765477786913079296"
      . . .
    • When upgrading Cassandra 1.2 nodes, messages that are related to a node that is attempting to push mutations to the new system_auth keyspace:
      ERROR [WRITE-/192.168.123.11] 2013-06-22 14:13:42,336 OutboundTcpConnection.java (line 222)
       error writing to /192.168.123.11
      java.lang.RuntimeException: Can't serialize ColumnFamily ID 2d324e48-3275-3517-8dd5-9a2c5b0856c5
      to be used by version 5, because int <-> uuid mapping could not be established
      (CF was created in mixed version cluster).
      at org.apache.cassandra.db.ColumnFamilySerializer.cfIdSerializedSize(ColumnFamilySerializer.java:196)
    • For upgrades on Solr nodes:
      ERROR 00:57:17,785 Cannot activate core: ks.cf_10000_keys_50_cols
      ERROR 00:57:17,786 <indexDefaults> and <mainIndex> configuration sections are discontinued.
       Use <indexConfig> instead.
      ERROR 01:29:55,145 checksum mismatch in segments file (resource:
        ChecksumIndexInput (MMapIndexInput ( path = "/var/lib/cassandra/data/solr.data/ks.cf_10000_keys_50_cols/index/segments_6" )))
      ERROR 01:29:55,145 Solr index ks.cf_10000_keys_50_cols seems to be corrupted:
        please CREATE the core again with  recovery = true to start reindexing data.
      ERROR 01:29:55,145 Cannot activate core: ks.cf_10000_keys_50_cols
      ERROR 01:29:55,146 checksum mismatch in segments file  (resource: ChecksumIndexInput
         (MMapIndexInput ( path = "/var/lib/cassandra/data/solr.data/ks.cf_10000_keys_50_cols/index/segments_6" )))
      org.apache.lucene.index.CorruptIndexException: checksum mismatch in segments file
         (resource: ChecksumIndexInput (MMapIndexInput
         ( path = "/var/lib/cassandra/data/solr.data/ks.cf_10000_keys_50_cols/index/segments_6" )))
  10. Repeat the upgrade on each node in the cluster following the recommended order:
    • In multiple datacenter clusters, upgrade every node in one datacenter before moving on to another datacenter.
    • Upgrade the seed nodes within a datacenter first.
    • Upgrade DSE Analytics nodes or datacenters first, then Cassandra nodes or datacenters, and finally DSE Search nodes or datacenters.
    • For DSE Analytics nodes, upgrade the Job Tracker node first. Then upgrade Hadoop nodes, followed by Spark nodes.
  11. Only for upgrades from 3.1.x to 3.2.0, after every node in the cluster is upgraded, disable the old Gossip protocol so that each node uses the new protocol.
    1. Remove the following line from /etc/cassandra/cassandra-env.sh for packaged installs or install_location/conf/cassandra-env.sh for tarball installs:
      JVM_OPTS="$JVM_OPTS -Denable-old-dse-state=true"
    2. After removing the line from cassandra-env.sh, perform a second rolling restart.
  12. Only for upgrades from 3.0 and 3.1.x: the first upgraded node automatically alters the dse_system keyspace to use the EverywhereStrategy and attempts to run nodetool repair dse_system. This operation might fail if other nodes are down during the upgrade. Review /var/log/cassandra/system.log for errors or warnings. If the automatic switch fails, wait until all the nodes are up, and then manually update the dse_system keyspace to use the EverywhereStrategy. In cqlsh, enter:
    ALTER KEYSPACE dse_system WITH replication = {'class': 'EverywhereStrategy'};
    Then enter the following command on all nodes:
    nodetool repair dse_system
  13. When the upgrade includes a major Cassandra version change, you must upgrade the SSTables. DataStax recommends upgrading the SSTables on one node at a time or, when using racks, one rack at a time.
    Warning: Failure to upgrade SSTables when required results in a significant performance impact and increased disk usage. Upgrading is not complete until the SSTables are upgraded.
    nodetool upgradesstables

    If the SSTables are already on the current version, the command returns immediately and no action is taken. See SSTable compatibility and upgrade version.

    Use the --jobs option to set the number of SSTables that upgrade simultaneously. The default setting is 2, which minimizes impact on the cluster. Set to 0 to use all available compaction threads.

    Note: You can run the upgradesstables command before all the nodes are upgraded, as long as you run it on only one node at a time or, when using racks, one rack at a time. Running upgradesstables on too many nodes at once degrades performance.
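
The one-node-at-a-time recommendation can be sketched as a simple loop. This is an illustration under stated assumptions, not a DataStax script: the function name upgrade_sstables_serially and the NODETOOL override variable are hypothetical, the hosts are assumed to be reachable with nodetool -h, and the nodetool version on each node is assumed to support the --jobs option described above.

```shell
#!/bin/sh
# Sketch: upgrade SSTables on one node at a time.
# NODETOOL is a hypothetical override, for example NODETOOL=echo for a dry run.
NODETOOL="${NODETOOL:-nodetool}"

# Usage: upgrade_sstables_serially JOBS HOST...
upgrade_sstables_serially() {
    njobs="$1"; shift
    for host in "$@"; do
        echo "upgrading SSTables on $host"
        # The loop starts the next host only after this call returns,
        # so no two hosts are rewriting SSTables at the same time.
        "$NODETOOL" -h "$host" upgradesstables --jobs "$njobs" || {
            echo "upgradesstables failed on $host; stopping" >&2
            return 1
        }
    done
}
```

For example, upgrade_sstables_serially 2 node1 node2 node3 processes three hypothetical hosts in sequence with the default of two simultaneous SSTables, and stops at the first failure so that the problem node can be inspected before continuing.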