Upgrading to DataStax Enterprise 4.6

Instructions to upgrade to DataStax Enterprise 4.6 from DataStax Enterprise versions 3.2.5 to 4.5.

DataStax Enterprise and Apache Cassandra™ configuration files

Configuration file | Installer-Services and package installations | Installer-No Services and tarball installations

DataStax Enterprise configuration files
dse | /etc/default/dse (systemd) or /etc/init.d/ (SystemV) | N/A. Node type is set via command line flags.
byoh-env.sh | /etc/dse/byoh-env.sh | install_location/bin/byoh-env.sh
dse.yaml | /etc/dse/dse.yaml | install_location/resources/dse/conf/dse.yaml
logback.xml | /etc/dse/cassandra/logback.xml | install_location/resources/logback.xml
spark-env.sh | /etc/dse/spark/spark-env.sh | install_location/resources/spark/conf/spark-env.sh
spark-defaults.conf | /etc/dse/spark/spark-defaults.conf | install_location/resources/spark/conf/spark-defaults.conf

Cassandra configuration files
cassandra.yaml | /etc/dse/cassandra/cassandra.yaml | install_location/conf/cassandra.yaml
cassandra.in.sh | /usr/share/cassandra/cassandra.in.sh | install_location/bin/cassandra.in.sh
cassandra-env.sh | /etc/dse/cassandra/cassandra-env.sh | install_location/conf/cassandra-env.sh
cassandra-rackdc.properties | /etc/dse/cassandra/cassandra-rackdc.properties | install_location/conf/cassandra-rackdc.properties
cassandra-topology.properties | /etc/dse/cassandra/cassandra-topology.properties | install_location/conf/cassandra-topology.properties
jmxremote.password | /etc/cassandra/jmxremote.password | install_location/conf/jmxremote.password

Tomcat server configuration file
server.xml | /etc/dse/resources/tomcat/conf/server.xml | install_location/resources/tomcat/conf/server.xml

DataStax driver changes

DataStax drivers come in two types:

  • DataStax drivers for DataStax Enterprise — for use by DSE 4.8 and later
  • DataStax drivers for Apache Cassandra™ — for use by Apache Cassandra™ and DSE 4.7 and earlier
Note: While the DataStax drivers for Apache Cassandra can connect to DSE 5.0 and later clusters, DataStax strongly recommends upgrading to the DSE drivers. The DSE drivers provide functionality for all DataStax Enterprise features.

dse.yaml

The location of the dse.yaml file depends on the type of installation:

Package installations and Installer-Services installations (DSE 4.5 to 5.1):
/etc/dse/dse.yaml

Tarball installations and Installer-No Services installations (DSE 4.5 to 5.1):
install_location/resources/dse/conf/dse.yaml

Upgrading major Cassandra version

Upgrading SSTables is required for upgrades that contain major Apache Cassandra releases:
  • DataStax Enterprise 6.7 is compatible with Cassandra 3.11.
  • DataStax Enterprise 6.0 is compatible with Cassandra 3.11.
  • DataStax Enterprise 5.1 uses Cassandra 3.11.
  • DataStax Enterprise 5.0 uses Cassandra 3.0.
  • DataStax Enterprise 4.7 to 4.8 use Cassandra 2.1.
  • DataStax Enterprise 4.0 to 4.6 use Cassandra 2.0.
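
The list above can be turned into a quick check for whether an upgrade path crosses Cassandra release lines. The following is a minimal sketch, not a DataStax tool: the function names are hypothetical, and it assumes that any change of release line in the list counts as a major change that calls for nodetool upgradesstables.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of DSE): map a DSE version to the
# Cassandra release line given in the list above.
cassandra_line_for_dse() {
  case "$1" in
    4.[0-6]|4.[0-6].*)             echo "2.0" ;;
    4.[78]|4.[78].*)               echo "2.1" ;;
    5.0|5.0.*)                     echo "3.0" ;;
    5.1|5.1.*|6.0|6.0.*|6.7|6.7.*) echo "3.11" ;;
    *) echo "unknown DSE version: $1" >&2; return 1 ;;
  esac
}

# Exit status 0 when the two DSE versions sit on different Cassandra
# release lines, 1 when they share one, 2 when a version is not in the list.
# Assumption: any change of release line is treated as a major change.
sstable_upgrade_required() {
  local from_line to_line
  from_line=$(cassandra_line_for_dse "$1") || return 2
  to_line=$(cassandra_line_for_dse "$2") || return 2
  [ "$from_line" != "$to_line" ]
}
```

For example, `sstable_upgrade_required 4.6 4.8 && echo "plan for nodetool upgradesstables"` prints the reminder, because 4.6 maps to Cassandra 2.0 and 4.8 maps to Cassandra 2.1. Versions earlier than 4.0 are not in the list, so the sketch reports them as unknown rather than guessing.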

Upgrade order

Upgrade nodes in this order:
  • In multiple datacenter clusters, upgrade every node in one datacenter before upgrading another datacenter.
  • Upgrade the seed nodes within a datacenter first.
  • Upgrade datacenters by workload in this order:
    1. DSE Analytics datacenters
    2. Transactional/DSE Graph datacenters
    3. DSE Search datacenters
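
When planning a rolling upgrade across several datacenters, the workload order above can be applied to an inventory list. The sketch below assumes a hypothetical input format of one "datacenter workload" pair per line, with the workload labels analytics, transactional, graph, and search; neither the format nor the function name comes from DataStax tooling.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read "<datacenter> <workload>" lines on stdin and
# print the datacenter names in the upgrade order described above.
order_datacenters() {
  awk '{
    w = tolower($2)
    if (w == "analytics") rank = 1                              # DSE Analytics first
    else if (w == "transactional" || w == "graph") rank = 2     # then Transactional/DSE Graph
    else if (w == "search") rank = 3                            # then DSE Search
    else rank = 9                                               # unknown workloads go last
    print rank, $1
  }' | sort -s -k1,1n | awk '{ print $2 }'
}
```

The stable sort (-s) keeps datacenters with the same workload in their input order. Within each datacenter, the seed nodes still need to be upgraded first; this sketch orders datacenters only.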

cassandra.yaml

The location of the cassandra.yaml file depends on the type of installation:

Package installations and Installer-Services installations (DSE 4.5 to 5.1):
/etc/dse/cassandra/cassandra.yaml

Tarball installations and Installer-No Services installations (DSE 4.5 to 5.1):
install_location/conf/cassandra.yaml

Follow these instructions to upgrade to DataStax Enterprise 4.6 from DataStax Enterprise versions 3.2.5 through 4.5.

Tip: DataStax is offering a complimentary half-day Upgrade Assessment. This assessment is a DataStax Services engagement designed to assess the upgrade compatibility of your existing DSE deployment to later DSE versions, including 5.1, 6.0, and 6.7. Contact the DataStax Services team to schedule your assessment.
Attention: Read and understand these instructions before upgrading. Carefully reviewing the planning and upgrade instructions can prevent errors and data loss.

Cassandra version change

Upgrading from DataStax Enterprise 3.2.5 to 4.6 includes a major Cassandra version change. Be sure to follow the recommendations for upgrading the SSTables.

General recommendations

DataStax recommends backing up your data prior to any version upgrade, including logs and custom configurations. A backup provides the ability to revert and restore all the data used in the previous version if necessary.

OpsCenter provides a Backup Service that manages enterprise-wide backup and restore operations for DataStax Enterprise clusters. OpsCenter 6.5 and later is recommended.
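
For the custom configurations mentioned above, a timestamped copy of the configuration directory is often enough. This is a minimal sketch under stated assumptions: the function name and the destination layout are hypothetical, and it copies configuration files only, not data or logs.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy a configuration directory into a timestamped
# folder under a backup root and print the path of the copy.
backup_config_dir() {
  local src="$1" dest_root="$2" stamp target
  [ -d "$src" ] || { echo "no such directory: $src" >&2; return 1; }
  stamp=$(date +%Y%m%d-%H%M%S)
  target="$dest_root/$(basename "$src")-$stamp"
  # -R copies recursively, -p preserves modes and timestamps.
  mkdir -p "$dest_root" && cp -Rp "$src" "$target" && echo "$target"
}
```

On a package installation this might be called as `backup_config_dir /etc/dse /var/backups/dse-pre-upgrade`, where the destination path is only an example.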

Upgrade restrictions and limitations

Restrictions and limitations apply while a cluster is in a partially upgraded state.

With these exceptions, the cluster continues to work as though it were on the earlier version of DataStax Enterprise until all of the nodes in the cluster are upgraded.

General upgrade restrictions during an upgrade
  • Do not enable new features.
  • During the upgrade, do not bootstrap or decommission nodes.
  • Do not issue these types of CQL queries during a rolling restart: DDL and TRUNCATE.
  • Do not enable Change Data Capture (CDC) on a mixed-version cluster. Upgrade all nodes to DSE 5.1 or later before enabling CDC.
Note: Nodes on different versions might show a schema disagreement during an upgrade.
DSE Graph nodes restrictions
Graph nodes have the same restrictions as the workload they run on. Do not alter graph schema during upgrades. Workload-specific restrictions apply for analytics and search nodes, such as no OLAP queries during upgrades.
Restrictions for DSE Analytic (Hadoop and Spark) nodes
  • Do not run analytics jobs until all nodes are upgraded.
  • Kill all Spark worker processes before you stop the node and install the new version.
Restrictions for DSE Analytic (Spark) nodes
  • Do not run analytics jobs until all nodes are upgraded.
  • All nodes in the cluster must be upgraded to the new version before Spark Worker and Spark Master will start.
DSE Search upgrade restrictions and limitations
  • Do not update schemas.
  • Do not reindex DSE Search nodes during upgrade.
  • DSE 6.0 and later versions use a new Lucene codec. Segments written with this new codec cannot be read by earlier versions of DSE. To downgrade to earlier versions, the entire data directory for the search index in question must be cleared.
  • DSE Search in DataStax Enterprise 5.1 and later uses Apache Solr 6.0. This significant change requires advanced planning and specific actions before and after the upgrade.
Important: Before you upgrade DSE Search or SearchAnalytics workloads, you must follow the specific tasks in the section.
Restrictions for DSE Search (Solr) nodes
  • Do not update schemas.
  • Do not reindex DSE Search nodes during upgrade.
  • Do not issue these types of queries during a rolling restart: BATCH or TRUNCATE.
  • During the upgrade process on a cluster with mixed versions where DataStax Enterprise 4.7 or 4.8 supports pagination and earlier versions do not, issuing queries from the upgraded nodes will return only FetchSize results.
Restrictions for nodes using any kind of security
  • Do not change security credentials or permissions until the upgrade is complete on all nodes.
  • If you are not already using Kerberos, do not set up Kerberos authentication before upgrading. First upgrade the cluster, and then set up Kerberos.
Upgrading drivers and possible impact when driver versions are incompatible
Be sure to check driver compatibility. Depending on the driver version, you might need to recompile your client application code. See DataStax driver changes.
During upgrades, you might experience driver-specific impact when clusters have mixed versions of drivers. If your cluster has mixed versions, the protocol version is negotiated with the first host that the driver connects to. To avoid driver version incompatibility during upgrades, use one of these workarounds:
  • Protocol version: Because some drivers can use different protocol versions, force the protocol version at startup. For example, keep the Java driver at its current protocol version while the upgrade is in progress. Switch the Java driver to the new protocol version only after the upgrade is complete on all nodes in the cluster.
  • Initial contact points: Ensure that the list of initial contact points contains only hosts with the oldest driver version. For example, the initial contact points contain only Java driver v2.
For details on protocol version negotiation, see the topic on protocol versions with mixed clusters in the documentation for the driver version you're using, for example, the Java driver.

Preparing to upgrade from 3.2.5 and later to 4.6

Tip: The DataStax installer upgrades DataStax Enterprise and automatically performs many upgrade tasks.
If you do not use the DataStax installer, follow these steps on each node to prepare to upgrade from DataStax Enterprise 3.2.5 and later to DataStax Enterprise 4.6.
  1. Before upgrading, be sure that each node has adequate free disk space.

    The required space depends on the compaction strategy. See Disk space.

  2. Verify your current product version. If necessary, upgrade to one of these required interim versions before upgrading to 4.6:
    • DataStax Enterprise 3.2.5 and later
    • DataStax Community or open source Apache Cassandra™ 1.2.16
  3. Upgrade the SSTables on each node to ensure that all SSTables are on the current version.
    nodetool upgradesstables
    This step is required for DataStax Enterprise upgrades that include a major Cassandra version change.
    Warning: Failure to upgrade SSTables when required results in a significant performance impact and increased disk usage.

    If the SSTables are already on the current version, the command returns immediately and no action is taken.

  4. Only for upgrades from 3.2.x: Edit the cassandra.yaml file and remove or comment out the following options:
    # auth_replication_options:
    # replication_factor: 1
  5. Only for upgrades from 4.0.0 with Search nodes to 4.5: See Upgrading from DataStax Enterprise 4.0.0 with search nodes.
  6. Verify the Java runtime version and upgrade to the recommended version.
    java -version
    The latest version of Oracle Java SE Runtime Environment 7 or 8 or OpenJDK 7 is recommended. The JDK is recommended for development and production systems. The JDK provides useful troubleshooting tools that are not in the JRE, such as jstack, jmap, jps, and jstat.
    Note: If using Oracle Java 7, you must use at least 1.7.0_25. If using Oracle Java 8, you must use at least 1.8.0_40.
  7. Familiarize yourself with the changes and features in this release:
    • DataStax Enterprise 4.6 release notes.
      Endpoint snitch: Starting in DataStax Enterprise 4.6, the endpoint snitch is set in cassandra.yaml, not dse.yaml. The com.datastax.bdp.snitch.DseDelegateSnitch is replaced by com.datastax.bdp.snitch.DseSimpleSnitch in cassandra.yaml and the endpoint_snitch option has been removed from dse.yaml.
      Note: The DataStax Installer automatically sets the default endpoint_snitch to DseSimpleSnitch and removes the option from the dse.yaml file.
    • General upgrade advice and Apache Cassandra features in NEWS.txt. If you are upgrading from an earlier release, read NEWS.txt all the way back to your current version.
    • Apache Cassandra changes in CHANGES.txt.
  8. Back up the configuration files you use to a folder that is not in the directory where you normally run commands.

    The configuration files are overwritten with default values during installation of the new version.

  9. Run nodetool repair to ensure that data on each replica is consistent with data on other nodes.
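
Steps 4 and 6 above can be scripted. The following is a minimal sketch, not a DataStax utility: all function names are hypothetical, the version check covers only the Oracle Java 7 and 8 minimums stated in the note, and the editing helper assumes that auth_replication_options starts at column 0 with its settings indented beneath it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the version string reported by `java -version`,
# which writes to stderr (for example: java version "1.7.0_80").
current_java_version() {
  java -version 2>&1 | awk -F'"' '/version/ { print $2; exit }'
}

# Exit status 0 when a version string such as 1.8.0_45 meets the minimums
# from the note above (1.7.0_25 for Java 7, 1.8.0_40 for Java 8).
# Any other format is treated as not verified.
java_version_ok() {
  local version="$1" min update
  case "$version" in
    1.7.0_*) min=25 ;;
    1.8.0_*) min=40 ;;
    *) return 1 ;;
  esac
  update="${version#*_}"        # keep the text after the underscore
  update="${update%%[!0-9]*}"   # drop any suffix such as -b12
  [ -n "$update" ] && [ "$update" -ge "$min" ]
}

# Comment out auth_replication_options and its indented settings in the
# given cassandra.yaml (step 4, only for upgrades from 3.2.x).
comment_out_auth_replication() {
  local file="$1" tmp
  tmp=$(mktemp) || return 1
  awk '
    /^auth_replication_options:/ { in_block = 1; print "# " $0; next }
    in_block && /^[ \t]+[^ \t#]/ { print "# " $0; next }
    { in_block = 0; print }
  ' "$file" > "$tmp" && cat "$tmp" > "$file"   # cat keeps the file mode and owner
  rm -f "$tmp"
}
```

A typical call would be `java_version_ok "$(current_java_version)" || echo "check the Java runtime"`. Run the editing helper against a copy of cassandra.yaml first and compare the result before touching the live file.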

Upgrading from 3.2.5 and later to 4.6

The upgrade process for DataStax Enterprise provides minimal downtime (ideally zero). During this process, upgrade and restart one node at a time while other nodes continue to operate online. With a few exceptions, the cluster continues to work as though it were on the earlier version of DataStax Enterprise until all of the nodes in the cluster are upgraded.

Follow these steps on each node to upgrade from DataStax Enterprise 3.2.5 and later to DataStax Enterprise 4.6.

  1. Upgrade order matters. Upgrade nodes in this order:
    • In multiple datacenter clusters, upgrade every node in one datacenter before upgrading another datacenter.
    • Upgrade the seed nodes within a datacenter first.

      For DSE Analytics nodes using DSE Hadoop, upgrade the Job Tracker node first. Then upgrade Hadoop nodes, followed by Spark nodes.

    • Upgrade datacenters by workload in this order:
      1. DSE Analytics datacenters
      2. Transactional/DSE Graph datacenters
      3. DSE Search datacenters
    Upgrade and restart the nodes one at a time. With a few exceptions, the cluster continues to work as though it were on the earlier version of DataStax Enterprise until all of the nodes in the cluster are upgraded.
  2. To flush the commit log of the old installation:
    nodetool -h hostname drain
    This step saves time when nodes start up after the upgrade, and prevents DSE Search nodes from having to reindex data.
    Important: This step is mandatory when upgrading between major Cassandra versions that change SSTable formats, rendering commit logs from the previous version incompatible with the new version.
  3. DSE Analytics nodes: Kill all Spark worker processes.
  4. Stop the node.
  5. Use the appropriate installation type to install the new product version on a supported platform:
    Note: Install the new product version using the same installation type that is on the system. The upgrade proceeds with installation regardless of the installation type. If you use a different installation type, the upgrade might result in issues.
  6. Open cassandra.yaml to set the endpoint_snitch option to the same snitch that is set in delegated_snitch in dse.yaml:
    endpoint_snitch: com.datastax.bdp.snitch.DseSimpleSnitch
  7. Remove the delegated_snitch option from the old dse.yaml file.
  8. To configure the new version, use your backup configuration files to merge modifications into the configuration files for the new version.
  9. Start the node.
  10. Verify that the upgraded datacenter names match the datacenter names in the keyspace schema definition:
    nodetool status
  11. Review the logs for warnings, errors, and exceptions.

    Warnings, errors, and exceptions are frequently found in the logs when starting an upgraded node. Some of these log entries are informational to help you execute specific upgrade-related steps. If you find unexpected warnings, errors, or exceptions, contact DataStax Support.

  12. Repeat the upgrade on each node in the cluster following the recommended order.
  13. When the upgrade includes a major Cassandra version, you must upgrade the SSTables. DataStax recommends upgrading the SSTables on one node at a time or, when using racks, one rack at a time.
    Warning: Failure to upgrade SSTables when required results in a significant performance impact and increased disk usage. Upgrading is not complete until the SSTables are upgraded.
    nodetool upgradesstables

    If the SSTables are already on the current version, the command returns immediately and no action is taken. See SSTable compatibility and upgrade version.

    Use the --jobs option to set the number of SSTables that upgrade simultaneously. The default setting is 2, which minimizes impact on the cluster. Set to 0 to use all available compaction threads. DataStax recommends running the upgradesstables command on one node at a time or, when using racks, one rack at a time.

    Note: You can run the upgradesstables command before all the nodes are upgraded as long as you run this command on only one node at a time or when using racks, one rack at a time. Running upgradesstables on too many nodes will degrade performance.
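
Steps 6 and 7 above move the snitch setting from dse.yaml to cassandra.yaml. The following is a minimal sketch of that edit, assuming both options are single top-level lines of the form `name: value`; the function name is hypothetical and this is not how the DataStax installer performs the change.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy the delegated_snitch value from dse.yaml into
# endpoint_snitch in cassandra.yaml, then drop delegated_snitch from dse.yaml.
migrate_snitch() {
  local dse_yaml="$1" cassandra_yaml="$2" snitch tmp
  snitch=$(awk -F': *' '/^delegated_snitch:/ { print $2; exit }' "$dse_yaml")
  [ -n "$snitch" ] || { echo "delegated_snitch not found in $dse_yaml" >&2; return 1; }
  tmp=$(mktemp) || return 1

  # Step 6: replace an existing endpoint_snitch line, or append one.
  if grep -q '^endpoint_snitch:' "$cassandra_yaml"; then
    awk -v s="$snitch" '
      /^endpoint_snitch:/ { print "endpoint_snitch: " s; next }
      { print }
    ' "$cassandra_yaml" > "$tmp"
  else
    { cat "$cassandra_yaml"; echo "endpoint_snitch: $snitch"; } > "$tmp"
  fi
  cat "$tmp" > "$cassandra_yaml"

  # Step 7: remove the delegated_snitch option from dse.yaml.
  awk '!/^delegated_snitch:/' "$dse_yaml" > "$tmp"
  cat "$tmp" > "$dse_yaml"
  rm -f "$tmp"
}
```

On a package installation the call might look like `migrate_snitch /etc/dse/dse.yaml /etc/dse/cassandra/cassandra.yaml`. Because the configuration files are overwritten during installation, apply this to the new files using the values from your backup copies, and review the result by hand.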