Upgrading from DataStax Enterprise 6.0 to 6.7

Instructions for upgrading from DSE 6.0 to 6.7.

dse.yaml

The location of the dse.yaml file depends on the type of installation:
  • Package installations: /etc/dse/dse.yaml
  • Tarball installations: installation_location/resources/dse/conf/dse.yaml

Upgrade order

Upgrade nodes in this order:
  • In multiple datacenter clusters, upgrade every node in one datacenter before upgrading another datacenter.
  • Upgrade the seed nodes within a datacenter first.
  • Upgrade datacenters by workload type in this order:
    1. DSE Analytics datacenters
    2. Transactional/DSE Graph datacenters
    3. DSE Search datacenters
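
The workload ordering above can be turned into a simple plan by sorting a list of datacenters. The following is a minimal sketch, not a DSE tool: the function name order_datacenters, the name:workload input format, and the workload labels are all assumptions made for illustration. It does not handle the seed-node rule, which still applies within each datacenter.

```shell
# Hypothetical helper: print datacenter names in the recommended upgrade order.
# Each argument is "name:workload"; the workload labels are assumed for this sketch.
order_datacenters() {
    i=0
    for entry in "$@"; do
        i=$((i + 1))
        case "${entry#*:}" in
            analytics)           rank=1 ;;
            transactional|graph) rank=2 ;;
            search)              rank=3 ;;
            *)                   rank=9 ;;   # unknown workloads go last
        esac
        # Emit rank, original position, and name; position keeps ties in input order.
        printf '%s %s %s\n' "$rank" "$i" "${entry%%:*}"
    done | sort -k1,1n -k2,2n | cut -d' ' -f3
}

# Example call (datacenter names are placeholders):
# order_datacenters dc_search:search dc_oltp:transactional dc_spark:analytics
```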

OpsCenter version    DSE version
6.7                  6.7, 6.0, 5.1
6.5                  6.0, 5.1, 5.0 (EOL)
6.1                  5.1, 5.0 (EOL), 4.8 (EOSL)
6.0                  5.0 (EOL), 4.8 (EOSL), 4.7 (EOSL)
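
The OpsCenter and DSE version pairs above can be encoded as a small lookup for use in a pre-upgrade check. This is a sketch under stated assumptions: the function name opscenter_supports_dse is hypothetical, and versions are passed as major.minor strings exactly as they appear in the table.

```shell
# Hypothetical helper: succeed (exit 0) when the OpsCenter release line is listed
# as compatible with the DSE release line in the table above.
# Returns 1 when the pair is not listed and 2 when the OpsCenter version is unknown.
opscenter_supports_dse() {
    opscenter=$1
    dse=$2
    case "$opscenter" in
        6.7) supported="6.7 6.0 5.1" ;;
        6.5) supported="6.0 5.1 5.0" ;;
        6.1) supported="5.1 5.0 4.8" ;;
        6.0) supported="5.0 4.8 4.7" ;;
        *)   return 2 ;;
    esac
    case " $supported " in
        *" $dse "*) return 0 ;;
        *)          return 1 ;;
    esac
}

# Example: OpsCenter 6.5 is not listed for DSE 6.7, so this prints a reminder.
opscenter_supports_dse 6.5 6.7 || echo "Upgrade OpsCenter before managing DSE 6.7"
```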

DataStax driver changes

DataStax drivers come in two types:

  • DataStax drivers for DataStax Enterprise — for use by DSE 4.8 and later
  • DataStax drivers for Apache Cassandra™ — for use by Apache Cassandra™ and DSE 4.7 and earlier
Note: While the DataStax drivers for Apache Cassandra can connect to DSE 5.0 and later clusters, DataStax strongly recommends upgrading to the DSE drivers. The DSE drivers provide functionality for all DataStax Enterprise features.

DataStax Enterprise and Apache Cassandra™ configuration files

Configuration file             Installer-Services and package installations  Installer-No Services and tarball installations
DataStax Enterprise configuration files
byoh-env.sh                    /etc/dse/byoh-env.sh                          install_location/bin/byoh-env.sh
dse.yaml                       /etc/dse/dse.yaml                             install_location/resources/dse/conf/dse.yaml
logback.xml                    /etc/dse/cassandra/logback.xml                install_location/resources/logback.xml
spark-env.sh                   /etc/dse/spark/spark-env.sh                   install_location/resources/spark/conf/spark-env.sh
spark-defaults.conf            /etc/dse/spark/spark-defaults.conf            install_location/resources/spark/conf/spark-defaults.conf
Cassandra configuration files
cassandra.yaml                 /etc/cassandra/cassandra.yaml                 install_location/conf/cassandra.yaml
cassandra.in.sh                /usr/share/cassandra/cassandra.in.sh          install_location/bin/cassandra.in.sh
cassandra-env.sh               /etc/cassandra/cassandra-env.sh               install_location/conf/cassandra-env.sh
cassandra-rackdc.properties    /etc/cassandra/cassandra-rackdc.properties    install_location/conf/cassandra-rackdc.properties
cassandra-topology.properties  /etc/cassandra/cassandra-topology.properties  install_location/conf/cassandra-topology.properties
jmxremote.password             /etc/cassandra/jmxremote.password             install_location/conf/jmxremote.password
Tomcat server configuration file
server.xml                     /etc/dse/resources/tomcat/conf/server.xml     install_location/resources/tomcat/conf/server.xml

cassandra.yaml

The location of the cassandra.yaml file depends on the type of installation:
  • Package installations: /etc/dse/cassandra/cassandra.yaml
  • Tarball installations: installation_location/resources/cassandra/conf/cassandra.yaml

The upgrade process for DataStax Enterprise provides minimal downtime (ideally zero). During this process, upgrade and restart one node at a time while other nodes continue to operate online. With a few exceptions, the cluster continues to work as though it were on the earlier version of DataStax Enterprise until all of the nodes in the cluster are upgraded.

Follow these instructions to upgrade from DataStax Enterprise (DSE) 6.0 to DSE 6.7.

Always upgrade to the latest patch release on your current version before you upgrade to a higher version. Fixes included in the latest patch release might help or smooth the upgrade process.

The latest 6.0.x version of DSE is 6.0.10.

Attention: Read and understand these instructions before upgrading. Carefully reviewing the planning and upgrade instructions can prevent errors and data loss.

Apache Cassandra™ version change

SSTables must be upgraded for DataStax Enterprise upgrades between versions that include major Cassandra version changes.
  • DataStax Enterprise 6.7 is compatible with Cassandra 3.11.
  • DataStax Enterprise 6.0 is compatible with Cassandra 3.11.
  • DataStax Enterprise 5.1 uses Cassandra 3.11.
  • DataStax Enterprise 5.0 uses Cassandra 3.0.
  • DataStax Enterprise 4.7 to 4.8 use Cassandra 2.1.
  • DataStax Enterprise 4.0 to 4.6 use Cassandra 2.0.
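
The version list above can be expressed as a lookup that reports whether an upgrade path crosses Cassandra release lines. This is an illustrative sketch only: both function names are hypothetical, and the mapping is copied from the list rather than queried from an installation.

```shell
# Hypothetical helper: print the Cassandra release line for a DSE major.minor version,
# using the mapping in the list above. Fails for versions that are not listed.
cassandra_line_for_dse() {
    case "$1" in
        6.7|6.0|5.1) echo 3.11 ;;
        5.0)         echo 3.0 ;;
        4.7|4.8)     echo 2.1 ;;
        4.[0-6])     echo 2.0 ;;
        *)           return 1 ;;
    esac
}

# Succeeds when the source and target DSE versions map to different Cassandra lines.
# Returns 2 when either version is not in the list.
cassandra_line_changes() {
    from=$(cassandra_line_for_dse "$1") || return 2
    to=$(cassandra_line_for_dse "$2") || return 2
    [ "$from" != "$to" ]
}

# Example: DSE 6.0 and 6.7 both map to the 3.11 line.
if cassandra_line_changes 6.0 6.7; then
    echo "Cassandra release line changes"
else
    echo "Cassandra release line is unchanged"
fi
```

Even when the release line is unchanged, the preparation steps in this document still call for running nodetool upgradesstables so that all SSTables are on the current version.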

General recommendations

DataStax recommends backing up your data prior to any version upgrade, including logs and custom configurations. A backup provides the ability to revert and restore all the data used in the previous version if necessary.

Tip: OpsCenter provides a Backup Service that manages enterprise-wide backup and restore operations for DataStax Enterprise clusters. OpsCenter 6.5 and later is recommended.

Upgrade restrictions and limitations

Restrictions and limitations apply while a cluster is in a partially upgraded state.

With these exceptions, the cluster continues to work as though it were on the earlier version of DataStax Enterprise until all of the nodes in the cluster are upgraded.

General restrictions and limitations during the upgrade process
  • Do not enable new features.
  • Do not run nodetool repair. If you have the OpsCenter Repair Service configured, turn off the Repair Service.
  • Ensure OpsCenter compatibility. OpsCenter 6.7 is required for managing DSE 6.7 clusters. See the compatibility table.
  • During the upgrade, do not bootstrap or decommission nodes.
  • Do not issue these types of CQL queries during a rolling restart: DDL and TRUNCATE.
  • Failure to upgrade SSTables results in a significant performance impact and increased disk usage. Upgrading is not complete until the SSTables are upgraded.
  • NodeSync waits to start until all nodes are upgraded.
Note: Nodes on different versions might show a schema disagreement during an upgrade.
Restrictions for DSE Analytics (Spark) nodes
  • Do not run analytics jobs until all nodes are upgraded.
  • All nodes in the cluster must be upgraded to the new version before Spark Worker and Spark Master will start.
DSE Graph nodes restrictions
Graph nodes have the same restrictions as the workload they run on. Do not alter graph schema during upgrades. Workload-specific restrictions apply for analytics and search nodes, such as no OLAP queries during upgrades.
DSE Search upgrade restrictions and limitations
  • Do not update schemas.
  • Do not reindex DSE Search nodes during upgrade.
Restrictions for nodes using any kind of security
  • Do not change security credentials or permissions until the upgrade is complete on all nodes.
  • If you are not already using Kerberos, do not set up Kerberos authentication before upgrading. First upgrade the cluster, and then set up Kerberos.
Upgrading drivers and possible impact when driver versions are incompatible
Be sure to check driver compatibility. Depending on the driver version, you might need to recompile your client application code. See DataStax driver changes.
During upgrades, you might experience driver-specific impact when clusters have mixed versions of drivers. If your cluster has mixed versions, the protocol version is negotiated with the first host that the driver connects to. To avoid driver version incompatibility during upgrades, use one of these workarounds:
  • Protocol version: Because some drivers can use different protocol versions, force the protocol version at startup. For example, keep the Java driver at its current protocol version while the driver upgrade is happening. Switch the Java driver to the new protocol version only after the upgrade is complete on all nodes in the cluster.
  • Initial contact points: Ensure that the list of initial contact points contains only hosts with the oldest driver version. For example, the initial contact points contain only Java driver v2.
For details on protocol version negotiation, see the topic on protocol versions with mixed clusters in the documentation for the Java driver version you are using.

Preparing to upgrade

Follow these steps to prepare each node for upgrading from DSE 6.0 to DSE 6.7.
Note: These steps are performed in your current version and use DSE 6.0 documentation.
  1. Carefully review Planning your DataStax Enterprise upgrade.
  2. Before upgrading, be sure that each node has ample free disk space.

    The required space depends on the compaction strategy. See Disk space.

  3. Familiarize yourself with the changes and features in this release:
  4. Verify that your current product version is DSE 6.0.0 or later.
    dse -v
    These instructions are valid only for upgrades from DSE 6.0 to DSE 6.7.
  5. Upgrade to the latest patch release on your current version. The latest 6.0.x version of DSE is 6.0.10.

    Fixes included in the latest patch release might help or smooth the upgrade process.

  6. To prevent potential problems, upgrade the SSTables on each node to ensure that all SSTables are on the current version.
    nodetool upgradesstables

    If the SSTables are already on the current version, the command returns immediately and no action is taken.

  7. Verify the Java runtime version and upgrade to the recommended version.
    java -version
    Important: Although Oracle JRE/JDK 8 is supported, DataStax does more extensive testing on OpenJDK 8.
  8. Run nodetool repair to ensure that data on each replica is consistent with data on other nodes.
  9. Install the libaio package for optimal performance.
    RHEL platforms:
    sudo yum install libaio
    Debian:
    sudo apt-get install libaio1
  10. Back up the configuration files you use to a folder that is not in the directory where you normally run commands.

    The configuration files are overwritten with default values during installation of the new version.
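
Two of the preparation steps above, the version check and the configuration backup, lend themselves to a small script. The following is a minimal sketch under stated assumptions: the function names is_dse_60 and backup_configs are hypothetical, the backup directory name is arbitrary, and the version check assumes that dse -v prints a bare version string such as 6.0.10.

```shell
# Hypothetical helper: succeed only when the given version string is in the 6.0 line.
is_dse_60() {
    case "$1" in
        6.0.*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Hypothetical helper: copy configuration files into a timestamped directory
# under $1, preserving each file's path so same-named files do not collide.
# Prints the backup directory on completion; missing files are skipped.
backup_configs() {
    dest="$1/dse-config-backup-$(date +%Y%m%d%H%M%S)"
    shift
    mkdir -p "$dest" || return 1
    for f in "$@"; do
        if [ -f "$f" ]; then
            mkdir -p "$dest/$(dirname "$f")" && cp -p "$f" "$dest/$f"
        fi
    done
    echo "$dest"
}

# Example usage on a package installation (paths as listed in this document):
# is_dse_60 "$(dse -v)" || echo "These instructions apply only to DSE 6.0.x"
# backup_configs "$HOME" /etc/dse/dse.yaml /etc/dse/cassandra/cassandra.yaml
```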

Upgrade steps

To upgrade from DSE 6.0 to DSE 6.7, follow these steps on each node in the recommended order. The upgrade process requires upgrading and restarting one node at a time.
Note: These steps are performed in your upgraded version and use DSE 6.7 documentation.
  1. To flush the commit log of the old installation:
    nodetool -h hostname drain
    This step saves time when nodes start up after the upgrade and prevents DSE Search nodes from having to reindex data.
    Important: This step is mandatory when upgrading between major Cassandra versions that change SSTable formats, rendering commit logs from the previous version incompatible with the new version.
  2. Stop the node. See Stopping a DataStax Enterprise node.
    • To stop DataStax Enterprise running as a service:
      sudo service dse stop
    • To stop DataStax Enterprise running as a stand-alone process:
      bin/dse cassandra-stop
  3. Use the appropriate method to install the new product version on a supported platform:
    Note: Install the new product version using the same installation type that is on the system, otherwise problems might result.
  4. To configure the new version:
    1. Compare your backup configuration files to the new configuration files. Look for any deprecated, removed, or changed settings.
      • Review changes in cassandra.yaml and dse.yaml.

        After the upgrade and before restarting with 6.7.0, remove deprecated settings and use new settings.

        cassandra.yaml changes

        Memtable settings
        Deprecated cassandra.yaml settings
        memtable_heap_space_in_mb
        memtable_offheap_space_in_mb
        Replace with this setting
        memtable_space_in_mb

        Governs heap and offheap space allocation to set a threshold for automatic memtable flush. The calculated default is 1/4 of the heap size.

        Changed setting
        memtable_allocation_type: offheap_objects

        The default method the database uses to allocate and manage memtable memory is offheap_objects.

        User-defined functions (UDF) settings
        Deprecated cassandra.yaml settings
        user_defined_function_warn_timeout
        user_defined_function_fail_timeout
        Replace with these settings
        user_defined_function_warn_micros: 500
        user_defined_function_fail_micros: 10000
        user_defined_function_warn_heap_mb: 200
        user_defined_function_fail_heap_mb: 500
        user_function_timeout_policy: die

        The new timeout settings are in microseconds because Java UDFs run faster. The new timeouts are not equivalent to the deprecated settings.

        Internode encryption settings
        Deprecated cassandra.yaml setting
        server_encryption_options:
            store_type: JKS
        Replace with these settings
        server_encryption_options:
            keystore_type: JKS
            truststore_type: JKS

        Valid type options are JKS, JCEKS, PKCS12, or PKCS11.

        Client-to-node encryption settings
        Deprecated cassandra.yaml setting
        client_encryption_options:
            store_type: JKS
        Replace with these settings
        client_encryption_options:
            keystore_type: JKS
            truststore_type: JKS

        Valid type options are JKS, JCEKS, PKCS12, or PKCS11.

        dse.yaml changes

        Spark resource and encryption options
        Deprecated dse.yaml setting
        spark_ui_options:
            server_encryption_options:
                store_type: JKS
        Replace with these settings
        spark_ui_options:
            server_encryption_options:
                keystore_type: JKS
                truststore_type: JKS

        Valid options are JKS, JCEKS, PKCS12, or PKCS11.

    2. Merge the applicable configuration file modifications into the new version.
  5. Ensure that keyspace replication factors are correct for your environment:
  6. When upgrading DSE to versions earlier than 5.1.16, 6.0.8, or 6.7.4 inclusive, if any tables are using DSE Tiered Storage, remove all txn_compaction log files from second-level tiers and lower. For example, given the following dse.yaml configuration, remove txn_compaction log files from /mnt2 and /mnt3 directories:
    tiered_storage_options:
        strategy1:
            tiers:
                - paths:
                    - /mnt1
                - paths:
                    - /mnt2
                - paths:
                    - /mnt3

    The following example removes the files using the find command:

    find /mnt2 -name "*_txn_compaction_*.log" -type f -delete &&
    find /mnt3 -name "*_txn_compaction_*.log" -type f -delete
    Warning: Failure to complete this step may result in data loss.
  7. Start the node.
  8. Verify that the upgraded datacenter names match the datacenter names in the keyspace schema definition:
    nodetool status
  9. Review the logs for warnings, errors, and exceptions.

    Warnings, errors, and exceptions are frequently found in the logs when starting an upgraded node. Some of these log entries are informational to help you execute specific upgrade-related steps. If you find unexpected warnings, errors, or exceptions, contact DataStax Support.

  10. Repeat the upgrade on each node in the cluster following the recommended upgrade order.
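
When comparing backed-up configuration files in step 4, a quick scan for the deprecated setting names listed there can show which files need attention. This is a rough sketch, not a YAML parser: the function name find_deprecated_settings is hypothetical, it matches setting names at the start of uncommented lines only, and it does not tell you which section a nested store_type belongs to.

```shell
# Hypothetical helper: print each deprecated setting name (from the lists in step 4)
# that appears on an uncommented line of the given cassandra.yaml or dse.yaml file.
find_deprecated_settings() {
    file=$1
    for key in memtable_heap_space_in_mb memtable_offheap_space_in_mb \
               user_defined_function_warn_timeout user_defined_function_fail_timeout \
               store_type; do
        # Anchoring at line start (after optional indentation) skips "# key:" comments
        # and avoids matching keystore_type or truststore_type.
        if grep -Eq "^[[:space:]]*${key}:" "$file"; then
            echo "$key"
        fi
    done
}

# Example usage against a backed-up file (the path is a placeholder):
# find_deprecated_settings /path/to/backup/cassandra.yaml
```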

After the upgrade

After all nodes are upgraded and running on DSE 6.7:

  1. If you use the OpsCenter Repair Service, turn on the Repair Service.
  2. Remove any previously installed JTS JAR files from the classpaths in your DSE installation. JTS (Java Topology Suite) is distributed with DSE 6.7.
  3. Spark Jobserver uses DSE custom version 8.0.4.45. Ensure that applications use the compatible Spark Jobserver API from the DataStax repository.
  4. DSE 6.7 introduces, and enables by default, the DSE Metrics Collector, a diagnostics information aggregator used to help facilitate DSE problem resolution. For more information on the DSE Metrics Collector, see DataStax Enterprise Metrics Collector.
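
For step 2 above, a file search can help locate previously installed JTS JAR files before you remove them. This sketch only lists candidates and deletes nothing. The function name list_jts_jars is hypothetical, and the jts*.jar file-name pattern is an assumption about how the JAR was named when it was installed, so review the output before removing anything.

```shell
# Hypothetical helper: list JAR files whose names start with "jts" under the
# given directories. The name pattern is an assumption; adjust it as needed.
list_jts_jars() {
    find "$@" -type f -name 'jts*.jar' 2>/dev/null
}

# Example usage (the directory is a placeholder for your DSE installation path):
# list_jts_jars /path/to/dse/installation
```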