Upgrading DataStax Enterprise 4.7 to 4.8

Upgrading SSTables is required for upgrades that contain major Apache Cassandra® releases:

  • DataStax Enterprise 6.7 is compatible with Cassandra 3.11.

  • DataStax Enterprise 6.0 is compatible with Cassandra 3.11.

  • DataStax Enterprise 5.1 uses Cassandra 3.11.

  • DataStax Enterprise 5.0 uses Cassandra 3.0.

  • DataStax Enterprise 4.7 to 4.8 use Cassandra 2.1.

  • DataStax Enterprise 4.0 to 4.6 use Cassandra 2.0.

Follow these instructions to upgrade from DataStax Enterprise 4.7 to 4.8.

DataStax is offering a complimentary half-day Upgrade Assessment. This assessment is a DataStax Services engagement designed to assess the upgrade compatibility of your existing DSE deployment to later DSE versions, including 5.1, 6.0, 6.7, and 6.8. Contact the DataStax Services team to schedule your assessment.

Always upgrade to the latest patch release of your current version before you upgrade to a higher version. Fixes included in the latest patch release might help or smooth the upgrade process.

The latest version of DSE 4.7 is 4.7.9.

Read and understand these instructions before upgrading. Carefully reviewing the planning and upgrade instructions can prevent errors and data loss.

TTL expiration timestamps are susceptible to the year 2038 problem. If the TTL value is long enough that the expiration date exceeds the maximum threshold of 2038-01-19T03:14:06+00:00, the data is immediately expired and purged on the next compaction. DataStax strongly recommends upgrading to DSE 4.8.16 and taking the required action to protect against silent data loss (DSP-15412).
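
The arithmetic behind the 2038 threshold can be sketched in shell. This is only an illustration, not a DSE tool: the function names ttl_is_safe and max_safe_ttl are hypothetical, and the constant is the threshold quoted above converted to seconds since the Unix epoch.

```shell
# Hypothetical helpers (not part of DSE): check a TTL against the 2038 threshold.

# 2038-01-19T03:14:06+00:00 expressed as seconds since the Unix epoch.
MAX_EXPIRATION_EPOCH=2147483646

# usage: ttl_is_safe <write_time_epoch_seconds> <ttl_seconds>
# Succeeds (status 0) when write time + TTL does not exceed the threshold.
ttl_is_safe() {
  local write_time="$1" ttl="$2"
  [ $(( write_time + ttl )) -le "$MAX_EXPIRATION_EPOCH" ]
}

# usage: max_safe_ttl <write_time_epoch_seconds>
# Prints the largest TTL, in seconds, that stays within the threshold.
max_safe_ttl() {
  local write_time="$1"
  echo $(( MAX_EXPIRATION_EPOCH - write_time ))
}
```

For example, ttl_is_safe "$(date +%s)" 630720000 tests a TTL of 630720000 seconds (20 years of 365 days) against the current time, and max_safe_ttl "$(date +%s)" prints the longest TTL that would still be safe for a write made now.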

General recommendations

DataStax recommends backing up your data prior to any version upgrade, including logs and custom configurations. A backup provides the ability to revert and restore all the data used in the previous version if necessary.
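
A minimal sketch of copying configuration files to a backup directory before the upgrade. The function name backup_configs and the backup location are hypothetical; the source paths you pass in depend on your installation type (see the configuration file locations later on this page).

```shell
# Hypothetical helper (not part of DSE): copy files or directories into a
# backup root, recreating each source's parent path under that root.

# usage: backup_configs <backup_dir> <path>...
backup_configs() {
  local backup_dir="$1" path parent
  shift
  mkdir -p "$backup_dir" || return 1
  for path in "$@"; do
    if [ -e "$path" ]; then
      parent=$(dirname "$path")
      # Keep the original directory layout so files are easy to compare later.
      mkdir -p "$backup_dir/$parent" || return 1
      cp -Rp "$path" "$backup_dir/$parent/" || return 1
    else
      echo "skipping missing path: $path" >&2
    fi
  done
}
```

On a package installation this might be called as backup_configs "$HOME/dse-4.7-config-backup" /etc/dse /etc/default/dse. The sketch only covers configuration files; it does not back up data or logs.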

OpsCenter provides a Backup Service that manages enterprise-wide backup and restore operations for DataStax Enterprise clusters. OpsCenter 6.5 and later is recommended.

Upgrade restrictions and limitations

Restrictions and limitations apply while a cluster is in a partially upgraded state.

With these exceptions, the cluster continues to work as though it were on the earlier version of DataStax Enterprise until all of the nodes in the cluster are upgraded.

General upgrade restrictions

  • Do not enable new features.

  • Do not run nodetool repair.

  • During the upgrade, do not bootstrap or decommission nodes.

  • Do not issue these or related types of CQL queries during a rolling restart: DDL or TRUNCATE.

  • During the upgrade, the nodes on different versions might show a schema disagreement.

  • Failure to upgrade SSTables when required results in a significant performance impact and increased disk usage. Upgrading is not complete until the SSTables are upgraded.

Restrictions for DSE Analytic (Spark) nodes

  • Do not run analytics jobs until all nodes are upgraded.

  • Kill all Spark worker processes before you stop the node and install the new version.

Restrictions for DSE Search (Solr) nodes

  • Do not update schemas.

  • Do not reindex DSE Search nodes during upgrade.

  • Do not issue these types of queries during a rolling restart: BATCH or TRUNCATE.

  • During the upgrade of a mixed-version cluster, where DataStax Enterprise 4.7 or 4.8 nodes support pagination and nodes on earlier versions do not, queries issued from the upgraded nodes return only FetchSize results.

Restrictions for nodes using any kind of security

  • Do not change security credentials or permissions until after the upgrade is complete.

DataStax Enterprise and Apache Cassandra configuration files

DataStax Enterprise (DSE) configuration files
Configuration file | Installer-Services and package installations | Installer-No Services and tarball installations
dse | /etc/default/dse (systemd) or /etc/init.d/ (SystemV) | N/A. Node type is set via command-line flags.
dse-env.sh | /etc/dse/dse-env.sh | <installation_location>/bin/dse-env.sh
byoh-env.sh | /etc/dse/byoh-env.sh | <installation_location>/bin/byoh-env.sh
dse.yaml | /etc/dse/dse.yaml | <installation_location>/resources/dse/conf/dse.yaml
logback.xml | /etc/dse/cassandra/logback.xml | <installation_location>/resources/logback.xml
spark-env.sh | /etc/dse/spark/spark-env.sh | <installation_location>/resources/spark/conf/spark-env.sh
spark-defaults.conf | /etc/dse/spark/spark-defaults.conf | <installation_location>/resources/spark/conf/spark-defaults.conf

Cassandra configuration files

Configuration file | Installer-Services and package installations | Installer-No Services and tarball installations
cassandra.yaml | /etc/dse/cassandra/cassandra.yaml | <installation_location>/conf/cassandra.yaml
cassandra.in.sh | /usr/share/cassandra/cassandra.in.sh | <installation_location>/bin/cassandra.in.sh
cassandra-env.sh | /etc/dse/cassandra/cassandra-env.sh | <installation_location>/conf/cassandra-env.sh
cassandra-rackdc.properties | /etc/dse/cassandra/cassandra-rackdc.properties | <installation_location>/conf/cassandra-rackdc.properties
cassandra-topology.properties | /etc/dse/cassandra/cassandra-topology.properties | <installation_location>/conf/cassandra-topology.properties
jmxremote.password | /etc/cassandra/jmxremote.password | <installation_location>/conf/jmxremote.password

Tomcat server configuration file
Configuration file | Installer-Services and package installations | Installer-No Services and tarball installations
server.xml | /etc/dse/resources/tomcat/conf/server.xml | <installation_location>/resources/tomcat/conf/server.xml

Preparing to upgrade from 4.7 to 4.8

Follow these steps to prepare to upgrade from DataStax Enterprise 4.7 to 4.8.

  1. Before upgrading, be sure that each node has adequate free disk space. The required space depends on the compaction strategy. See Disk space.

  2. Familiarize yourself with the changes and features in this release.

  3. Verify your current product version is the latest patch for DataStax Enterprise 4.7. The latest version is 4.7.9.

  4. Upgrade the SSTables on each node to ensure that all SSTables are on the current version. This is required for DataStax Enterprise upgrades that include a major Cassandra version change.

    Failure to upgrade SSTables when required results in a significant performance impact and increased disk usage.

    nodetool upgradesstables

    If the SSTables are already on the current version, the command returns immediately and no action is taken. See SSTable compatibility and upgrade version.

    Use the --jobs option to set the number of SSTables that upgrade simultaneously. The default setting is 2, which minimizes impact on the cluster. Set to 0 to use all available compaction threads.

    For information about nodetool upgradesstables, including how to speed it up, see the DataStax Support KB article Nodetool upgradesstables FAQ.

  5. Verify the Java runtime version and upgrade to the recommended version.

    java -version

    The latest version of Oracle Java SE Runtime Environment 7 or 8 or OpenJDK 7 is recommended. The JDK is recommended for development and production systems. The JDK provides useful troubleshooting tools that are not in the JRE, such as jstack, jmap, jps, and jstat.

    If using Oracle Java 7, you must use at least 1.7.0_25. If using Oracle Java 8, you must use at least 1.8.0_40.

  6. Run nodetool repair to ensure that data on each replica is consistent with data on other nodes.

  7. DSE Search nodes: All unique key elements must be indexed in the Solr schema.

    1. To verify unique key elements, review schema.xml and confirm that every unique key field has indexed=true. If required, make changes to schema.xml and reindex.

  8. Back up the configuration files that you use to a folder that is not in the directory where you normally run commands.

    1. The configuration files are overwritten with default values during installation of the new version.
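
The minimum Java versions given in step 5 can be checked with a short shell sketch. The function names are hypothetical. The check assumes the legacy 1.<major>.0_<update> version string that Java 7 and 8 report, and it applies the Oracle minimums to any vendor's build, which this page does not state.

```shell
# Hypothetical helpers (not part of DSE): compare a Java version string
# against the minimums from step 5 (1.7.0_25 for Java 7, 1.8.0_40 for Java 8).

# usage: java_version_ok <version_string>    e.g. java_version_ok 1.8.0_40
# Status 0: meets the minimum. 1: too old or unsupported major. 2: unrecognized format.
java_version_ok() {
  local version="$1" major update
  case "$version" in
    1.[0-9]*.[0-9]*_[0-9]*) ;;
    *) return 2 ;;
  esac
  major=${version#1.}          # "1.8.0_40" -> "8.0_40"
  major=${major%%.*}           # "8.0_40"   -> "8"
  update=${version##*_}        # "1.8.0_40" -> "40"
  update=${update%%[!0-9]*}    # drop any suffix after the update number
  case "$major" in
    7) [ "$update" -ge 25 ] ;;
    8) [ "$update" -ge 40 ] ;;
    *) return 1 ;;
  esac
}

# Prints the quoted version from `java -version`, which writes to stderr.
installed_java_version() {
  java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1
}
```

A typical call is java_version_ok "$(installed_java_version)" || echo "upgrade Java before continuing".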

Steps for upgrading from 4.7 to 4.8

The DataStax installer upgrades DataStax Enterprise and automatically performs many upgrade tasks.

The upgrade process for DataStax Enterprise (DSE) provides minimal downtime (ideally zero). During this process, upgrade and restart one node at a time while other nodes continue to operate online. With a few exceptions, the cluster continues to work as though it were on the earlier version of DSE until all of the nodes in the cluster are upgraded.

Upgrade and restart the nodes one at a time.

Upgrade nodes in this order:

  1. In multiple datacenter clusters, upgrade every node in one datacenter before upgrading another datacenter.

  2. Upgrade the seed nodes within a datacenter first.

  3. DSE Analytics datacenters

    1. For DSE Analytics nodes using DSE Hadoop, upgrade the Job Tracker node first. Then upgrade Hadoop nodes, followed by Spark nodes.

  4. Transactional/DSE Graph datacenters

  5. DSE Search nodes or datacenters

  1. For DSE Analytics nodes: Kill all Spark worker processes.

  2. For DSE Search nodes: Review these considerations and take appropriate actions:

    To maintain 4.6 query behavior:

    Disable driver pagination by editing the dse.yaml file and setting cql_solr_query_paging: off. DataStax Enterprise 4.7 and 4.8 integrate native driver paging with Solr cursor-based paging. You can turn on paging after you verify the upgrade.

  3. To flush the commit log of the old installation:

    nodetool -h hostname drain

    This step saves time when nodes start up after the upgrade, and prevents DSE Search nodes from having to reindex data.

    This step is mandatory when upgrading between major Cassandra versions that change SSTable formats, rendering commit logs from the previous version incompatible with the new version.

  4. Stop the node.

  5. Use the appropriate installation type to install the new product version on a supported platform:

Install the new product version using the same installation type that is on the system. The upgrade proceeds with installation regardless of the installation type. If you use a different installation type, the upgrade might result in issues.

  1. To configure the new product version:

    1. Compare your backup configuration files to the new configuration files:

      • Look for any deprecated, removed, or changed settings.

      • Be sure you are familiar with the Apache Cassandra and DataStax Enterprise changes and features in the new release.

      • Ensure that keyspace replication factors are correct for your environment:

    2. Merge the applicable modifications into the new version.

  2. Start the node.

  3. Verify that the upgraded datacenter names match the datacenter names in the keyspace schema definition:

    nodetool status

  4. Review the logs for warnings, errors, and exceptions.

    Warnings, errors, and exceptions are frequently found in the logs when starting an upgraded node. Some of these log entries are informational to help you execute specific upgrade-related steps.

If you find unexpected warnings, errors, or exceptions, contact DataStax Support.
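
One way to get a quick overview of a node's log after restart is to count entries by level. The sketch below is an illustration only: the function name is hypothetical, and it assumes each log entry starts with its level (for example WARN or ERROR), as in the default logback pattern. The log location, such as /var/log/cassandra/system.log on package installations, is also an assumption.

```shell
# Hypothetical helper (not part of DSE): summarize a Cassandra/DSE log file.

# usage: summarize_log_levels <log_file>
# Prints counts of WARN entries, ERROR entries, and lines mentioning an exception.
summarize_log_levels() {
  local log_file="$1"
  if [ ! -r "$log_file" ]; then
    echo "cannot read $log_file" >&2
    return 1
  fi
  awk '
    $1 == "WARN"  { warn++ }       # entries whose first field is the WARN level
    $1 == "ERROR" { error++ }      # entries whose first field is the ERROR level
    /Exception/   { exception++ }  # any line naming an exception, e.g. a stack trace header
    END { printf "WARN=%d ERROR=%d EXCEPTION=%d\n", warn, error, exception }
  ' "$log_file"
}
```

The counts only show where to look; read the matching entries (for example with grep -E '^(WARN|ERROR)' on the same file) to decide whether they are expected upgrade messages.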

  1. In DataStax Enterprise 4.8, audit log tables use DateTieredCompactionStrategy (DTCS). DataStax recommends changing tables that were created in earlier releases to use DTCS:

    ALTER TABLE dse_audit.audit_log WITH COMPACTION={'class':'DateTieredCompactionStrategy'};

  2. Repeat the upgrade on each node in the cluster following the recommended order.

  3. After the new version is installed, upgrade the SSTables on the upgraded nodes.

    This is recommended for optimal performance, but is not required.

    nodetool upgradesstables

    If the SSTables are already on the current version, the command returns immediately and no action is taken.

    Use the --jobs option to set the number of SSTables that upgrade simultaneously. The default setting is 2, which minimizes impact on the cluster. Set to 0 to use all available compaction threads.

For information about nodetool upgradesstables, including how to speed it up, see the DataStax Support KB article Nodetool upgradesstables FAQ.
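
To see which SSTable format versions are present on a node, a rough sketch is to count data files by the version component of their file names. This assumes the Cassandra 2.0 and 2.1 naming scheme keyspace-table-version-generation-Data.db (for example, ka for files written by Cassandra 2.1 and jb for Cassandra 2.0). That naming, the function name, and the data directory used in the usage note are assumptions that this page does not state.

```shell
# Hypothetical helper (not part of DSE): count SSTable data files per format version.

# usage: find <data_dir> -name '*-Data.db' | sstable_versions
# Reads file paths on stdin and prints "<version> <count>" lines, sorted by version.
sstable_versions() {
  awk -F/ '{ print $NF }' |            # keep only the file name
    awk -F- '
      $3 != "tmp" { count[$3]++ }      # third dash-separated field is the version; skip temp files
      END { for (v in count) print v, count[v] }
    ' | sort
}
```

For example, find /var/lib/cassandra/data -name '*-Data.db' | sstable_versions lists each version with its file count. More than one version in the output suggests that some SSTables have not been rewritten yet.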

