Upgrading Apache Cassandra™ to DDAC
Upgrading Apache Cassandra™ to DataStax Distribution of Apache Cassandra™.
DataStax driver changes
DataStax drivers come in two types:
- DataStax drivers for DataStax Enterprise — for use by DSE 4.8 and later
- DataStax drivers for Apache Cassandra™ — for use by Apache Cassandra™ and DSE 4.7 and earlier
dse-env.sh
The default location of the dse-env.sh file depends on the type of installation:
Package installations | /etc/dse/dse-env.sh |
Tarball installations | installation_location/bin/dse-env.sh |
DataStax Enterprise and Apache Cassandra™ configuration files
Configuration file | Installer-Services and package installations | Installer-No Services and tarball installations |
---|---|---|
DataStax Enterprise configuration files | ||
dse | /etc/default/dse (systemd) or /etc/init.d/ (SystemV) | N/A. Node type is set via command line flags. |
byoh-env.sh | /etc/dse/byoh-env.sh | install_location/bin/byoh-env.sh |
dse.yaml | /etc/dse/dse.yaml | install_location/resources/dse/conf/dse.yaml |
logback.xml | /etc/dse/cassandra/logback.xml | install_location/resources/logback.xml |
spark-env.sh | /etc/dse/spark/spark-env.sh | install_location/resources/spark/conf/spark-env.sh |
spark-defaults.conf | /etc/dse/spark/spark-defaults.conf | install_location/resources/spark/conf/spark-defaults.conf |
Cassandra configuration files | ||
cassandra.yaml | /etc/dse/cassandra/cassandra.yaml | install_location/conf/cassandra.yaml |
cassandra.in.sh | /usr/share/cassandra/cassandra.in.sh | install_location/bin/cassandra.in.sh |
cassandra-env.sh | /etc/dse/cassandra/cassandra-env.sh | install_location/conf/cassandra-env.sh |
cassandra-rackdc.properties | /etc/dse/cassandra/cassandra-rackdc.properties | install_location/conf/cassandra-rackdc.properties |
cassandra-topology.properties | /etc/dse/cassandra/cassandra-topology.properties | install_location/conf/cassandra-topology.properties |
jmxremote.password | /etc/cassandra/jmxremote.password | install_location/conf/jmxremote.password |
Tomcat server configuration file | ||
server.xml | /etc/dse/resources/tomcat/conf/server.xml | install_location/resources/tomcat/conf/server.xml |
Upgrade order
Upgrade nodes in this order:
- In multiple datacenter clusters, upgrade every node in one datacenter before upgrading another datacenter.
- Upgrade the seed nodes within a datacenter first.
- Upgrade nodes in this order:
  - DSE Analytics datacenters
  - Transactional/DSE Graph datacenters
  - DSE Search datacenters
Back up your existing installation
A backup provides the ability to revert and restore all the data used in the previous version if necessary. For manual backup instructions, see Backing up and restoring DSE.
Upgrade SSTables
Upgrade restrictions and limitations
General restrictions
- Do not enable new features.
- Do not run nodetool repair.
- During the upgrade, do not bootstrap new nodes or decommission existing nodes.
- Complete the cluster-wide upgrade before the expiration of gc_grace_seconds (approximately 13 days) to ensure any repairs complete successfully.
- Do not issue TRUNCATE or DDL-related queries during the upgrade process.
Restrictions for nodes using security
- Do not change security credentials or permissions until the upgrade is complete on all nodes.
- If you are not already using Kerberos, do not set up Kerberos authentication before upgrading. First upgrade the cluster, and then set up Kerberos.
Driver version impacts
Be sure to check driver compatibility. Depending on the driver version, you might need to recompile your client application code. See DataStax driver changes.
- Protocol version: Set the protocol version explicitly in your application at startup. Switch the Java driver to the new protocol version only after the upgrade is complete on all nodes in the cluster.
- Initial contact points: Ensure that the list of initial contact points contains only hosts with the oldest DSE version or protocol version. For example, the initial contact points contain only hosts that use protocol version 2.
Upgrade paths
Upgrade from Apache Cassandra™ | Required Cassandra interim upgrade version |
---|---|
Cassandra 3.0 and 3.11 | None |
Cassandra 2.2 | Upgrade to the latest version of 2.2 before upgrading to DDAC |
Cassandra 2.1 and 2.0 | Upgrade to the latest version of 2.1 before upgrading to DDAC |
Cassandra 1.2 and earlier | Upgrade to the latest 1.2 version, then to the latest version of 2.0, and then to the latest version of 2.1 before upgrading to DDAC |
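The table above can be read as a simple lookup from the current Cassandra version to the interim versions required before moving to DDAC. The following is a minimal sketch of that lookup, not a DataStax tool: the function name interim_upgrade_path is hypothetical, and versions that the table does not list are reported as unknown rather than guessed.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Cassandra or DDAC): prints the interim
# upgrade steps from the table above for a given Cassandra version string.
interim_upgrade_path() {
  local version="$1"
  case "$version" in
    3.0|3.0.*|3.11|3.11.*) echo "none" ;;
    2.2|2.2.*)             echo "latest 2.2" ;;
    2.0|2.0.*|2.1|2.1.*)   echo "latest 2.1" ;;
    0.*|1.*)               echo "latest 1.2 -> latest 2.0 -> latest 2.1" ;;
    *)                     echo "unknown"; return 1 ;;  # not covered by the table
  esac
}

# Example wiring on a running node; `nodetool version` prints a line such as
# "ReleaseVersion: 2.1.20", so the second field is the version.
# current="$(nodetool version | awk '{print $2}')"
# interim_upgrade_path "$current"
```

A non-zero return status signals a version the table does not cover, so a wrapper script can stop instead of proceeding on an unverified path.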
Preparing to upgrade
Follow these steps to prepare each node for the upgrade:
- Familiarize yourself with the changes and features in the new release:
- DDAC release notes for 5.1 (equivalent to Cassandra 3.11).
- General upgrade advice and Cassandra features in NEWS.txt.
- Cassandra changes in CHANGES.txt.
- Before upgrading, be sure that each node has adequate free disk space. Determine current DSE data disk space usage:
sudo du -sh /var/lib/cassandra/data/
3.9G /var/lib/cassandra/data/
Determine available disk space:
sudo df -hT /
Filesystem Type Size Used Avail Use% Mounted on
/dev/sda1 ext4 59G 16G 41G 28% /
Important: The required space depends on the compaction strategy. See Disk space.
- Upgrade the SSTables on each node to ensure that all SSTables are on the current version:
nodetool upgradesstables
Warning: Failure to upgrade SSTables when required results in a significant performance impact and increased disk usage.
Tip: Use the --jobs option to set the number of SSTables that upgrade simultaneously. The default setting is 2, which minimizes impact on the cluster. Set to 0 to use all available compaction threads. DataStax recommends running the upgradesstables command on one node at a time or, when using racks, one rack at a time.
If the SSTables are already on the current version, the command returns immediately and no action is taken.
- Verify the Java runtime version and upgrade to the recommended version.
java -version
openjdk version "1.8.0_222"
OpenJDK Runtime Environment (build 1.8.0_222-8u222-b10-1ubuntu1~18.04.1-b10)
OpenJDK 64-Bit Server VM (build 25.222-b10, mixed mode)
- Recommended: OpenJDK 8 (1.8.0_151 minimum)
Note: Recommendation changed due to the end of public updates for Oracle JRE/JDK 8. See Oracle Java SE Support Roadmap.
- Supported: Oracle Java SE 8 (JRE or JDK) (1.8.0_151 minimum)
Important: Although Oracle JRE/JDK 8 is supported, DataStax does more extensive testing on OpenJDK 8.
- Run nodetool repair to ensure that data on each replica is consistent with data on other nodes:
nodetool repair -pr
- Install the libaio package for optimal performance.
RHEL platforms:
sudo yum install libaio
Debian:
sudo apt-get install libaio1
- Back up any customized configuration files since they may be overwritten with default values during installation of the new version.
Tip: If you backed up your installation using instructions in Backing up and restoring DSE, your original configuration files are included in the archive.
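Two of the preparation checks above, the Java version and the free disk space, lend themselves to a small script. The following is a minimal sketch under stated assumptions: the function names java8_update_ok and has_disk_headroom are hypothetical, and the rule that free space must be at least equal to the current data size is only an illustrative threshold, because the space actually required depends on the compaction strategy.

```shell
#!/usr/bin/env bash
# Hypothetical pre-upgrade helpers; the names are illustrative only.

# Succeeds if the text printed by `java -version` reports Java 8 at
# update 151 or later (the minimum stated in the steps above).
java8_update_ok() {
  local banner="$1" update
  # Pull the update number out of a string such as: version "1.8.0_222"
  update="$(printf '%s\n' "$banner" \
    | sed -n 's/.*version "1\.8\.0_\([0-9][0-9]*\).*/\1/p' \
    | head -n 1)"
  [ -n "$update" ] && [ "$update" -ge 151 ]
}

# Succeeds if free space (KB) is at least as large as the data size (KB).
# ASSUMPTION: a 1:1 headroom rule chosen for illustration; the space that is
# really needed depends on the compaction strategy.
has_disk_headroom() {
  local used_kb="$1" avail_kb="$2"
  [ "$avail_kb" -ge "$used_kb" ]
}

# Example wiring on a node, using the same paths as the steps above:
# java8_update_ok "$(java -version 2>&1)" || echo "Java 8u151 or later not detected"
# used="$(sudo du -sk /var/lib/cassandra/data/ | awk '{print $1}')"
# avail="$(df -Pk / | awk 'NR==2 {print $4}')"
# has_disk_headroom "$used" "$avail" || echo "Less free space than current data size"
```

Both helpers take plain text or numbers as arguments, so they can be exercised without a running node before being wired to the real commands.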
Upgrade steps
- Flush the commit log of the current installation:
nodetool drain
- Stop the node. (2.1, 2.2, 3.0)
- Uninstall Cassandra.
Note: If you installed Cassandra from packages in APT or RPM repositories, you must remove the packages before setting up and installing DDAC.
- For packages installed from APT repositories:
sudo apt-get autoremove "dsc*" "cassandra*" "apache-cassandra*"
This action shuts down Cassandra if it is still running.
- For packages installed from Yum repositories:
sudo yum remove "dsc*" "cassandra*" "apache-cassandra*"
The old Cassandra configuration file might be renamed to cassandra.yaml.rpmsave, for example:
warning: /etc/cassandra/default.conf/cassandra.yaml saved as /etc/cassandra/default.conf/cassandra.yaml.rpmsave
- If Cassandra was installed with a binary tarball:
ps auwx | grep cassandra
sudo kill cassandra_pid
And then remove the Cassandra installation directory.
- Install DDAC.
- To configure the new product version:
- Compare changes in the new configuration files with the backup configuration files after the upgrade but before restarting, remove deprecated settings, and update any new settings if required.
Warning: Do not simply replace new configuration files with old. Rather compare your old files to the new files and make any required changes.
Tip: Use the DSE yaml_diff tool to compare backup YAML files with the upgraded YAML files:
cd /usr/share/dse/tools/yamls
./yaml_diff path/to/yaml-file-old path/to/yaml-file-new
...
CHANGES
=========
authenticator:
- AllowAllAuthenticator
+ com.datastax.bdp.cassandra.auth.DseAuthenticator
authorizer:
- AllowAllAuthorizer
+ com.datastax.bdp.cassandra.auth.DseAuthorizer
roles_validity_in_ms:
- 2000
+ 120000
...
- If upgrading from Cassandra 3.11.2 or later, comment out the following parameters in cassandra.yaml if they exist:
- enable_materialized_views
- enable_sasi_indexes
Tip: See configuration for the location of Cassandra configuration files.
- Start the node:
sudo bin/cassandra
Tip: See DataStax Distribution of Apache Cassandra 3.11 start-up parameters for additional startup options.
- Verify that the upgraded datacenter names match the datacenter names in the keyspace schema definition:
- Get the node's datacenter name:
nodetool status | grep "Datacenter"
Datacenter: datacenter-name
- Verify that the node's datacenter name matches the datacenter name for a keyspace:
cqlsh --execute "DESCRIBE KEYSPACE keyspace-name;" | grep "replication"
CREATE KEYSPACE keyspace-name WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter-name': '3'};
- Review the logs for warnings, errors, and exceptions:
grep -w 'WARNING\|ERROR\|exception' /var/log/cassandra/*.log
Warnings, errors, and exceptions are frequently found in the logs when starting an upgraded node. Some of these log entries are informational to help you execute specific upgrade-related steps. If you find unexpected warnings, errors, or exceptions, contact DataStax Support.
Tip: Non-standard log locations are configured in dse-env.sh.
- Run nodetool repair:
bin/nodetool repair -pr
Important: Be sure to run nodetool repair on each node in the datacenter.
- Repeat the upgrade process on each node in the cluster following the recommended order.
- After the entire cluster upgrade is complete, upgrade the SSTables on one node at a time or, when using racks, one rack at a time.
Warning: Failure to upgrade SSTables when required results in a significant performance impact, increased disk usage, and possible data loss. Upgrading is not complete until the SSTables are upgraded.
nodetool upgradesstables
Tip: Use the --jobs option to set the number of SSTables that upgrade simultaneously. The default setting is 2, which minimizes impact on the cluster. Set to 0 to use all available compaction threads. DataStax recommends running the upgradesstables command on one node at a time or, when using racks, one rack at a time.
Important: You can run the upgradesstables command before all the nodes are upgraded as long as you run the command on only one node at a time or, when using racks, one rack at a time. Running upgradesstables on too many nodes at once will degrade performance.
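The datacenter name verification in the steps above compares two pieces of command output by eye. The following is a minimal sketch of how that comparison might be scripted. The function names datacenter_names and keyspace_uses_datacenter are hypothetical, and the matching assumes the output shapes shown in the steps: a "Datacenter: name" line from nodetool status and a quoted datacenter key in the replication settings.

```shell
#!/usr/bin/env bash
# Hypothetical post-start checks; the function names are illustrative only.

# Reads `nodetool status` output on stdin and prints each datacenter name,
# taken from lines of the form "Datacenter: name".
datacenter_names() {
  sed -n 's/^Datacenter:[[:space:]]*//p'
}

# Succeeds if the keyspace definition text ($2) contains datacenter $1 as a
# quoted replication key, for example 'DC1': '3'.
keyspace_uses_datacenter() {
  local dc="$1" ddl="$2"
  printf '%s\n' "$ddl" | grep -qF "'$dc':"
}

# Example wiring on an upgraded node (keyspace-name is a placeholder):
# ddl="$(cqlsh --execute "DESCRIBE KEYSPACE keyspace-name;" | grep "replication")"
# nodetool status | datacenter_names | while read -r dc; do
#   keyspace_uses_datacenter "$dc" "$ddl" \
#     || echo "Datacenter $dc is not named in the replication settings"
# done
```

A keyspace does not have to replicate to every datacenter, so a mismatch reported by this sketch is a prompt to review the keyspace definition rather than proof of a problem.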