Upgrading to DataStax Enterprise 4.7 or 4.8
Instructions to upgrade to DataStax Enterprise 4.7 or 4.8 from DataStax Enterprise 4.0, 4.5, or 4.6.
DataStax Enterprise and Apache Cassandra™ configuration files
Configuration file | Installer-Services and package installations | Installer-No Services and tarball installations |
---|---|---|
DataStax Enterprise configuration files | ||
dse | /etc/default/dse (systemd) or /etc/init.d/ (SystemV) | N/A. Node type is set via command line flags. |
byoh-env.sh | /etc/dse/byoh-env.sh | install_location/bin/byoh-env.sh |
dse.yaml | /etc/dse/dse.yaml | install_location/resources/dse/conf/dse.yaml |
logback.xml | /etc/dse/cassandra/logback.xml | install_location/resources/logback.xml |
spark-env.sh | /etc/dse/spark/spark-env.sh | install_location/resources/spark/conf/spark-env.sh |
spark-defaults.conf | /etc/dse/spark/spark-defaults.conf | install_location/resources/spark/conf/spark-defaults.conf |
Cassandra configuration files | ||
cassandra.yaml | /etc/dse/cassandra/cassandra.yaml | install_location/conf/cassandra.yaml |
cassandra.in.sh | /usr/share/cassandra/cassandra.in.sh | install_location/bin/cassandra.in.sh |
cassandra-env.sh | /etc/dse/cassandra/cassandra-env.sh | install_location/conf/cassandra-env.sh |
cassandra-rackdc.properties | /etc/dse/cassandra/cassandra-rackdc.properties | install_location/conf/cassandra-rackdc.properties |
cassandra-topology.properties | /etc/dse/cassandra/cassandra-topology.properties | install_location/conf/cassandra-topology.properties |
jmxremote.password | /etc/cassandra/jmxremote.password | install_location/conf/jmxremote.password |
Tomcat server configuration file | ||
server.xml | /etc/dse/resources/tomcat/conf/server.xml | install_location/resources/tomcat/conf/server.xml |
cassandra.yaml
The location of the cassandra.yaml file depends on the type of installation:
Installation type | Location |
---|---|
Package installations | /etc/dse/cassandra/cassandra.yaml |
Tarball installations | install_location/conf/cassandra.yaml |
Upgrading major Cassandra version
Upgrading SSTables is required for upgrades that contain major Apache Cassandra releases:
- DataStax Enterprise 6.7 is compatible with Cassandra 3.11.
- DataStax Enterprise 6.0 is compatible with Cassandra 3.11.
- DataStax Enterprise 5.1 uses Cassandra 3.11.
- DataStax Enterprise 5.0 uses Cassandra 3.0.
- DataStax Enterprise 4.7 to 4.8 use Cassandra 2.1.
- DataStax Enterprise 4.0 to 4.6 use Cassandra 2.0.
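The version list above can be turned into a small lookup to decide whether a planned upgrade crosses a Cassandra release line. This is only a sketch: the dictionary restates the list for the DSE versions named on this page, the function name is hypothetical, and treating any change of Cassandra release line as requiring an SSTable upgrade is an assumption.

```python
# DSE release line -> Cassandra release line, restated from the list above.
DSE_TO_CASSANDRA = {
    "6.7": "3.11",
    "6.0": "3.11",
    "5.1": "3.11",
    "5.0": "3.0",
    "4.8": "2.1",
    "4.7": "2.1",
    "4.6": "2.0",
    "4.5": "2.0",
    "4.0": "2.0",
}


def cassandra_version_changes(from_dse: str, to_dse: str) -> bool:
    """Return True when the two DSE release lines map to different Cassandra lines.

    Hypothetical helper: raises KeyError for a DSE version not in the table.
    """
    return DSE_TO_CASSANDRA[from_dse] != DSE_TO_CASSANDRA[to_dse]
```

For example, `cassandra_version_changes("4.6", "4.8")` compares Cassandra 2.0 with 2.1, so the SSTable upgrade steps on this page apply.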
Upgrade order
Upgrade nodes in this order:
- In multiple datacenter clusters, upgrade every node in one datacenter before upgrading another datacenter.
- Upgrade the seed nodes within a datacenter first.
- Upgrade nodes in this order:
  - DSE Analytics datacenters
  - Transactional/DSE Graph datacenters
  - DSE Search datacenters
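The ordering rules above can be sketched as a sort key. The `Node` type, its fields, and the workload labels are hypothetical names chosen for illustration, and the sketch assumes every datacenter hosts a single workload type.

```python
from typing import NamedTuple

# Lower rank upgrades first; the ranking restates the list above.
WORKLOAD_RANK = {"analytics": 0, "transactional": 1, "graph": 1, "search": 2}


class Node(NamedTuple):
    name: str        # hypothetical node label
    datacenter: str  # datacenter the node belongs to
    workload: str    # one of the WORKLOAD_RANK keys
    is_seed: bool


def upgrade_order(nodes):
    """Sort nodes so each datacenter is finished before the next one starts,
    datacenters follow the workload ranking, and seeds come first within each."""
    return sorted(
        nodes,
        key=lambda n: (WORKLOAD_RANK[n.workload], n.datacenter, not n.is_seed, n.name),
    )
```

Sorting on the datacenter name after the workload rank keeps all nodes of one datacenter together even when two datacenters share a workload type.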
Follow these instructions to upgrade from DataStax Enterprise 4.0, 4.5, and 4.6 to DataStax Enterprise 4.7 or 4.8.
Cassandra version change
Upgrading from DataStax Enterprise 4.0, 4.5, or 4.6 to DataStax Enterprise 4.7 or 4.8 includes a major Cassandra version change. Be sure to follow the recommendations for upgrading the SSTables.
- The Cassandra version changes from 2.0 to 2.1, and the default value of commitlog_total_space_in_mb in cassandra.yaml changes from 1024 MB to 8192 MB. Adjust the commitlog_total_space_in_mb setting for your environment to ensure that you do not run out of disk space after the upgrade.
- Logging is changed from log4j to logback, with changes to logging retention policies. Configure the logger by setting options in logback.xml. See Configuring logging.
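As a rough illustration of the commit log sizing concern above, the following sketch compares free disk space with a commitlog_total_space_in_mb value. The function names are hypothetical, and the check is deliberately simplistic: it ignores data, compaction, and other consumers of the same volume.

```python
import shutil


def commitlog_fits(free_bytes: int, commitlog_total_space_in_mb: int = 8192) -> bool:
    """Return True when free_bytes can hold a commit log of the given size.

    8192 MB is the new default named above; everything else is illustrative.
    """
    return free_bytes >= commitlog_total_space_in_mb * 1024 * 1024


def check_commitlog_volume(path: str, commitlog_total_space_in_mb: int = 8192) -> bool:
    """Apply commitlog_fits to the volume that holds path.

    path is assumed to be the commit log directory configured for the node.
    """
    return commitlog_fits(shutil.disk_usage(path).free, commitlog_total_space_in_mb)
```

If the check fails for the new default, lowering commitlog_total_space_in_mb in cassandra.yaml is the adjustment the text above describes.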
General recommendations
DataStax recommends backing up your data prior to any version upgrade, including logs and custom configurations. A backup provides the ability to revert and restore all the data used in the previous version if necessary.
Upgrade restrictions and limitations
Restrictions and limitations apply while a cluster is in a partially upgraded state.
With these exceptions, the cluster continues to work as though it were on the earlier version of DataStax Enterprise until all of the nodes in the cluster are upgraded.
- General upgrade restrictions
  - Do not enable new features.
  - Do not run nodetool repair.
  - During the upgrade, do not bootstrap or decommission nodes.
  - Do not issue these types of CQL queries during a rolling restart: DDL and TRUNCATE.
  - During the upgrade, the nodes on different versions might show a schema disagreement.
  - Failure to upgrade SSTables when required results in a significant performance impact and increased disk usage. Upgrading is not complete until the SSTables are upgraded.
- Restrictions for DSE Analytic (Spark) nodes
  - Do not run analytics jobs until all nodes are upgraded.
  - Kill all Spark worker processes before you stop the node and install the new version.
- Restrictions for DSE Search (Solr) nodes
  - Do not update schemas.
  - Do not reindex DSE Search nodes during upgrade.
  - Do not issue these types of queries during a rolling restart: BATCH or TRUNCATE.
  - During the upgrade process on a cluster with mixed versions where DataStax Enterprise 4.7 or 4.8 supports pagination and earlier versions do not, issuing queries from the upgraded nodes will return only FetchSize results.
- Restrictions for nodes using any kind of security
  - Do not change security credentials or permissions until after the upgrade is complete.
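The query restrictions above could be enforced in client tooling with a simple guard. The sketch below is an assumption-laden illustration: the function name is hypothetical, DDL is approximated as statements starting with CREATE, ALTER, or DROP, and only the first keyword of the statement is inspected.

```python
# Approximation of DDL: statements that begin with one of these keywords.
DDL_KEYWORDS = {"CREATE", "ALTER", "DROP"}


def blocked_during_rolling_restart(statement: str, search_node: bool = False) -> bool:
    """Return True when the statement falls under the restrictions listed above.

    search_node=True also blocks batches, which in CQL start with BEGIN.
    """
    words = statement.strip().split(None, 1)
    if not words:
        return False
    first = words[0].upper().rstrip(";")
    if first == "TRUNCATE" or first in DDL_KEYWORDS:
        return True
    if search_node and first == "BEGIN":
        return True
    return False
```

A keyword check like this will not catch statements hidden behind comments or driver-level batch APIs, so it is a convenience rather than a safeguard.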
Preparing to upgrade from 4.0, 4.5, or 4.6 to 4.7 or 4.8
- Before upgrading, be sure that each node has adequate free disk space.
  The required space depends on the compaction strategy. See Disk space.
- Familiarize yourself with the changes and features in this release:
  - DataStax Enterprise release notes for 4.7 and 4.8.
  - General upgrading advice for any version and New features for Apache Cassandra™ 2.1 in NEWS.txt. Be sure to read the NEWS.txt all the way back to your current version.
  - Apache Cassandra™ changes in CHANGES.txt.
  - DataStax Enterprise 4.7 or 4.8 production-certified changes to Apache Cassandra.
- Verify your current product version. If necessary, upgrade to one of these required interim versions before upgrading to 4.7 or 4.8:
  - DataStax Enterprise 4.0 and later
  - DataStax Community or open source Apache Cassandra™ 2.0.x
- Upgrade the SSTables on each node to ensure that all SSTables are on the current version. This is required for DataStax Enterprise upgrades that include a major Cassandra version change.
  Warning: Failure to upgrade SSTables when required results in a significant performance impact and increased disk usage.
  nodetool upgradesstables
  If the SSTables are already on the current version, the command returns immediately and no action is taken. See SSTable compatibility and upgrade version.
  Use the --jobs option to set the number of SSTables that upgrade simultaneously. The default setting is 2, which minimizes impact on the cluster. Set to 0 to use all available compaction threads. For information about nodetool upgradesstables, including how to speed it up, see the DataStax Support KB article Nodetool upgradesstables FAQ.
- Verify the Java runtime version and upgrade to the recommended version.
  java -version
  The latest version of Oracle Java SE Runtime Environment 7 or 8 or OpenJDK 7 is recommended. The JDK is recommended for development and production systems. The JDK provides useful troubleshooting tools that are not in the JRE, such as jstack, jmap, jps, and jstat.
  Note: If using Oracle Java 7, you must use at least 1.7.0_25. If using Oracle Java 8, you must use at least 1.8.0_40.
- Run nodetool repair to ensure that data on each replica is consistent with data on other nodes.
- DSE Search nodes:
  All unique key elements must be indexed in the Solr schema. To verify, review schema.xml and ensure that every unique key field has indexed=true.
  If required, make changes to schema.xml and reload the Solr core.
- Back up the configuration files you use to a folder that is not in the directory where you normally run commands.
  The configuration files are overwritten with default values during installation of the new version.
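The minimum Java updates noted in the preparation steps above can be checked programmatically. The sketch below parses a version string in the `1.<major>.0_<update>` form that Java 7 and 8 print; the function name is hypothetical, and applying the Oracle minimums to every vendor (including OpenJDK) is an assumption, not something this page states.

```python
import re

# Minimum update numbers from the note above (Oracle Java 7 and 8).
MIN_UPDATE = {7: 25, 8: 40}

# Matches strings such as: java version "1.8.0_40"
_VERSION_RE = re.compile(r'version "1\.(\d+)\.\d+_(\d+)')


def java_version_ok(version_output: str) -> bool:
    """Return True when the reported version meets the minimum update level.

    Returns False for unparseable output, for releases without an update
    number, and for major versions other than 7 or 8.
    """
    match = _VERSION_RE.search(version_output)
    if match is None:
        return False
    major, update = int(match.group(1)), int(match.group(2))
    return major in MIN_UPDATE and update >= MIN_UPDATE[major]
```

The input can be captured from `java -version`, which writes its report to standard error rather than standard output.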
Steps for upgrading from 4.0, 4.5, or 4.6 to 4.7 or 4.8
The upgrade process for DataStax Enterprise provides minimal downtime (ideally zero). During this process, upgrade and restart one node at a time while other nodes continue to operate online. With a few exceptions, the cluster continues to work as though it were on the earlier version of DataStax Enterprise until all of the nodes in the cluster are upgraded.
- Upgrade order matters. Upgrade nodes in this order:
  - In multiple datacenter clusters, upgrade every node in one datacenter before upgrading another datacenter.
  - Upgrade the seed nodes within a datacenter first.
    For DSE Analytics nodes using DSE Hadoop, upgrade the Job Tracker node first. Then upgrade Hadoop nodes, followed by Spark nodes.
  - Upgrade nodes in this order:
    - DSE Analytics datacenters
    - Transactional/DSE Graph datacenters
    - DSE Search datacenters
- DSE Analytics nodes: Kill all Spark worker processes.
- DSE Search nodes: Review these considerations and take appropriate actions:
  - If your schema.xml contains fieldTypes using docValuesFormat="Disk", you must modify the file to remove the docValuesFormat attribute, reload, and optimize your index to rewrite to the default codec. This is a requirement for Solr 4.10 and above.
  - To maintain 4.6 query behavior:
    Disable driver pagination by editing the dse.yaml file and setting cql_solr_query_paging: off. DataStax Enterprise 4.7 or 4.8 integrates native driver paging with Solr cursor-based paging (4.7, 4.8). You can turn on paging after you verify the upgrade.
  - For upgrades from 4.0.0: See Special steps for upgrades from DataStax Enterprise 4.0.0 for special instructions.
- To flush the commit log of the old installation:
  nodetool -h hostname drain
  This step saves time when nodes start up after the upgrade, and prevents DSE Search nodes from having to reindex data.
  Important: This step is mandatory when upgrading between major Cassandra versions that change SSTable formats, rendering commit logs from the previous version incompatible with the new version.
- Stop the node (4.7, 4.8).
- Use the appropriate installation type to install the new product version on a supported platform.
  Note: Install the new product version using the same installation type that is on the system. The upgrade proceeds with installation regardless of the installation type. If you use a different installation type, the upgrade might result in issues.
- To configure the new product version:
  - Compare your backup configuration files to the new configuration files:
    - Look for any deprecated, removed, or changed settings.
    - Be sure you are familiar with the Apache Cassandra and DataStax Enterprise changes and features in the new release.
  - Merge the applicable modifications into the new version.
- Start the node.
- Verify that the upgraded datacenter names match the datacenter names in the keyspace schema definition:
  nodetool status
- Review the logs for warnings, errors, and exceptions.
  Warnings, errors, and exceptions are frequently found in the logs when starting an upgraded node. Some of these log entries are informational to help you execute specific upgrade-related steps. If you find unexpected warnings, errors, or exceptions, contact DataStax Support.
  During upgrade of DSE Analytics nodes, exceptions about the Task Tracker are logged in the nodes that are not yet upgraded to 4.7 or 4.8. The jobs succeed after the entire cluster is upgraded.
  Because DataStax Enterprise 4.7 and 4.8 use Cassandra 2.1, the output.log includes the following warnings:
  - Deprecated cassandra.yaml options are removed
    - multithreaded_compaction
    - memtable_flush_queue_size
    - compaction_preheat_key_cache
    - in_memory_compaction_limit_in_mb
    - preheat_kernel_page_cache
  - cassandra-env.sh change
    JVM_OPTS="$JVM_OPTS -javaagent:$CASSANDRA_HOME/lib/jamm-0.2.5.jar"
    to
    JVM_OPTS="$JVM_OPTS -javaagent:$CASSANDRA_HOME/lib/jamm-0.3.0.jar"
- In DataStax Enterprise 4.8, audit log tables use DateTieredCompactionStrategy (DTCS). DataStax recommends changing tables that were created in earlier releases to use DTCS:
  ALTER TABLE dse_audit.audit_log WITH COMPACTION={'class':'DateTieredCompactionStrategy'};
- Repeat the upgrade on each node in the cluster following the recommended order.
- If existing tables use the DSE In-Memory option:
  - Turn off SSTable compression:
    ALTER TABLE <tablename> WITH compression = {'sstable_compression' : ''} ;
  - Rewrite existing SSTables without compression:
    nodetool upgradesstables -a <keyspacename> <tablename>
    Use the --jobs option to set the number of SSTables that upgrade simultaneously. The default setting is 2, which minimizes impact on the cluster. Set to 0 to use all available compaction threads. For information about nodetool upgradesstables, including how to speed it up, see the DataStax Support KB article Nodetool upgradesstables FAQ.
- Upgrade the SSTables on the remaining nodes:
  nodetool upgradesstables
  Warning: Failure to upgrade SSTables when required results in a significant performance impact and increased disk usage. Upgrading is not complete until the SSTables are upgraded.
  Use the --jobs option to set the number of SSTables that upgrade simultaneously. The default setting is 2, which minimizes impact on the cluster. Set to 0 to use all available compaction threads. For information about nodetool upgradesstables, including how to speed it up, see the DataStax Support KB article Nodetool upgradesstables FAQ.
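When merging a backed-up cassandra.yaml into the new version, it can help to flag the removed options named in the log-review step of this procedure before starting the node. The following is a minimal sketch with a hypothetical function name; it scans top-level `key: value` lines as plain text instead of parsing YAML, which is a simplification.

```python
# Options listed on this page as removed in Cassandra 2.1.
DEPRECATED_OPTIONS = (
    "multithreaded_compaction",
    "memtable_flush_queue_size",
    "compaction_preheat_key_cache",
    "in_memory_compaction_limit_in_mb",
    "preheat_kernel_page_cache",
)


def find_deprecated_options(yaml_text: str) -> list:
    """Return the removed options that are still set in the given file text.

    Commented-out lines are ignored; each option is reported once, in file order.
    """
    found = []
    for line in yaml_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#") or ":" not in stripped:
            continue
        key = stripped.split(":", 1)[0].strip()
        if key in DEPRECATED_OPTIONS and key not in found:
            found.append(key)
    return found
```

Running it on the text of the backed-up file gives a short list of settings to drop while merging into the new cassandra.yaml.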