Upgrading DataStax Enterprise 5.0 to 5.1
The upgrade process for DataStax Enterprise (DSE) provides minimal downtime (ideally zero). During this process, upgrade and restart one node at a time while other nodes continue to operate online. With a few exceptions, the cluster continues to work as though it were on the earlier version of DSE until all of the nodes in the cluster are upgraded.
Back Up Your Existing Installation
DataStax recommends backing up your data prior to any version upgrade.
A backup provides the ability to revert and restore all the data used in the previous version if necessary. For manual backup instructions, see Backing up a tarball installation or Backing up a package installation.
OpsCenter provides a Backup Service that manages enterprise-wide backup and restore operations for DataStax Enterprise clusters and is highly recommended over any manual backup procedure. Ensure that you use a compatible version of OpsCenter for your DSE version.
Upgrade SSTables
Be certain to upgrade SSTables on your nodes both before and after upgrading. Failure to upgrade SSTables results in severe performance penalties and possible data loss.
Version-Specific Notes
DSE Search changes: As of DSE 5.1.17, unbounded facet searches are no longer allowed using |
Upgrade Restrictions and Limitations
Restrictions and limitations apply while a cluster is in a partially upgraded state. The cluster continues to work as though it were on the earlier version of DataStax Enterprise until all of the nodes in the cluster are upgraded.
General Restrictions
-
Do not enable new features.
-
Ensure OpsCenter compatibility.
OpsCenter version | DSE version |
---|---|
6.8 | 6.8, 6.7, 6.0, 5.1 |
6.7 | 6.0 |
6.5 | 6.0, 5.1, 5.0 (EOL) |
6.1 | 5.1, 5.0 (EOL) |
6.0 | 5.0 (EOL), 4.8 (EOSL), 4.7 (EOSL) |
-
Do not run nodetool repair.
-
Stop the OpsCenter Repair Service if enabled: 6.8.
-
During the upgrade, do not bootstrap new nodes or decommission existing nodes.
-
Do not issue TRUNCATE or DDL related queries during the upgrade process.
-
Do not alter schemas for any workloads.
-
Complete the cluster-wide upgrade before the expiration of gc_grace_seconds (default 10 days) to ensure any repairs complete successfully.
-
If the DSE Performance Service was disabled before the upgrade, do not enable it during the upgrade. See DSE Performance Service: 5.1 | OpsCenter 6.8.
Nodes on different versions might show a schema disagreement during an upgrade.
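The gc_grace_seconds restriction above amounts to a simple deadline calculation. The following is a minimal sketch, assuming the default of 10 days and a hypothetical upgrade start time; tables with a custom gc_grace_seconds need their own value passed in.

```python
from datetime import datetime, timedelta, timezone

# Default stated in the restriction above; individual tables may override it.
DEFAULT_GC_GRACE_SECONDS = 10 * 24 * 60 * 60  # 10 days


def upgrade_deadline(started_at, gc_grace_seconds=DEFAULT_GC_GRACE_SECONDS):
    """Return the time by which the cluster-wide upgrade should be finished."""
    return started_at + timedelta(seconds=gc_grace_seconds)


def time_remaining(started_at, now, gc_grace_seconds=DEFAULT_GC_GRACE_SECONDS):
    """Return the time left in the window (negative once the window has passed)."""
    return upgrade_deadline(started_at, gc_grace_seconds) - now


# Hypothetical start time for the first node of the rolling upgrade.
start = datetime(2024, 3, 1, 9, 0, tzinfo=timezone.utc)
print(upgrade_deadline(start).isoformat())  # 2024-03-11T09:00:00+00:00
```

When tables use different values, the smallest gc_grace_seconds among them is the one that bounds the upgrade window.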
Restrictions for Nodes Using Security
-
Do not change security credentials or permissions until the upgrade is complete on all nodes.
-
If you are not already using Kerberos, do not set up Kerberos authentication before upgrading. First upgrade the cluster, and then set up Kerberos.
Restrictions for DSE Analytics Nodes
Spark versions change between major DSE versions. DSE release notes [5.x | 6.8.x] indicate which version of Spark is used.
When upgrading to a major version of DSE, all nodes in a DSE datacenter that run Spark must be on the same version of Spark and the Spark jobs must be compiled for that version. Each datacenter acting as a Spark cluster must be on the same upgraded DSE version before reinitiating Spark jobs.
In the case where Spark jobs run against Graph keyspaces, you must update all of the nodes in the cluster first to avoid Spark jobs failing.
Driver Version Impacts
Be sure to check driver compatibility. Depending on the driver version, you might need to recompile your client application code.
DataStax drivers come in two types:
-
DataStax drivers for DataStax Enterprise (DSE) — for use by DSE 4.8 and later
-
DataStax drivers for Apache Cassandra® — for use by Apache Cassandra and DSE 4.7 and earlier
While the DataStax drivers for Apache Cassandra can connect to DSE 5.0 and later clusters, DataStax strongly recommends upgrading to the DSE drivers. The DSE drivers provide functionality for all DataStax Enterprise (DSE) features.
During upgrades, you might experience driver-specific impact when clusters have mixed versions of drivers. If your cluster has mixed versions, the protocol version is negotiated with the first host to which the driver connects, although certain drivers, such as Java 4.x/2.x, automatically select a protocol version that works across nodes. To avoid driver version incompatibility during upgrades, use one of these workarounds:
-
Protocol version: Set the protocol version explicitly in your application at startup. Switch the Java driver to the new protocol version only after the upgrade is complete on all nodes in the cluster.
-
Initial contact points: Ensure that the list of initial contact points contains only hosts with the oldest DSE version or protocol version. For example, the initial contact points contain only protocol version 2.
For details on protocol version negotiation, see protocol versions with mixed clusters in the Java driver version you are using, for example, Java driver.
Starting January 2020, you can use the same DataStax driver for Apache Cassandra® (OSS) and DataStax Enterprise. DataStax has unified drivers to avoid user confusion and enhance the OSS drivers with some of the features in the DSE drivers. For more information, see the Better Drivers for Cassandra blog.
Advanced Preparation for Upgrading DSE Search and SearchAnalytics nodes
Before continuing, complete all the advanced preparation steps on DSE Search and SearchAnalytics nodes while DSE 5.0 is still running.
Changes to DSE Search and DSE SearchAnalytics between version 5.0 and both versions 5.1 and 6.x are extensive. Plan sufficient time to implement and test the required changes before the upgrade. Contact the DataStax Support team with any questions or for help with upgrading.
Schema changes may require a full reindex, and configuration changes require reloading the core.
Make the following changes as required:
-
Change HTTP API queries to CQL queries:
-
Delete-by-id is removed; use CQL DELETE by primary key instead.
-
Delete-by-query no longer supports wildcards; use CQL TRUNCATE instead.
-
-
If any Solr core was created on DSE 4.6 or earlier and never reindexed after being upgraded to DSE 4.7 or later, you must reindex on DSE 5.0 before upgrading to DSE 5.1:
dsetool reload_core keyspace_name.table_name schema=filepath solrconfig=filepath reindex=true deleteAll=true distributed=false
You must reindex all nodes before beginning the upgrade.
-
If you are using Apache Solr SolrJ, the minimum required Solr version is 6.0.0. To find the current Solr version:
installation_directory/bin/solr status
For information on upgrading Apache Solr, see Upgrading Solr.
-
For SpatialRecursivePrefixTreeFieldType (RPT) in search schemas, you must adjust your queries for these changes:
-
IsDisjointTo is no longer supported in queries on SpatialRecursivePrefixTreeFieldType. Replace IsDisjointTo with a NOT Intersects query. For example:
foo:0,0 TO 1000,1000 AND -"Intersects(POLYGON((338 211, 338 305, 404 305, 404 211, 338 211)))")
-
The ENVELOPE syntax is now required for WKT-style queries against SpatialRecursivePrefixTreeFieldType fields. You must specify ENVELOPE(10, 15, 15, 10), where queries on earlier releases could specify 10 10 15 15. See Spatial Search for details on using distanceUnits in spatial queries.
-
-
Edit the solrconfig.xml file and make these changes, as needed:
-
Remove these unsupported Solr requestHandlers:
-
XmlUpdateRequestHandler
-
BinaryUpdateRequestHandler
-
CSVRequestHandler
-
JsonUpdateRequestHandler
-
DataImportHandler
For example:
<requestHandler name="/dataimport" class="solr.DataImportHandler"/>
or
<requestHandler name="/update" class="solr.XmlUpdateRequestHandler"/>
-
-
Change the directoryFactory from:
<directoryFactory name="DirectoryFactory" class="${solr.directoryFactory:solr.StandardDirectoryFactory}"/>
to
<directoryFactory name="DirectoryFactory" class="solr.StandardDirectoryFactory"/>
-
<unlockOnStartup> is unsupported.
-
Change the updateLog from:
<updateLog class="solr.FSUpdateLog" force="false">
to
<updateLog force="false">
For more information on solrconfig.xml, see Configuring solrconfig.xml.
-
-
The Circle syntax is no longer a part of Well-Known-Text (WKT); therefore, Spatial Search queries such as:
Intersects(Circle(10 10 d=2))
must be rewritten as:
Intersects(BUFFER(POINT(10 10), 2))
-
Upgrading DSE search nodes requires replacing unsupported Solr types with supported types.
Special handling is also required for BCDStrField, addressed in step 10.
Sorting limitations apply to mixed version clusters. Some of the removed Solr types, due to the way they marshal sort values during distributed queries (combined with the way the suggested new types unmarshal sort values), cannot be sorted on during rolling upgrades when some nodes use an unsupported type and other nodes use the suggested new type. The following type transitions are problematic:
Removed Solr field types | Supported Solr field types |
---|---|
ByteField | TrieIntField |
DateField | TrieDateField |
BCDIntField | TrieIntField |
BCDLongField | TrieLongField |
Two options are available:
-
Avoid sorting on removed Solr field types until the upgrade is complete for all nodes in the datacenter being queried.
When using two search datacenters, isolate queries to a single datacenter and then change the schema and reindex the other datacenter. Then isolate queries to the newly reindexed datacenter while you change the schema and upgrade the first datacenter.
-
If you are using the BCDIntField or the BCDLongField type, update the schema to replace them with types that are sort-compatible with the supported Solr types TrieIntField and TrieLongField:
Removed Solr field types | Interim sort-compatible supported Solr field types |
---|---|
BCDIntField | SortableIntField |
BCDLongField | SortableLongField |
Change the schema in a distributed fashion, and do not reindex. After the schema is updated on all nodes, then go on to step 9.
-
Update the schema and configuration for the Solr field types that are removed from Solr 5.5 and later.
-
Update the schema to replace unsupported Solr field types with supported Solr field types:
Removed Solr field types | Supported Solr field types |
---|---|
ByteField | TrieIntField |
DateField | TrieDateField |
DoubleField | TrieDoubleField |
FloatField | TrieFloatField |
IntField | TrieIntField |
LongField | TrieLongField |
ShortField | TrieIntField |
SortableDoubleField | TrieDoubleField |
SortableFloatField | TrieFloatField |
SortableIntField | TrieIntField |
SortableLongField | TrieLongField |
BCDIntField | TrieIntField |
BCDLongField | TrieLongField |
BCDStrField (see upgrade data type, if used) | TrieIntField |
-
If you are using type mapping version 0, or you do not specify a type mapper, verify or update the solrconfig.xml to use dseTypeMappingVersion 1:
<dseTypeMappingVersion>1</dseTypeMappingVersion>
If the Solr core is backed by a CQL table and the type mapping is unspecified, use type mapping version 2.
For more information on solrconfig.xml, see https://lucene.apache.org/solr/guide/7_6/configuring-solrconfig-xml.html.
-
Reload the core:
dsetool reload_core keyspace_name.table_name schema=filepath solrconfig=filepath
If you were using the unsupported data types, do a full reindex node-by-node:
dsetool reload_core keyspace_name.table_name schema=filepath solrconfig=filepath reindex=true deleteAll=true distributed=false
In DSE 5.1 and later, auto-generated schemas use data type mapper 2.
-
If using BCDStrField: In DSE 5.0 and earlier, DSE mapped Cassandra text columns to BCDStrField. The deprecated BCDStrField is removed.
The recommended strategy is to upgrade the data type to TrieIntField. However, DSE cannot map text directly to TrieIntField. If you are using BCDStrField, you must complete one of these options before the upgrade:
-
If BCDStrField is no longer used, remove the BCDStrField field from the Solr schema. Reindexing is not required.
-
If you want to index the field as a TrieIntField, and a full reindex is acceptable, change the underlying database column to use the type int.
-
If you want to keep the database column as text and you still want to do simple matching queries on the indexed field, switch from BCDStrField to StrField in the schema. Reindexing should not be required, but the field will no longer be appropriate for numeric range queries or sorting because StrField uses a lexicographic order, not a numeric one.
-
Not recommended: If you want to keep the database column as text and still want to perform numeric range queries and sorts on the former BCDStrField, but would rather change your application than perform a full reindex:
-
Change the field to StrField in the Solr schema with indexed=false.
-
Add a new copy field with the type TrieIntField that has its values supplied by the original BCDStrField. This solution still requires a reindex to work, because the copy field target must be populated. This non-recommended option is supplied only to support a sub-optimal data model; for example, a text column with values that would fit only into an int.
After you make these schema changes, do a rolling, node-by-node reload_core:
dsetool reload_core keyspace_name.table_name schema=filepath solrconfig=filepath reindex=true deleteAll=true distributed=false
If you have two datacenters and upgrade them one at a time, reload the core with distributed=true and deleteAll=true.
-
-
-
Tune the schema before you upgrade. After the upgrade, all field definitions in the schema are validated and must be DSE Search compatible, even if the fields are not indexed, have docValues applied, or are used for copy-field source. The default behavior of automatic resource generation includes all columns. To improve performance, take action to prevent the fields from being loaded from the database. Include only the required fields in the schema by removing or commenting out unused fields in the schema.
-
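The field type replacements listed in the preparation steps above can be applied mechanically before a manual review. The following is a minimal sketch, assuming the schema declares types as fieldType elements with a class attribute such as solr.IntField; the sample schema and the function name are hypothetical. BCDStrField is deliberately left out because it needs the special handling described above, and the sketch applies the final Trie types rather than the interim sort-compatible ones.

```python
import xml.etree.ElementTree as ET

# Final replacements from the "Removed / Supported Solr field types" table.
REPLACEMENTS = {
    "ByteField": "TrieIntField",
    "DateField": "TrieDateField",
    "DoubleField": "TrieDoubleField",
    "FloatField": "TrieFloatField",
    "IntField": "TrieIntField",
    "LongField": "TrieLongField",
    "ShortField": "TrieIntField",
    "SortableDoubleField": "TrieDoubleField",
    "SortableFloatField": "TrieFloatField",
    "SortableIntField": "TrieIntField",
    "SortableLongField": "TrieLongField",
    "BCDIntField": "TrieIntField",
    "BCDLongField": "TrieLongField",
}


def replace_removed_types(schema_xml):
    """Return (updated_xml, changes) where changes lists (name, old, new)."""
    root = ET.fromstring(schema_xml)
    changes = []
    for field_type in root.iter("fieldType"):
        old_class = field_type.get("class", "")
        prefix, _, short_name = old_class.rpartition(".")
        if short_name in REPLACEMENTS:
            new_class = (prefix + "." if prefix else "") + REPLACEMENTS[short_name]
            field_type.set("class", new_class)
            changes.append((field_type.get("name"), old_class, new_class))
    # Note: ElementTree drops XML comments and the XML declaration on output.
    return ET.tostring(root, encoding="unicode"), changes


# Hypothetical schema fragment used only for illustration.
SAMPLE = """<schema name="demo" version="1.5">
  <types>
    <fieldType name="int" class="solr.IntField"/>
    <fieldType name="when" class="solr.DateField"/>
    <fieldType name="text" class="solr.StrField"/>
  </types>
</schema>"""

updated, changes = replace_removed_types(SAMPLE)
for name, old, new in changes:
    print(f"{name}: {old} -> {new}")
```

Because the output loses comments, it is probably safer to use the change list as a checklist for editing the real schema by hand than to write the generated XML back directly.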
Advanced Preparation for Upgrading DSE Graph Nodes with Search Indexes
These steps apply to graph nodes that have search indexes. Before continuing, complete these advanced preparation steps while DSE 5.0 is still running.
Upgrading DSE Graph nodes with search indexes requires these edits to the solrconfig file. Configuration changes require reloading the core. Plan sufficient time to implement and test changes that are required before the upgrade.
Edit solrconfig.xml and make these changes, as needed:
-
Remove these unsupported Solr requestHandlers:
-
XmlUpdateRequestHandler
-
BinaryUpdateRequestHandler
-
CSVRequestHandler
-
JsonUpdateRequestHandler
-
DataImportHandler
For example:
<requestHandler name="/dataimport" class="solr.DataImportHandler"/>
or
<requestHandler name="/update" class="solr.XmlUpdateRequestHandler"/>
-
-
Remove <unlockOnStartup>.
-
Reload the core:
dsetool reload_core keyspace_name.table_name reindex=false
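The solrconfig.xml edits above (dropping the unsupported request handlers and the unlockOnStartup element) can be sketched as a small script. This is an illustration only, assuming handlers are declared as requestHandler elements with a class attribute; the sample configuration and the function name are hypothetical.

```python
import xml.etree.ElementTree as ET

# Handler classes named as unsupported in the steps above.
UNSUPPORTED_HANDLERS = {
    "XmlUpdateRequestHandler",
    "BinaryUpdateRequestHandler",
    "CSVRequestHandler",
    "JsonUpdateRequestHandler",
    "DataImportHandler",
}


def clean_solrconfig(config_xml):
    """Return (cleaned_xml, removed) after dropping unsupported elements."""
    root = ET.fromstring(config_xml)
    removed = []
    # Snapshot the tree first so removals do not disturb the iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            short_name = child.get("class", "").rpartition(".")[2]
            if child.tag == "requestHandler" and short_name in UNSUPPORTED_HANDLERS:
                parent.remove(child)
                removed.append(child.get("name"))
            elif child.tag == "unlockOnStartup":
                parent.remove(child)
                removed.append("unlockOnStartup")
    # Note: ElementTree drops XML comments and the XML declaration on output.
    return ET.tostring(root, encoding="unicode"), removed


# Hypothetical configuration fragment used only for illustration.
SAMPLE = """<config>
  <indexConfig>
    <unlockOnStartup>false</unlockOnStartup>
  </indexConfig>
  <requestHandler name="/dataimport" class="solr.DataImportHandler"/>
  <requestHandler name="/update" class="solr.XmlUpdateRequestHandler"/>
  <requestHandler name="/select" class="solr.SearchHandler"/>
</config>"""

cleaned, removed = clean_solrconfig(SAMPLE)
print(sorted(removed))
```

After editing the real file, the core still has to be reloaded with dsetool reload_core as described in the last step.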
Advanced Preparation for Upgrading DSE Analytics Nodes
Before upgrading DSE Analytics nodes:
-
DSE versions earlier than 5.1 use an older version of Spark and applications written using that version (1.6) may not be compatible with Spark 2.2. You must recompile all DSE 5.0 Scala Spark applications against Scala 2.11 and use only Scala 2.11 third-party libraries.
Changing the dse-spark-dependencies in your build files is not sufficient to change the compilation target. See the example projects for how to set up your build files.
-
Spark applications should use dse:// URLs instead of spark://spark_master_IP:Spark_RPC_port_number URLs, as described in Specifying Spark URLs.
-
Modify calls to setMaster and setAppName.
For example, the following code works in DSE 5.0 but will not work in DSE 5.1 or later.
val conf = new SparkConf(true)
  .setMaster("spark://192.168.123.10:7077")
  .setAppName("cassandra-demo")
  .set("cassandra.connection.host", "192.168.123.10") // initial contact
  .set("cassandra.username", "cassandra")
  .set("cassandra.password", "cassandra")
val sc = new SparkContext(conf)
To connect, modify the call to setMaster so that it uses a dse:// URL:
val conf = new SparkConf(true)
  .setAppName("cassandra-demo")
  .setMaster("dse://192.168.123.10:7077")
  .set("cassandra.connection.host", "192.168.123.10") // initial contact
  .set("cassandra.username", "cassandra")
  .set("cassandra.password", "cassandra")
val sc = new SparkContext(conf)
Location of Configuration Files
DataStax Enterprise and Apache Cassandra configuration files
Configuration file | Installer-Services and package installations | Installer-No Services and tarball installations |
---|---|---|
dse | /etc/default/dse (systemd) or /etc/init.d/ (SystemV) | N/A. Node type is set via command line flags. |
dse-env.sh | /etc/dse/dse-env.sh | <installation_location>/bin/dse-env.sh |
byoh-env.sh | /etc/dse/byoh-env.sh | <installation_location>/bin/byoh-env.sh |
dse.yaml | /etc/dse/dse.yaml | <installation_location>/resources/dse/conf/dse.yaml |
logback.xml | /etc/dse/cassandra/logback.xml | <installation_location>/resources/logback.xml |
spark-env.sh | /etc/dse/spark/spark-env.sh | <installation_location>/resources/spark/conf/spark-env.sh |
spark-defaults.conf | /etc/dse/spark/spark-defaults.conf | <installation_location>/resources/spark/conf/spark-defaults.conf |

Configuration file | Installer-Services and package installations | Installer-No Services and tarball installations |
---|---|---|
cassandra.yaml | /etc/dse/cassandra/cassandra.yaml | <installation_location>/conf/cassandra.yaml |
cassandra.in.sh | /usr/share/cassandra/cassandra.in.sh | <installation_location>/bin/cassandra.in.sh |
cassandra-env.sh | /etc/dse/cassandra/cassandra-env.sh | <installation_location>/conf/cassandra-env.sh |
cassandra-rackdc.properties | /etc/dse/cassandra/cassandra-rackdc.properties | <installation_location>/conf/cassandra-rackdc.properties |
cassandra-topology.properties | /etc/dse/cassandra/cassandra-topology.properties | <installation_location>/conf/cassandra-topology.properties |
jmxremote.password | /etc/cassandra/jmxremote.password | <installation_location>/conf/jmxremote.password |

Configuration file | Installer-Services and package installations | Installer-No Services and tarball installations |
---|---|---|
server.xml | /etc/dse/resources/tomcat/conf/server.xml | <installation_location>/resources/tomcat/conf/server.xml |
Preparing to Upgrade
Follow these steps to prepare each node for the upgrade:
These steps are performed in your current version and use DSE 5.0 documentation.
-
Upgrade to the latest patch release on your current version. Fixes included in the latest patch release can simplify the upgrade process.
Get the current DSE version:
bin/dse -v
current_dse_version
-
Familiarize yourself with the changes and features in the new release:
-
DataStax Enterprise 5.1 release notes.
-
General upgrading advice for any version. Be sure to read NEWS.txt all the way back to your current version.
-
DataStax Enterprise changes in CHANGES.txt.
-
DataStax driver changes.
-
-
-
Before upgrading, be sure that each node has adequate free disk space.
Determine current DSE data disk space usage:
sudo du -sh /var/lib/cassandra/data/
3.9G /var/lib/cassandra/data/
Determine available disk space:
sudo df -hT /
Filesystem Type Size Used Avail Use% Mounted on
/dev/sda1 ext4 59G 16G 41G 28% /
The required space depends on the compaction strategy. See Disk space.
-
Upgrade the SSTables on each node to ensure that all SSTables are on the current version:
nodetool upgradesstables
Failure to upgrade SSTables when required results in a significant performance impact and increased disk usage.
Use the --jobs option to set the number of SSTables that upgrade simultaneously. The default setting is 2, which minimizes impact on the cluster. Set to 0 to use all available compaction threads. DataStax recommends running the upgradesstables command on one node at a time or, when using racks, one rack at a time.
If the SSTables are already on the current version, the command returns immediately and no action is taken.
-
Ensure that keyspace replication factors are correct for your environment:
cqlsh --execute "DESCRIBE KEYSPACE keyspace-name;" | grep "replication"
CREATE KEYSPACE keyspace-name WITH replication = {'class': 'NetworkTopologyStrategy', 'replication_factor': '3'} AND durable_writes = true;
-
Check the keyspace replication factor for analytics keyspaces.
-
Check the keyspace replication factor for system_auth and dse_security keyspaces.
-
-
Verify the Java runtime version and upgrade to the recommended version.
java -version
openjdk version "1.8.0_222"
OpenJDK Runtime Environment (build 1.8.0_222-8u222-b10-1ubuntu1~18.04.1-b10)
OpenJDK 64-Bit Server VM (build 25.222-b10, mixed mode)
-
Recommended: OpenJDK 8 (1.8.0_151 minimum)
Recommendation changed due to the end of public updates for Oracle JRE/JDK 8. See Oracle Java SE Support Roadmap.
-
Supported: Oracle Java SE 8 (JRE or JDK) (1.8.0_151 minimum)
Although Oracle JRE/JDK 8 is supported, DataStax does more extensive testing on OpenJDK 8.
-
-
Run nodetool repair to ensure that data on each replica is consistent with data on other nodes:
nodetool repair -pr
-
Install the libaio package for optimal performance.
RHEL platforms:
sudo yum install libaio
Debian:
sudo apt-get install libaio1
-
Back up any customized configuration files since they may be overwritten with default values during installation of the new version.
If you backed up your installation using the instructions in Backing up a tarball installation or Backing up a package installation, your original configuration files are included in the archive.
-
Upgrades from 5.0.0 to 5.0.8 and from DSE 5.1.0 and 5.1.1 to DSE 5.1.2 and later releases:
Restart the node with this start-up parameter:
-Dcassandra.force_3_0_protocol_version=true
For example:
installation_location/bin/dse cassandra -Dcassandra.force_3_0_protocol_version=true
While mixed versions exist during the upgrade, do not add or remove columns from existing tables.
After the restart is complete, remove the flag.
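The disk space check in the preparation steps above can also be scripted. The following is a minimal sketch, assuming a hypothetical headroom_factor that expresses how much free space is wanted relative to the current data size; the actual requirement depends on the compaction strategy, so the default of 1.0 is only a placeholder.

```python
import os
import shutil


def directory_size(path):
    """Sum the sizes in bytes of regular files under path (like du, roughly)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            if not os.path.islink(full_path):
                total += os.path.getsize(full_path)
    return total


def has_headroom(data_bytes, free_bytes, headroom_factor=1.0):
    """True if free space covers the data size times the chosen factor."""
    return free_bytes >= data_bytes * headroom_factor


def check_data_dir(data_dir, headroom_factor=1.0):
    """Return (data_bytes, free_bytes, ok) for the filesystem holding data_dir."""
    data_bytes = directory_size(data_dir)
    free_bytes = shutil.disk_usage(data_dir).free
    return data_bytes, free_bytes, has_headroom(data_bytes, free_bytes, headroom_factor)
```

For the package layout shown in the steps above, a call such as check_data_dir("/var/lib/cassandra/data", headroom_factor=1.0) would report the same two figures as the du and df commands, plus a pass or fail flag for the chosen factor.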
Upgrade Steps
Follow these steps on each node in the recommended order. The upgrade process requires upgrading and restarting one node at a time.
These steps are performed in your upgraded version and use DSE 5.1 documentation.
The DataStax installer upgrades DataStax Enterprise and automatically performs many upgrade tasks.
-
Flush the commit log of the current installation:
nodetool drain
-
DSE Analytics nodes only: Kill all Spark worker processes:
for pid in $(jps | grep Worker | awk '{print $1}'); do kill -9 $pid; done
-
Stop the node:
-
Package installations:
sudo service dse stop
-
Tarball installations:
installation_dir/bin/dse cassandra-stop
-
-
Use the appropriate method to install the new product version on a supported platform:
-
Install the new product version using the same installation type that is on the system, otherwise problems might result.
TTL expiration timestamps are susceptible to the year 2038 problem. If the TTL value is long enough that the expiration date is later than the maximum threshold of 2038-01-19T03:14:06+00:00, the data is immediately expired and purged on the next compaction. DataStax strongly recommends upgrading to DSE 5.1.7 or later and taking required action to protect against silent data loss.
-
To configure the new product version:
-
The upgrade installs a new server.xml for Tomcat 8. If your existing server.xml has custom connectors, migrate those connectors to the new server.xml before starting the upgraded nodes.
-
Compare changes in the new configuration files with the backup configuration files after the upgrade but before restarting, remove deprecated settings, and update any new settings if required.
You must use the new configuration files that are generated from the upgrade installation. Copy any parameters needed from your old configuration files into these new files.
Do not replace the newly-generated configuration files with the old files.
Use the DSE yaml_diff tool to compare backup YAML files with the upgraded YAML files:
cd /usr/share/dse/tools/yamls
./yaml_diff path/to/yaml-file-old path/to/yaml-file-new
...
CHANGES
=========
authenticator:
- AllowAllAuthenticator
+ com.datastax.bdp.cassandra.auth.DseAuthenticator
authorizer:
- AllowAllAuthorizer
+ com.datastax.bdp.cassandra.auth.DseAuthorizer
roles_validity_in_ms:
- 2000
+ 120000
...
-
-
When upgrading DSE to versions earlier than 5.1.16, 6.0.8, or 6.7.4 inclusive, if any tables are using DSE Tiered Storage, remove all txn_compaction log files from second-level tiers and lower. For example, given the following dse.yaml configuration, remove txn_compaction log files from /mnt2 and /mnt3 directories:
tiered_storage_options:
    strategy1:
        tiers:
            - paths:
                - /mnt1
            - paths:
                - /mnt2
            - paths:
                - /mnt3
The following example removes the files using the find command:
find /mnt2 -name "*_txn_compaction_*.log" -type f -delete && find /mnt3 -name "*_txn_compaction_*.log" -type f -delete
Failure to complete this step may result in data loss.
-
DSE Analytics nodes only: If your DSE 5.0 clusters had any datacenters running in Analytics Hadoop mode and if the DseSimpleSnitch was used, you must use one of these options for starting nodes in your cluster. Select the option that works best for your environment:
-
For nodes in the datacenters running in Analytics Hadoop mode, start those nodes in Spark mode.
-
Add the start-up parameter -Dcassandra.ignore_dc=true for each node, then start in cassandra mode. This flag is required only once after upgrading. Subsequent restarts do not use this flag. You can leave the flag in the configuration file or remove it after the first restart of each node.
-
-
Start the node.
-
Installer-Services and package installations:
sudo service dse start
-
Installer-No Services and Tarball installations:
installation_dir/bin/dse cassandra
-
-
Verify that the upgraded datacenter names match the datacenter names in the keyspace schema definition:
-
Get the node’s datacenter name:
nodetool status | grep "Datacenter"
Datacenter: datacenter-name
-
Verify that the node’s datacenter name matches the datacenter name for a keyspace:
cqlsh --execute "DESCRIBE KEYSPACE keyspace-name;" | grep "replication"
CREATE KEYSPACE keyspace-name WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter-name': '3'};
-
-
Review the logs for warnings, errors, and exceptions:
grep -w 'WARN\|ERROR\|exception' /var/log/cassandra/*.log
Warnings, errors, and exceptions are frequently found in the logs when starting an upgraded node. Some of these log entries are informational to help you execute specific upgrade-related steps. If you find unexpected warnings, errors, or exceptions, contact DataStax Support.
Non-standard log locations are configured in dse-env.sh.
-
Repeat the upgrade process on each node in the cluster following the recommended order.
-
After the entire cluster upgrade is complete: upgrade the SSTables on one node at a time or, when using racks, one rack at a time.
Failure to upgrade SSTables when required results in a significant performance impact, increased disk usage, and possible data loss. Upgrading is not complete until the SSTables are upgraded.
nodetool upgradesstables
Use the --jobs option to set the number of SSTables that upgrade simultaneously. The default setting is 2, which minimizes impact on the cluster. Set to 0 to use all available compaction threads. DataStax recommends running the upgradesstables command on one node at a time or, when using racks, one rack at a time.
You can run the upgradesstables command before all the nodes are upgraded as long as you run the command on only one node at a time or, when using racks, one rack at a time. Running upgradesstables on too many nodes at once will degrade performance.
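The year 2038 TTL limit mentioned in the installation step above comes down to comparing the write time plus the TTL against a fixed threshold. The following is a minimal sketch of that comparison; the function names are hypothetical and the sample write time is chosen only for illustration.

```python
from datetime import datetime, timedelta, timezone

# Maximum expiration threshold quoted in the installation step.
MAX_EXPIRATION = datetime(2038, 1, 19, 3, 14, 6, tzinfo=timezone.utc)


def ttl_overflows(ttl_seconds, written_at):
    """True if a write at written_at with this TTL would expire past the threshold."""
    return written_at + timedelta(seconds=ttl_seconds) > MAX_EXPIRATION


def max_safe_ttl(written_at):
    """Largest TTL in seconds that still expires on or before the threshold."""
    return max(0, int((MAX_EXPIRATION - written_at).total_seconds()))


# Hypothetical write time used only for illustration.
written = datetime(2024, 1, 19, 3, 14, 6, tzinfo=timezone.utc)
twenty_years = 20 * 365 * 24 * 60 * 60

print(ttl_overflows(twenty_years, written))    # a 20-year TTL lands in 2044
print(ttl_overflows(10 * 24 * 3600, written))  # a 10-day TTL is well inside the limit
```

A check like this only flags risky TTL values in application code; it does not replace upgrading to a release that guards against the problem, as recommended above.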
General Post-Upgrade Steps
After all nodes are upgraded:
-
If you use the OpsCenter Repair Service, turn on the Repair Service.
-
If you encounter serialization-header errors, stop the node and repair them using the sstablescrub -e option:
sstablescrub -e fix-only keyspace table
For more details on serialization-header errors and repairs, see DSE 5.0 SSTables with UDTs corrupted after upgrading to DSE 5.1, 6.0, or 6.7.
-
Review your security configuration. To use security, enable and configure DSE Unified Authentication.
In cassandra.yaml, the default authenticator is DseAuthenticator and the default authorizer is DseAuthorizer. Other authenticators and authorizers are no longer supported. Security is disabled in dse.yaml by default.
DSE Analytics nodes: In DSE 5.1 a new web interface requires authorization when security is enabled. See Monitoring Spark with the web interface.
-
TimeWindowCompactionStrategy (TWCS) is set only on new dse_perf and dse_audit_log tables. Manually change dse_perf and dse_audit_log tables that were created in earlier releases to use TWCS. For example:
ALTER TABLE dse_perf.read_latency_histograms WITH COMPACTION={'class':'TimeWindowCompactionStrategy'};
ALTER TABLE dse_audit_log.audit_log WITH COMPACTION={'class':'TimeWindowCompactionStrategy'};
Post-Upgrade Steps for DSE Search Nodes
For DSE Search nodes:
-
Index time boost support is removed in DSE 5.1.1 and later. Use query time boosting instead. Delete any _docBoost columns in backing CQL tables:
DELETE _docBoost FROM table-name IF EXISTS;
Thrift tables where the _docBoost column existed are allowed, but the _docBoost column is ignored. Thrift tables are not able to drop the column.
-
If SpatialRecursivePrefixTreeFieldType (RPT) is used in the search schema, replace the units field type with a suitable (degrees, kilometers, or miles) distanceUnits, and then verify that spatial queries behave as expected.
-
For optimal indexing of multipolygon shapes, you must set useJtsMulti="false". For example:
<fieldType autoIndex="true" useJtsMulti="false" class="solr.SpatialRecursivePrefixTreeFieldType" distErrPct="0.0125" distanceUnits="kilometers" geo="true" name="WktField" spatialContextFactory="org.locationtech.spatial4j.context.jts.JtsSpatialContextFactory"/>
-
If you are using HTTP API writes with JSON documents (deprecated), a known issue may cause the auto-generated solrconfig.xml to have an invalid requestHandler for JSON core creations. If necessary, change the auto-generated solrconfig.xml:
<requestHandler name="/update/json" class="solr.UpdateUpdateRequestHandler" startup="lazy"/>
to
<requestHandler name="/update/json" class="solr.UpdateRequestHandler" startup="lazy"/>
For more information on solrconfig.xml, see https://lucene.apache.org/solr/guide/7_6/configuring-solrconfig-xml.html.
-
Do a full reindex of all encrypted search indexes on each node in your cluster:
dsetool reload_core keyspace_name.table_name distributed=false reindex=true deleteAll=true
Plan sufficient time after the upgrade is complete to reindex with deleteAll=true on all nodes.
Post-Upgrade Steps for DSEFS-Enabled Nodes
A new schema is available for DSEFS.
The new |
A multi-datacenter setup for |
If you have no data in DSEFS or if you are using DSEFS only for temporary data, follow these steps to use the new schema:
-
Stop the node:
-
Package installations:
sudo service dse stop
-
Tarball installations:
installation_dir/bin/dse cassandra-stop
-
-
Clear the dsefs data directories on each node.
For example, if the dsefs_options section of dse.yaml has data_directories configured as:
dsefs_options:
    ...
    data_directories:
        - dir: /var/lib/dsefs/data
the following command removes the directories:
rm -r /var/lib/dsefs/data/*
-
In the dsefs_options section of dse.yaml, change the keyspace_name parameter to a different name:
##########################
# DSE File System options
dsefs_options:
    ...
    keyspace_name: new_keyspace_name
-
Start the node.
-
Installer-Services and package installations:
sudo service dse start
-
Installer-No Services and Tarball installations:
installation_dir/bin/dse cassandra
-
-
If you backed up existing DSEFS data before the upgrade, copy the data back into DSEFS from local storage.
dse hadoop fs -cp /local_backup_location/* /dsefs_data_directory/
-
OPTIONAL: Drop the old dsefs keyspace:
DROP KEYSPACE dsefs
Post-Upgrade Steps for DSE Analytics Nodes
If you are using Spark SQL tables, migrate them to the new Hive metastore format:
dse spark-sql-metastore-migrate
Post-Upgrade Steps for DSE Advanced Replication
Because DSE Advanced Replication is substantially revised, you must migrate to the newer version in DSE 5.1. Both V1 and V2 are run in DSE 5.1 in parallel to accomplish the migration:
-
Create a V2 destination for the hub configured in V1:
dse advrep destination create --name dest_name --addresses ip_address --transmission-enabled false
-
Create a V2 channel for each V1 channel:
dse advrep channel create --source-keyspace keyspace_name --source-table table_name --destination dest_name --source-id source_id --source-id-column source_column_id --priority 1
-
Resume collection and transmission:
dse advrep channel resume --source-keyspace keyspace_name --source-table table_name --collection-enabled true --transmission-enabled true
-
Make sure the V2 is running and replicating, then disable collection on the V1 channels:
dse advrep --v1 edge channel pause --keyspace keyspace_name --table table_name
-
Wait for V1 to drain the V1 replication log, checking for a zero count:
dse advrep --v1 edge rl-count
-
Delete V1 channels and disable V1 hub:
for f in `dse advrep --v1 edge list-conf|cut -f1 -d' '|sed 1,2d | sed s/_/-/g`; do dse advrep --v1 edge remove-conf --$f; done;
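The wait for the V1 replication log to drain, described in the migration steps above, is a polling loop. The following is a minimal sketch of that loop; get_count is a hypothetical callable that returns the current count, for example by running the rl-count command and parsing its output, whose exact format is not shown in this guide.

```python
import time


def wait_until_drained(get_count, poll_seconds=30.0, max_polls=120, sleep=time.sleep):
    """Poll get_count() until it returns 0; return the number of polls used.

    get_count is a caller-supplied (hypothetical) function returning the
    current V1 replication log count as an integer.
    """
    for attempt in range(1, max_polls + 1):
        if get_count() == 0:
            return attempt
        if attempt < max_polls:
            sleep(poll_seconds)
    raise TimeoutError(f"replication log not drained after {max_polls} polls")


# Illustration with canned counts instead of a real cluster.
counts = iter([5, 2, 0])
polls = wait_until_drained(lambda: next(counts), sleep=lambda seconds: None)
print(polls)  # 3
```

Injecting the counter and the sleep function keeps the loop testable without a cluster; the poll interval and the limit of 120 polls are arbitrary choices, not values from the guide.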
Warning Messages During and after Upgrade
You can ignore some log messages that occur during and after an upgrade:
-
When upgrading nodes with DSE Advanced Replication, there might be some WriteTimeoutExceptions during a rolling upgrade while mixed versions of nodes exist. Some write consistency limitations apply while mixed versions of nodes exist. The WriteTimeout issue is resolved after all nodes are upgraded.
-
Some gremlin_server properties in earlier versions of DSE are no longer required. If properties exist in the dse.yaml file after upgrading, logs display warnings similar to:
WARN [main] 2017-08-31 12:25:30,523 GREMLIN DseWebSocketChannelizer.java:149 - Configuration for the org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV1d0 serializer in dse.yaml overrides the DSE default - typically it is best to allow DSE to configure these.
You can ignore these warnings or modify dse.yaml so that only the required gremlin server properties are present.
Locking DSE Package Versions
If you have upgraded a DSE package installation, you can prevent future unintended upgrades.
RHEL yum installations
To hold a package at the current version:
-
Install yum-versionlock (one-time operation):
sudo yum install yum-versionlock
-
Lock the current DSE version:
sudo yum versionlock dse-*
To clear the version lock and enable upgrades:
sudo yum versionlock clear
For details, see the versionlock command.
Debian apt-get installations
To hold a package at the current version:
sudo apt-mark hold dse-*
To remove the version hold:
sudo apt-mark unhold dse-*
For details, see the apt-mark command.