Upgrading DataStax Enterprise 5.0 to 6.7 or 6.8
Instructions for upgrading from DSE 5.0 to 6.7 or 6.8.
jvm.options
The location of the jvm.options file depends on the type of installation:
Package installations | /etc/dse/cassandra/jvm.options |
Tarball installations | installation_location/resources/cassandra/conf/jvm.options |
logback.xml
The location of the logback.xml file depends on the type of installation:
Package installations | /etc/dse/cassandra/logback.xml |
Tarball installations | installation_location/resources/cassandra/conf/logback.xml |
Upgrade order
Upgrade nodes in this order:
- In multiple datacenter clusters, upgrade every node in one datacenter before upgrading another datacenter.
- Upgrade the seed nodes within a datacenter first.
- Upgrade nodes in this order:
  - DSE Analytics datacenters
  - Transactional/DSE Graph datacenters
  - DSE Search datacenters
OpsCenter version | DSE version |
---|---|
6.8 | 6.8, 6.7, 6.0, 5.1 |
6.7 | 6.7, 6.0, 5.1 |
6.5 | 6.0, 5.1, 5.0 (EOL) |
6.1 | 5.1, 5.0 (EOL), 4.8 (EOSL) |
6.0 | 5.0 (EOL), 4.8 (EOSL), 4.7 (EOSL) |
DataStax Enterprise and Apache Cassandra™ configuration files
Configuration file | Installer-Services and package installations | Installer-No Services and tarball installations |
---|---|---|
DataStax Enterprise configuration files | ||
dse | /etc/default/dse (systemd) or /etc/init.d/ (SystemV) | N/A. Node type is set via command line flags. |
byoh-env.sh | /etc/dse/byoh-env.sh | install_location/bin/byoh-env.sh |
dse.yaml | /etc/dse/dse.yaml | install_location/resources/dse/conf/dse.yaml |
logback.xml | /etc/dse/cassandra/logback.xml | install_location/resources/cassandra/conf/logback.xml |
spark-env.sh | /etc/dse/spark/spark-env.sh | install_location/resources/spark/conf/spark-env.sh |
spark-defaults.conf | /etc/dse/spark/spark-defaults.conf | install_location/resources/spark/conf/spark-defaults.conf |
Cassandra configuration files | ||
cassandra.yaml | /etc/dse/cassandra/cassandra.yaml | install_location/resources/cassandra/conf/cassandra.yaml |
cassandra.in.sh | /usr/share/cassandra/cassandra.in.sh | install_location/bin/cassandra.in.sh |
cassandra-env.sh | /etc/dse/cassandra/cassandra-env.sh | install_location/conf/cassandra-env.sh |
cassandra-rackdc.properties | /etc/dse/cassandra/cassandra-rackdc.properties | install_location/conf/cassandra-rackdc.properties |
cassandra-topology.properties | /etc/dse/cassandra/cassandra-topology.properties | install_location/conf/cassandra-topology.properties |
jmxremote.password | /etc/cassandra/jmxremote.password | install_location/conf/jmxremote.password |
Tomcat server configuration file | ||
server.xml | /etc/dse/tomcat/conf/server.xml | install_location/resources/tomcat/conf/server.xml |
server.xml
The default location of the Tomcat server.xml file depends on the installation type:
Package installations | /etc/dse/tomcat/conf/server.xml |
Tarball installations | installation_location/resources/tomcat/conf/server.xml |
DataStax driver changes
DataStax drivers come in two types:
- DataStax drivers for DataStax Enterprise — for use by DSE 4.8 and later
- DataStax drivers for Apache Cassandra™ — for use by Apache Cassandra™ and DSE 4.7 and earlier
dse-env.sh
The default location of the dse-env.sh file depends on the type of installation:
Package installations | /etc/dse/dse-env.sh |
Tarball installations | installation_location/bin/dse-env.sh |
dse.yaml
The location of the dse.yaml file depends on the type of installation:
Package installations | /etc/dse/dse.yaml |
Tarball installations | installation_location/resources/dse/conf/dse.yaml |
cassandra.yaml
The location of the cassandra.yaml file depends on the type of installation:
Package installations | /etc/dse/cassandra/cassandra.yaml |
Tarball installations | installation_location/resources/cassandra/conf/cassandra.yaml |
The upgrade process for DataStax Enterprise provides minimal downtime (ideally zero). During this process, upgrade and restart one node at a time while other nodes continue to operate online. With a few exceptions, the cluster continues to work as though it were on the earlier version of DataStax Enterprise until all of the nodes in the cluster are upgraded.
Due to a known issue in DSE 6.8 releases earlier than 6.8.9, set zerocopy_streaming_enabled=false in cassandra.yaml and perform a rolling restart, AND/OR run upgradesstables on all nodes in your cluster before adding new nodes, running repair, or restoring from backups. This bug will be addressed in DSE 6.8.9.
Back up your existing installation
Upgrade SSTables
Version specific notes
When facets are requested with facet.limit=-1, the maximum facet limit value is 20,000, as set by solr.max.facet.limit.size. While the facet limit size can be overridden using -Dsolr.max.facet.limit.size in jvm.options, it is not recommended.
The timeAllowed parameter is enabled by default to prevent long-running shard queries, such as complex facets and Boolean queries, from using system resources after they have timed out from the DSE Search coordinator. For details, see Limiting queries by time.
Upgrade restrictions and limitations
Restrictions and limitations apply while a cluster is in a partially upgraded state. The cluster continues to work as though it were on the earlier version of DataStax Enterprise until all of the nodes in the cluster are upgraded.
General restrictions
- Do not enable new features.
- Ensure OpsCenter compatibility. See the compatibility table.
- Do not run nodetool repair.
- Stop the OpsCenter Repair Service if enabled: 6.5 | 6.7 | 6.8.
- During the upgrade, do not bootstrap new nodes or decommission existing nodes.
- Do not issue TRUNCATE or DDL related queries during the upgrade process.
- Do not alter schemas for any workloads.
- Complete the cluster-wide upgrade before the expiration of gc_grace_seconds (approximately 13 days) to ensure that any repairs complete successfully. A query sketch for checking gc_grace_seconds follows this list.
- If the DSE Performance Service was disabled before the upgrade, do not enable it during the upgrade. See DSE Performance Service: 6.7 | 6.0 | 5.1 | 5.0 | OpsCenter 6.8 | OpsCenter 6.7 | OpsCenter 6.5.
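As a quick check before starting, you can list gc_grace_seconds for the tables in a keyspace and confirm that the planned upgrade window fits inside it. This is a minimal sketch; host_ip and keyspace_name are placeholders for your environment:
cqlsh host_ip --execute "SELECT table_name, gc_grace_seconds FROM system_schema.tables WHERE keyspace_name = 'keyspace_name';"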
Restrictions for nodes using security
- Do not change security credentials or permissions until the upgrade is complete on all nodes.
- If you are not already using Kerberos, do not set up Kerberos authentication before upgrading. First upgrade the cluster, and then set up Kerberos.
Restrictions for DSE Analytics nodes
DSE 6.7 uses Spark 2.2 and DSE 6.8 uses Spark 2.4, both of which differ from the Spark version in DSE 5.0. Do not run Spark analytics jobs on a datacenter until all nodes are upgraded and applications have been recompiled for Spark 2.2 or 2.4, depending on your upgrade version.
Restrictions for DSE Advanced Replication nodes
Upgrades are supported only for DSE Advanced Replication V2.
Restrictions for DSE Search nodes
- Do not update DSE Search configurations or schemas.
- Do not reindex DSE Search nodes during upgrade.
- DSE 5.1 and 6.x use a different Lucene codec than DSE 5.0 for new search cores. Segments written with this new codec cannot be read by earlier versions of DSE.
Driver version impacts
Be sure to check driver compatibility. Depending on the driver version, you might need to recompile your client application code. See DataStax driver changes.
- Protocol version: Set the protocol version explicitly in your application at start up. Switch the Java driver to the new protocol version only after the upgrade is complete on all nodes in the cluster. A quick check sketch follows this list.
- Initial contact points: Ensure that the list of initial contact points contains only hosts with the oldest DSE version or protocol version. For example, the initial contact points contain only protocol version 2.
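To see which native protocol version a node currently negotiates, you can connect with cqlsh and check the banner. This is a minimal sketch (host_ip is a placeholder); if your cqlsh version does not accept special commands through --execute, run SHOW VERSION from an interactive cqlsh session instead:
cqlsh host_ip --execute "SHOW VERSION"
The output lists the cqlsh, DSE/Cassandra, CQL spec, and native protocol versions reported by that node.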
Advanced preparation for upgrading DSE Search and SearchAnalytics nodes
Before continuing, complete all the advanced preparation steps on DSE Search and SearchAnalytics nodes while DSE 5.0 is still running.
Make the following changes as required:
- Change HTTP API queries to CQL queries:
  - Delete-by-id is removed; use CQL DELETE by primary key instead.
  - Delete-by-query no longer supports wildcards; use CQL TRUNCATE instead.
- If any Solr core was created on DSE 4.6 or earlier and was never reindexed after being upgraded to DSE 4.7 or later, you must reindex on DSE 5.0 before upgrading to DSE 6.x:
dsetool reload_core keyspace_name.table_name schema=filepath solrconfig=filepath reindex=true deleteAll=true distributed=false
Important: You must reindex all nodes before beginning the upgrade.
- If you are using Apache Solr SolrJ, the minimum required Solr version is 6.0.0. To find the current Solr version:
installation_directory/bin/solr status
For information on upgrading Apache Solr, see Upgrading Solr.
- For SpatialRecursivePrefixTreeFieldType (RPT) in search schemas, you must adjust your queries for these changes:
  - IsDisjointTo is no longer supported in queries on SpatialRecursivePrefixTreeFieldType. Replace IsDisjointTo with a NOT Intersects query. For example:
foo:0,0 TO 1000,1000 AND -"Intersects(POLYGON((338 211, 338 305, 404 305, 404 211, 338 211)))")
  - The ENVELOPE syntax is now required for WKT-style queries against SpatialRecursivePrefixTreeFieldType fields. You must specify ENVELOPE(10, 15, 15, 10), where queries on earlier releases could specify 10 10 15 15.
- The Circle syntax is no longer a part of Well-Known Text (WKT); therefore, Spatial Search queries such as:
Intersects(Circle(10 10 d=2))
must be rewritten as:
Intersects(BUFFER(POINT(10 10), 2))
- Edit the solrconfig.xml file and make these changes, as needed:
- Remove these unsupported Solr requestHandlers:
- XmlUpdateRequestHandler
- BinaryUpdateRequestHandler
- CSVRequestHandler
- JsonUpdateRequestHandler
- DataImportHandler
For example:
<requestHandler name="/dataimport" class="solr.DataImportHandler"/>
or
<requestHandler name="/update" class="solr.XmlUpdateRequestHandler"/>
- Change the directoryFactory from:
<directoryFactory name="DirectoryFactory" class="${solr.directoryFactory:solr.StandardDirectoryFactory}"/>
to:
<directoryFactory name="DirectoryFactory" class="solr.StandardDirectoryFactory"/>
- <unlockOnStartup> is unsupported.
- Change the updateLog from:
<updateLog class="solr.FSUpdateLog" force="false">
to:
<updateLog force="false">
Tip: For more information on solrconfig.xml, see https://lucene.apache.org/solr/guide/7_6/configuring-solrconfig-xml.html.
- Upgrading DSE Search nodes requires replacing unsupported Solr types with supported types. Note: Special handling is also required for BCDStrField, addressed in step 10. Sorting limitations apply to mixed-version clusters: some of the removed Solr types, due to the way they marshal sort values during distributed queries (combined with the way the suggested new types unmarshal sort values), cannot be sorted on during rolling upgrades when some nodes use an unsupported type and other nodes use the suggested new type. The following type transitions are problematic:
Removed Solr field types | Supported Solr field types |
---|---|
ByteField | TrieIntField |
DateField | TrieDateField |
BCDIntField | TrieIntField |
BCDLongField | TrieLongField |
Two options are available:
  - Avoid sorting on removed Solr field types until the upgrade is complete for all nodes in the datacenter being queried. Tip: When using two search datacenters, isolate queries to a single datacenter and then change the schema and reindex the other datacenter. Then isolate queries to the newly reindexed datacenter while you change the schema and upgrade the first datacenter.
  - If you are using BCDIntField or BCDLongField, update the schema to replace BCDIntField and BCDLongField with types that are sort-compatible with the supported Solr types TrieIntField and TrieLongField. Change the schema in a distributed fashion, and do not reindex. After the schema is updated on all nodes, go on to step 9.
Removed Solr field types | Interim sort-compatible supported Solr field types |
---|---|
BCDIntField | SortableIntField |
BCDLongField | SortableLongField |
- Update the schema and configuration for the Solr field types that are removed from Solr 5.5 and later:
  - Update the schema to replace unsupported Solr field types with supported Solr field types (a sketch for reviewing the current schema and config follows this list):
Removed Solr field types | Supported Solr field types |
---|---|
ByteField | TrieIntField |
DateField | TrieDateField |
DoubleField | TrieDoubleField |
FloatField | TrieFloatField |
IntField | TrieIntField |
LongField | TrieLongField |
ShortField | TrieIntField |
SortableDoubleField | TrieDoubleField |
SortableFloatField | TrieFloatField |
SortableIntField | TrieIntField |
SortableLongField | TrieLongField |
BCDIntField | TrieIntField |
BCDLongField | TrieLongField |
BCDStrField (see the BCDStrField step below if used) | TrieIntField |
  - If you are using type mapping version 0, or you do not specify a type mapper, verify or update the solrconfig.xml to use dseTypeMappingVersion 1:
<dseTypeMappingVersion>1</dseTypeMappingVersion>
If the Solr core is backed by a CQL table and the type mapping is unspecified, use type mapping version 2. Tip: For more information on solrconfig.xml, see https://lucene.apache.org/solr/guide/7_6/configuring-solrconfig-xml.html.
  - Reload the core:
dsetool reload_core keyspace_name.table_name schema=filepath solrconfig=filepath
If you were using the unsupported data types, do a full reindex node-by-node:
dsetool reload_core keyspace_name.table_name schema=filepath solrconfig=filepath reindex=true deleteAll=true distributed=false
Note: In DSE 5.1 and later, auto-generated schemas use data type mapper 2.
- If using BCDStrField: In DSE 5.0 and earlier, DSE mapped Cassandra text columns to BCDStrField. The deprecated BCDStrField is removed. The recommended strategy is to upgrade the data type to TrieIntField. However, DSE cannot map text directly to TrieIntField. If you are using BCDStrField, you must complete one of these options before the upgrade:
  - If BCDStrField is no longer used, remove the BCDStrField field from the Solr schema. Reindexing is not required.
  - If you want to index the field as a TrieIntField, and a full reindex is acceptable, change the underlying database column to use the type int.
  - If you want to keep the database column as text and you still want to do simple matching queries on the indexed field, switch from BCDStrField to StrField in the schema. Reindexing should not be required, but the field will no longer be appropriate for numeric range queries or sorting because StrField uses a lexicographic order, not a numeric one.
  - Not recommended: If you want to keep the database column as text and still want to perform numeric range queries and sorts on the former BCDStrField, but would rather change the application than perform a full reindex:
    - Change the field to StrField in the Solr schema with indexed=false.
    - Add a new copy field with the type TrieIntField that has its values supplied by the original BCDStrField.
After you make these schema changes, do a rolling, node-by-node reload_core:
dsetool reload_core keyspace_name.table_name schema=filepath solrconfig=filepath reindex=true deleteAll=true distributed=false
Note: If you have two datacenters and upgrade them one at a time, reload the core with distributed=true and deleteAll=true.
- Tune the schema before you upgrade. After the upgrade, all field definitions in the schema are validated and must be DSE Search compatible, even if the fields are not indexed, have docValues applied, or are used for copy-field source. The default behavior of automatic resource generation includes all columns. To improve performance, prevent unneeded fields from being loaded from the database: include only the required fields in the schema by removing or commenting out unused fields.
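Before editing schemas and solrconfig.xml, it can help to export the current resources for each search core and scan them for the removed field types and request handlers. This is a minimal sketch that assumes the dsetool get_core_schema and get_core_config subcommands are available in your DSE 5.0 installation; keyspace_name.table_name is a placeholder:
dsetool get_core_schema keyspace_name.table_name > schema-current.xml
dsetool get_core_config keyspace_name.table_name > solrconfig-current.xml
grep -E 'solr\.(Byte|Date|Double|Float|Int|Long|Short)Field|solr\.Sortable|solr\.BCD' schema-current.xml
grep -E 'XmlUpdateRequestHandler|BinaryUpdateRequestHandler|CSVRequestHandler|JsonUpdateRequestHandler|DataImportHandler|unlockOnStartup|FSUpdateLog' solrconfig-current.xml
Any lines returned by the grep commands point at declarations that still need to be replaced or removed.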
Advanced preparation for upgrading DSE Graph nodes with search indexes
These steps apply to graph nodes that have search indexes. Before continuing, complete these advanced preparation steps while DSE 5.0 is still running.
Edit solrconfig.xml and make these changes, as needed:
- Remove these unsupported Solr requestHandlers:
- XmlUpdateRequestHandler
- BinaryUpdateRequestHandler
- CSVRequestHandler
- JsonUpdateRequestHandler
- DataImportHandler
For example:
<requestHandler name="/dataimport" class="solr.DataImportHandler"/>
or
<requestHandler name="/update" class="solr.XmlUpdateRequestHandler"/>
- Change the directoryFactory from:
<directoryFactory name="DirectoryFactory" class="${solr.directoryFactory:solr.StandardDirectoryFactory}"/>
to:
<directoryFactory name="DirectoryFactory" class="solr.StandardDirectoryFactory"/>
- Remove <unlockOnStartup>.
- Reload the core:
dsetool reload_core keyspace_name.table_name reindex=false
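To confirm that the reload picked up the edited solrconfig.xml, you can re-export the active configuration and verify that the removed handlers and settings are gone. A minimal sketch, assuming the dsetool get_core_config subcommand is available; keyspace_name.table_name is a placeholder:
dsetool get_core_config keyspace_name.table_name | grep -E 'DataImportHandler|XmlUpdateRequestHandler|unlockOnStartup|FSUpdateLog' || echo "no unsupported settings found"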
Advanced preparation for upgrading DSE Analytics nodes
Before upgrading DSE Analytics nodes:
- DSE versions earlier than 5.1 use an older version of Spark (1.6), and applications written using that version may not be compatible with Spark 2.2. You must recompile all DSE 5.0 Scala Spark applications against Scala 2.11 and use only Scala 2.11 third-party libraries. Important: Changing the dse-spark-dependencies in your build files is not sufficient to change the compilation target. See the example projects for how to set up your build files.
- Cassandra File System (CFS) is removed. Remove the cfs and cfs_archive keyspaces before upgrading (a quick check sketch follows this list). See the From CFS to DSEFS blog post and the Copying data from CFS to DSEFS documentation for more information.
DROP KEYSPACE cfs;
DROP KEYSPACE cfs_archive;
- If DSEFS is enabled, copy the CFS hivemetastore directory to DSEFS:
DSE_HOME/bin/dse hadoop fs -cp cfs://node_ip_address/user/spark/warehouse/ dsefs://node_ip_address/user/spark/warehouse/
- Spark applications should use dse:// URLs instead of spark://spark_master_IP:Spark_RPC_port_number URLs, as described in Specifying Spark URLs (6.7) | Specifying Spark URLs (6.8).
- Modify calls to setMaster and setAppName.
For example, the following code works in DSE 5.0 but will not work in DSE 5.1 or later:
val conf = new SparkConf(true)
  .setMaster("spark://192.168.123.10:7077")
  .setAppName("cassandra-demo")
  .set("cassandra.connection.host", "192.168.123.10") // initial contact
  .set("cassandra.username", "cassandra")
  .set("cassandra.password", "cassandra")
val sc = new SparkContext(conf)
To connect, modify the call to setMaster to use a dse:// URL:
val conf = new SparkConf(true)
  .setMaster("dse://192.168.123.10:7077")
  .setAppName("cassandra-demo")
  .set("cassandra.connection.host", "192.168.123.10") // initial contact
  .set("cassandra.username", "cassandra")
  .set("cassandra.password", "cassandra")
val sc = new SparkContext(conf)
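As a quick check before dropping the CFS keyspaces, you can confirm whether they still exist. A minimal sketch; host_ip is a placeholder:
cqlsh host_ip --execute "SELECT keyspace_name FROM system_schema.keyspaces;" | grep -w -E 'cfs|cfs_archive'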
Preparing to upgrade
Follow these steps to prepare each node for the upgrade:
- If you are upgrading from DSE 5.0 to DSE 6.8, upgrade to DSE 5.1 first and then follow the instructions in this section. See Upgrading DataStax Enterprise 5.0 to 5.1.
- Upgrade to the latest patch release on your current version. Fixes included in the latest patch release can simplify the upgrade process. Get the current DSE version:
bin/dse -v
current_dse_version
- Familiarize yourself with the changes and features in the
new release:
- DataStax Enterprise 6.7 release notes | 6.8 release notes.
- General upgrading advice for any version. Be sure to read 6.7 NEWS.txt | 6.8 NEWS.txt all the way back to your current version.
- DataStax Enterprise changes in 6.7 CHANGES.txt | 6.8 CHANGES.txt.
- DataStax driver changes.
- Before upgrading, be sure that each node has adequate free disk space. Determine the current DSE data disk space usage:
sudo du -sh /var/lib/cassandra/data/
3.9G /var/lib/cassandra/data/
Determine available disk space:
sudo df -hT /
Filesystem Type Size Used Avail Use% Mounted on
/dev/sda1 ext4 59G 16G 41G 28% /
Important: The required space depends on the compaction strategy. See Disk space.
- Replace ITriggers and custom interfaces.
All custom implementations, including the following interfaces, must be replaced with supported implementations when upgrading to DSE 6.x:
- The org.apache.cassandra.triggers.ITrigger interface was modified from augment to augmentNonBlocking for the non-blocking internal architecture. Updated trigger implementations must be provided on upgraded nodes. If unsure, drop all existing triggers before upgrading. To check for existing triggers:
SELECT * FROM system_schema.triggers;
To drop a trigger:
DROP TRIGGER trigger_name ON keyspace_name.table_name;
- The org.apache.cassandra.index.Index interface was modified to comply with the core storage engine changes. Updated implementations are required. If unsure, drop all existing custom secondary indexes before upgrading, except DSE Search indexes, which do not need to be replaced. To check for existing indexes:
SELECT * FROM system_schema.indexes;
To drop an index:
DROP INDEX index_name;
- The org.apache.cassandra.cql3.QueryHandler, org.apache.cassandra.db.commitlog.CommitLogReadHandler, and other extension points have been changed. See QueryHandlers.
Tip: For help, contact the DataStax Services team.
- Support for Thrift-compatible tables (COMPACT STORAGE) is dropped. Before upgrading, migrate all non-system tables that have COMPACT STORAGE to CQL table format:
cqlsh -e 'DESCRIBE FULL SCHEMA;' > schema_file
cat schema_file | while read -d $';\n' line ; do
  if echo "$line" | grep 'COMPACT STORAGE' 2>&1 > /dev/null ; then
    TBL="`echo $line | sed -e 's|^CREATE TABLE \([^ ]*\) .*$|\1|'`"
    if echo "$TBL" | egrep -v '^system' 2>&1 > /dev/null; then
      echo "ALTER TABLE $TBL DROP COMPACT STORAGE;" >> schema-drop-list
    fi
  fi
done
cqlsh -f schema-drop-list
Note: The script above dumps the complete DSE schema to schema_file, uses grep to find lines containing COMPACT STORAGE, and then writes only those table names to schema-drop-list along with the required ALTER TABLE commands. The schema-drop-list file is then read by cqlsh, which runs the ALTER TABLE commands contained therein.
Warning: DSE will not start if tables using COMPACT STORAGE are present.
- If audit logging is configured to use CassandraAuditWriter (6.7) | CassandraAuditWriter (6.8), run these CQL commands as superuser on DSE 5.0 nodes:
ALTER TABLE dse_audit.audit_log ADD authenticated text;
ALTER TABLE dse_audit.audit_log ADD consistency text;
- Ensure that the entire cluster has schema agreement:
nodetool describecluster
Cluster Information:
    Name: Test Cluster
    Snitch: com.datastax.bdp.snitch.DynamicEndpointSnitch
    DynamicEndPointSnitch: enabled
    Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
    Schema versions:
        0fffd971-b7a4-33ae-859d-8ca792cd2852: [10.116.138.23]
If there are any schema discrepancies, restart the nodes in question and rerun nodetool describecluster until there is only one schema version in the output.
- Upgrade the SSTables on each node to ensure that all SSTables are on the current version:
nodetool upgradesstables
Warning: Failure to upgrade SSTables when required results in a significant performance impact and increased disk usage.
Tip: Use the --jobs option to set the number of SSTables that upgrade simultaneously. The default setting is 2, which minimizes impact on the cluster. Set to 0 to use all available compaction threads. DataStax recommends running the upgradesstables command on one node at a time or, when using racks, one rack at a time.
If the SSTables are already on the current version, the command returns immediately and no action is taken.
- Ensure that keyspace replication factors are correct for your environment:
cqlsh --execute "DESCRIBE KEYSPACE keyspace-name;" | grep "replication"
CREATE KEYSPACE keyspace-name WITH replication = {'class': 'NetworkTopologyStrategy', 'replication_factor': '3'} AND durable_writes = true;
  - Check the keyspace replication factor for analytics keyspaces (6.7) | analytics keyspaces (6.8).
  - Check the keyspace replication factor for system_auth and dse_security (6.7) | system_auth and dse_security (6.8) keyspaces.
- Verify the Java runtime version and upgrade to the recommended version:
java -version
openjdk version "1.8.0_222"
OpenJDK Runtime Environment (build 1.8.0_222-8u222-b10-1ubuntu1~18.04.1-b10)
OpenJDK 64-Bit Server VM (build 25.222-b10, mixed mode)
  - Recommended: OpenJDK 8 (1.8.0_151 minimum). Note: The recommendation changed due to the end of public updates for Oracle JRE/JDK 8. See the Oracle Java SE Support Roadmap.
  - Supported: Oracle Java SE 8 (JRE or JDK) (1.8.0_151 minimum). Important: Although Oracle JRE/JDK 8 is supported, DataStax does more extensive testing on OpenJDK 8.
- Run nodetool repair (6.7) | nodetool repair (6.8) to ensure that data on each
replica is consistent with data on other
nodes:
nodetool repair -pr
- Install the libaio package for optimal performance. RHEL platforms:
sudo yum install libaio
Debian:
sudo apt-get install libaio1
- Back up any customized configuration files, since they may be overwritten with default values during installation of the new version (a minimal backup sketch follows this list). Tip: If you backed up your installation using the instructions in Backing up and restoring DSE, your original configuration files are included in the archive.
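One way to capture the configuration directories before upgrading is a simple archive. This is a minimal sketch for a package installation, assuming the default /etc/dse and /etc/default/dse locations; adjust the paths for tarball installations or non-default layouts:
sudo tar czf dse-config-backup-$(date +%Y%m%d).tar.gz /etc/dse /etc/default/dse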
Upgrade steps
- If you are upgrading from DSE 5.0 to DSE 6.8, upgrade to DSE 5.1 first and then follow the instructions in this section. See Upgrading DataStax Enterprise 5.0 to 5.1.
- Flush the commit log of the current installation:
nodetool drain
- DSE Analytics nodes only: Kill all Spark worker processes:
for pid in $(jps | grep Worker | awk '{print $1}'); do kill -9 $pid; done
- Stop the node:
  - Package installations:
sudo service dse stop
  - Tarball installations:
installation_dir/bin/dse cassandra-stop
- Use the appropriate method to install the new product version on a
supported platform:
- Package installer using YUM (6.7) | Package installer using YUM (6.8)
- Package installer using APT (6.7) | Package installer using APT (6.8)
- Binary tarball installer (6.7) | Binary tarball installer (6.8)
Warning: Install the new product version using the same installation type that is on the system; otherwise, problems might result.
- To configure the new version:
  - The upgrade installs a new server.xml for Tomcat 8. If your existing server.xml has custom connectors, migrate those connectors to the new server.xml before starting the upgraded nodes.
  - Compare changes in the new configuration files with the backup configuration files after the upgrade but before restarting, remove deprecated settings, and update any new settings if required. Warning: Do not simply replace new configuration files with the old ones. Rather, compare your old files to the new files and make any required changes. Tip: Use the DSE yaml_diff tool (6.7) | yaml_diff tool (6.8) to compare backup YAML files with the upgraded YAML files:
cd /usr/share/dse/tools/yamls
./yaml_diff path/to/yaml-file-old path/to/yaml-file-new
...
CHANGES
=========
authenticator:
 - AllowAllAuthenticator
 + com.datastax.bdp.cassandra.auth.DseAuthenticator
authorizer:
 - AllowAllAuthorizer
 + com.datastax.bdp.cassandra.auth.DseAuthorizer
roles_validity_in_ms:
 - 2000
 + 120000
...
cassandra.yaml changes
Table 1. RPC settings
Deprecated cassandra.yaml settings: rpc_address, rpc_broadcast_address
Replacement settings: native_transport_address, native_transport_broadcast_address
Table 2. Memtable settings
Deprecated cassandra.yaml settings: memtable_heap_space_in_mb, memtable_offheap_space_in_mb
Replacement setting: memtable_space_in_mb
Changed setting: memtable_allocation_type: offheap_objects
Table 3. User-defined function (UDF) settings
Deprecated cassandra.yaml settings: user_defined_function_warn_timeout, user_defined_function_fail_timeout
Replacement settings: user_defined_function_warn_micros: 500, user_defined_function_fail_micros: 10000, user_defined_function_warn_heap_mb: 200, user_defined_function_fail_heap_mb: 500, user_function_timeout_policy: die
Settings are in microseconds. The new timeouts are not equivalent to the deprecated settings.
Table 4. Internode encryption settings
Deprecated cassandra.yaml setting:
server_encryption_options:
  store_type: JKS
Replacement settings:
server_encryption_options:
  keystore_type: JKS
  truststore_type: JKS
Valid type options are JKS, JCEKS, PKCS11, or PKCS12 for keystore_type, and JKS, JCEKS, or PKCS12 for truststore_type.
Important: For security reasons, DSE 6.8 only allows the TLS encryption option protocol:
server_encryption_options:
  ...
  protocol: TLS
See https://www.oracle.com/technetwork/java/javase/8u31-relnotes-2389094.html#newft for details.
Table 5. Client-to-node encryption settings
Deprecated cassandra.yaml setting:
client_encryption_options:
  store_type: JKS
Replacement settings:
client_encryption_options:
  keystore_type: JKS
  truststore_type: JKS
Valid type options are JKS, JCEKS, PKCS11, or PKCS12 for keystore_type, and JKS, JCEKS, or PKCS12 for truststore_type.
Important: For security reasons, DSE 6.8 only allows the TLS encryption option protocol:
client_encryption_options:
  ...
  protocol: TLS
See https://www.oracle.com/technetwork/java/javase/8u31-relnotes-2389094.html#newft for details.
dse.yaml changes
Table 6. Shard transport settings
Deprecated dse.yaml settings:
shard_transport_options:
  type: html
  netty_server_port: 8984
  netty_server_acceptor_threads:
  netty_server_worker_threads:
  netty_client_worker_threads:
  netty_client_max_connections:
  netty_client_request_timeout:
  http_shard_client_conn_timeout: 0
  http_shard_client_socket_timeout: 0
Replacement settings:
shard_transport_options:
  netty_client_request_timeout: 60000
Remove any other options under shard_transport_options.
Table 7. DSE Search node changes
Deprecated dse.yaml settings. Remove these options: cql_solr_query_executor_threads, enable_back_pressure_adaptive_nrt_commit, max_solr_concurrency_per_core, solr_indexing_error_log_options
Warning: DSE 6.x will not start with those options present.
Table 8. DSE Analytics node changes
Changed dse.yaml settings: The dsefs_enabled: settings are commented out. To enable DSEFS, uncomment all dsefs_options: settings.
- When upgrading DSE to versions earlier than 5.1.16, 6.0.8, or 6.7.4 inclusive, if any tables are using DSE Tiered Storage, remove all txn_compaction log files from second-level tiers and lower. For example, given the following dse.yaml configuration, remove txn_compaction log files from the /mnt2 and /mnt3 directories:
tiered_storage_options:
  strategy1:
    tiers:
      - paths:
          - /mnt1
      - paths:
          - /mnt2
      - paths:
          - /mnt3
The following example removes the files using the find command:
find /mnt2 -name "*_txn_compaction_*.log" -type f -delete && find /mnt3 -name "*_txn_compaction_*.log" -type f -delete
Warning: Failure to complete this step may result in data loss.
- Remove any previously installed JTS JAR files from the CLASSPATHS in your DSE installation. JTS (Java Topology Suite) is distributed with DSE 6.7 and later.
- DSE Analytics nodes only: If your DSE 5.0 clusters had
any datacenters running in Analytics Hadoop mode and if the DseSimpleSnitch was used, you
must use one of these options for starting nodes in your cluster. Select the option that
works best for your environment:
- For nodes in the datacenters running in Analytics Hadoop mode, start those nodes in Spark mode (6.7) | Spark mode (6.8).
- Add the start-up parameter (6.7) | start-up parameter (6.8)
-Dcassandra.ignore_dc=true
for each node, then start in cassandra mode. This flag is required only once after upgrading. Subsequent restarts do not use this flag. You can leave the flag in the configuration file or remove it after the first restart of each node.
- Start the node:
  - Package installations (6.7) | Package installations (6.8):
sudo service dse start
  - Tarball installations (6.7) | Tarball installations (6.8):
installation_dir/bin/dse cassandra
- Verify that the upgraded datacenter names match the datacenter names in the keyspace schema definition:
  - Get the node's datacenter name:
nodetool status | grep "Datacenter"
Datacenter: datacenter-name
  - Verify that the node's datacenter name matches the datacenter name for a keyspace:
cqlsh --execute "DESCRIBE KEYSPACE keyspace-name;" | grep "replication"
CREATE KEYSPACE keyspace-name WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter-name': '3'};
- Review the logs for warnings, errors, and exceptions:
grep -w 'WARNING\|ERROR\|exception' /var/log/cassandra/*.log
Warnings, errors, and exceptions are frequently found in the logs when starting an upgraded node. Some of these log entries are informational to help you execute specific upgrade-related steps. If you find unexpected warnings, errors, or exceptions, contact DataStax Support. Tip: Non-standard log locations are configured in dse-env.sh.
- Repeat the upgrade process on each node in the cluster, following the recommended order.
- After the entire cluster upgrade is complete, upgrade the SSTables on one node at a time or, when using racks, one rack at a time (a rolling sketch follows this list). Warning: Failure to upgrade SSTables when required results in a significant performance impact, increased disk usage, and possible data loss. Upgrading is not complete until the SSTables are upgraded.
nodetool upgradesstables
Tip: Use the --jobs option to set the number of SSTables that upgrade simultaneously. The default setting is 2, which minimizes impact on the cluster. Set to 0 to use all available compaction threads. DataStax recommends running the upgradesstables command on one node at a time or, when using racks, one rack at a time.
Important: You can run the upgradesstables command before all the nodes are upgraded as long as you run the command on only one node at a time or, when using racks, one rack at a time. Running upgradesstables on too many nodes at once will degrade performance.
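As an illustration of running the command node by node, the following minimal shell sketch assumes password-less SSH to each node and a hosts.txt file listing one node address per line; substitute whatever orchestration you normally use:
while read -r host; do
  echo "Upgrading SSTables on $host"
  ssh "$host" "nodetool upgradesstables"
done < hosts.txt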
General post upgrade steps
After all nodes are upgraded:
- If you use the OpsCenter Repair Service, turn on the Repair Service (6.7) | turn on the Repair Service (6.8).
- If you encounter serialization-header errors, stop the node and repair them using the sstablescrub -e option:
sstablescrub -e fix-only keyspace table
For more details on serialization-header errors and repairs, see DSE 5.0 SSTables with UDTs corrupted after upgrading to DSE 5.1, 6.0, or 6.7.
- Drop the following legacy tables, if they exist: system_auth.users,
system_auth.credentials, and
system_auth.permissions:
DROP TABLE IF EXISTS system_auth.users;
DROP TABLE IF EXISTS system_auth.credentials;
DROP TABLE IF EXISTS system_auth.permissions;
- Review your security configuration. To use security, enable and
configure DSE Unified Authentication 6.7 | DSE Unified Authentication 6.8.
In cassandra.yaml, the default authenticator is DseAuthenticator and the default authorizer is DseAuthorizer. Other authenticators and authorizers are no longer supported. Security is disabled in dse.yaml by default.
- TimeWindowCompactionStrategy (TWCS (6.7) | TWCS (6.8)) is set only on new dse_perf and
dse_audit_log tables. Manually change dse_perf and dse_audit_log tables that were created
in earlier releases to use TWCS. For
example:
ALTER TABLE dse_perf.read_latency_histograms WITH COMPACTION={'class':'TimeWindowCompactionStrategy'};
ALTER TABLE dse_audit_log.audit_log WITH COMPACTION={'class':'TimeWindowCompactionStrategy'};
- DSE 6.7 introduces, and enables by default, the DSE Metrics Collector, a diagnostics information aggregator used to help facilitate DSE problem resolution. For more information on the DSE Metrics Collector, or to disable metrics collection, see DataStax Enterprise Metrics Collector (6.7) | DataStax Enterprise Metrics Collector (6.8).
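If you decide to disable collection, the dsetool insights_config subcommand can be used on each node; this is a minimal sketch, and the exact options and mode values are described in the Metrics Collector documentation linked above:
dsetool insights_config --show_config
dsetool insights_config --mode DISABLED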
Post upgrade steps for DSE Analytics nodes
For DSE Analytics nodes:
- Spark Jobserver uses DSE custom version 8.0.4.45. Ensure that applications use the compatible Spark Jobserver API from the DataStax repository.
- If you are using Spark SQL tables, migrate them to the new
Hive metastore
format:
dse client-tool spark metastore migrate --from 5.0.0 --to 6.7.0
Post upgrade steps for DSEFS enabled nodes
A new schema is available for DSEFS.
If you have no data in DSEFS or if you are using DSEFS only for temporary data, follow these steps to use the new schema:
- Stop the node:
  - Package installations:
sudo service dse stop
  - Tarball installations:
installation_dir/bin/dse cassandra-stop
- Clear the DSEFS data directories on each node. For example, if the dsefs_options section of dse.yaml has data_directories configured as:
dsefs_options:
  ...
  data_directories:
    - dir: /var/lib/dsefs/data
this command removes the directories:
rm -r /var/lib/dsefs/data/*
- In the dsefs_options section of dse.yaml, change the keyspace_name parameter to a different name:
##########################
# DSE File System options
dsefs_options:
  ...
  keyspace_name: new_keyspace_name
- Start the node:
  - Package installations (6.7) | Package installations (6.8):
sudo service dse start
  - Tarball installations (6.7) | Tarball installations (6.8):
installation_dir/bin/dse cassandra
- If you backed up existing DSEFS data before the upgrade, copy the data back into DSEFS from local storage:
dse hadoop fs -cp /local_backup_location/* /dsefs_data_directory/
- OPTIONAL: Drop the old dsefs keyspace:
DROP KEYSPACE dsefs;
Post upgrade steps for DSE Search nodes
For DSE Search nodes:
- The appender SolrValidationErrorAppender and the logger SolrValidationErrorLogger are no longer used and may safely be removed from logback.xml.
- In contrast to earlier versions, DataStax recommends accepting the new default value of 1024 for back_pressure_threshold_per_core (6.7) | back_pressure_threshold_per_core (6.8) in dse.yaml. See Configuring and tuning indexing performance (6.7) | Configuring and tuning indexing performance (6.8).
- If SpatialRecursivePrefixTreeFieldType (RPT) is used in the search schema, replace the units field type attribute with a suitable distanceUnits value (degrees, kilometers, or miles), and then verify that spatial queries behave as expected.
- If you are using HTTP API writes with JSON documents (deprecated), a known issue may cause the auto-generated solrconfig.xml to have an invalid requestHandler for JSON core creations. If necessary, change the auto-generated solrconfig.xml from:
<requestHandler name="/update/json" class="solr.UpdateUpdateRequestHandler" startup="lazy"/>
to:
<requestHandler name="/update/json" class="solr.UpdateRequestHandler" startup="lazy"/>
Tip: For more information on solrconfig.xml, see https://lucene.apache.org/solr/guide/7_6/configuring-solrconfig-xml.html.
- Do a full reindex of all encrypted search indexes on
Tip: For more information on solrconfig.xml, see https://lucene.apache.org/solr/guide/7_6/configuring-solrconfig-xml.html. - Do a full reindex of all encrypted search indexes on
each node in your
cluster:
dsetool reload_core keyspace_name.table_name distributed=false reindex=true deleteAll=true
Important: Plan sufficient time after the upgrade is complete to reindex withdeleteAll=true
on all nodes.
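Reindexing can take a while; one way to watch per-core progress is dsetool core_indexing_status. A minimal sketch, assuming the watch utility is installed; keyspace_name.table_name is a placeholder:
watch -n 30 dsetool core_indexing_status keyspace_name.table_name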
Warning messages during and after upgrade
You can ignore some log messages that occur during and after an upgrade:
- When upgrading nodes with DSE Advanced Replication, there might be some WriteTimeoutExceptions during a rolling upgrade while mixed versions of nodes exist. Some write consistency limitations apply while mixed versions of nodes exist. The WriteTimeout issue is resolved after all nodes are upgraded.
- Some gremlin_server properties in earlier versions of DSE are no longer required. If
properties exist in the dse.yaml file after upgrading,
logs display warnings similar
to:
WARN [main] 2017-08-31 12:25:30,523 GREMLIN DseWebSocketChannelizer.java:149 - Configuration for the org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV1d0 serializer in dse.yaml overrides the DSE default - typically it is best to allow DSE to configure these.
You can ignore these warnings or modify dse.yaml so that only the required gremlin server properties are present.
Locking DSE package versions
If you have upgraded a DSE package installation, you can prevent future unintended upgrades.
RHEL yum installations
- Install yum-versionlock (one-time operation):
sudo yum install yum-versionlock
- Lock the current DSE version:
sudo yum versionlock dse-*
- To clear the version lock later:
sudo yum versionlock clear
For details on the versionlock command, see http://man7.org/linux/man-pages/man1/yum-versionlock.1.html.
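To verify which packages are locked, yum-versionlock provides a list subcommand; a minimal check:
sudo yum versionlock list | grep dse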
Debian apt-get installations
- Hold the DSE packages at the current version:
sudo apt-mark hold dse-*
- To remove the hold later:
sudo apt-mark unhold dse-*
For details on the apt-mark command, see http://manpages.ubuntu.com/manpages/bionic/man8/apt-mark.8.html.
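To confirm the hold is in place, apt-mark can list held packages; a minimal check:
apt-mark showhold | grep dse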