Migrate DataStax Enterprise (DSE) 5.1 to Hyper-Converged Database (HCD) 1.1
You can migrate from DSE 5.1 to HCD 1.1 using Zero Downtime Migration (ZDM) or an in-place upgrade. This guide explains the steps to migrate using an in-place upgrade.
| DataStax strongly recommends ZDM tools to migrate from DSE 5.1 to HCD 1.1. Only perform an in-place upgrade if you cannot use ZDM. | 
Migration warnings
This section contains important information you must understand before beginning the migration process. To improve your chances of a successful, error-free migration, review these limitations and considerations, taking action or planning ahead as needed.
DSE advanced workloads will cause data loss and failures
| HCD doesn’t support DSE advanced workloads. Data loss and application failures will occur if you migrate without addressing advanced workloads. | 
HCD 1.1 does not support:
- 
DSE Search 
- 
DSE Analytics 
- 
DSE Graph 
Before migrating, you must do the following:
- 
Identify all applications using these workloads. 
- 
Plan alternative solutions, such as an external Spark cluster. 
- 
Migrate data to supported formats within your DSE cluster. 
- 
Update application code to use new APIs. 
Migrate COMPACT STORAGE tables
DSE 5.1 may contain tables using the deprecated COMPACT STORAGE format, which is not supported in HCD 1.1.
Before migrating, you must migrate all non-system tables from COMPACT STORAGE to the standard CQL table format to prevent startup failures.
Use storage port 7000 for online upgrades
Online upgrades require the default storage port 7000.
A cluster that uses non-default storage_port values must use ZDM.
Verify your storage port configuration before you begin the upgrade process.
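One way to check the storage port is to read it from the node's cassandra.yaml. The following is a minimal sketch, not an official tool: the function names storage_port_of and uses_default_storage_port are hypothetical, and the sketch assumes that a missing or commented-out storage_port key means the default of 7000 is in effect.

```shell
# Print the storage_port configured in a cassandra.yaml file.
# Assumption: when the key is absent or commented out, the default (7000) applies.
storage_port_of() {
  local yaml="$1" port
  # Anchor at line start so ssl_storage_port and commented lines are ignored.
  port=$(sed -n 's/^storage_port:[[:space:]]*\([0-9][0-9]*\).*$/\1/p' "$yaml" | head -n 1)
  echo "${port:-7000}"
}

# Succeed (exit status 0) only when the file uses the default port 7000.
uses_default_storage_port() {
  [ "$(storage_port_of "$1")" = "7000" ]
}
```

For example, `uses_default_storage_port /etc/dse/cassandra/cassandra.yaml || echo "non-default storage_port: use ZDM"` flags a node that cannot use an online upgrade. Adjust the path for your installation type.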
Avoid certain operations while nodes are partially upgraded
Don’t change credentials, permissions, or any other security settings.
Complete the cluster-wide upgrade before the expiration of gc_grace_seconds (approximately 13 days) to ensure any repairs complete successfully.
When you upgrade one or more nodes in a cluster, but before all nodes in the cluster run the new version, the cluster enters a partially upgraded state. In this state, the cluster continues to operate as though it runs on the earlier, pre-upgraded version.
Certain restrictions and limitations apply to a cluster in a partially upgraded state. During upgrade, and while a cluster is in a partially upgraded state:
- 
Don’t enable new features. 
- 
Don’t run repairs. Before beginning the upgrade process, you should disable all automated/scheduled repairs. This includes disabling tools like Reaper, crontab, and any scripts that call nodetool repair.
- 
Don’t repair SSTables during the upgrade process. However, DataStax recommends that you run regular repairs before starting the upgrade. 
- 
Don’t add/bootstrap new nodes to the cluster or decommission any existing nodes. 
- 
Do not alter schemas for any workloads. Propagation of schema changes between mixed-version nodes can have unexpected results. Take action to prevent schema changes from occurring during the upgrade process. It’s normal for nodes on different versions to show schema disagreements during the upgrade. 
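To help find scheduled repairs before you start, you can scan crontab entries for calls to nodetool repair. This is a minimal sketch: the function name find_scheduled_repairs is hypothetical, and it only inspects crontab text. It does not detect repairs scheduled through Reaper or custom scripts, which you must check separately.

```shell
# Read crontab text on stdin and print active (non-comment) entries
# that invoke "nodetool ... repair". Prints nothing when none are found.
find_scheduled_repairs() {
  grep -v '^[[:space:]]*#' | grep -E 'nodetool.*repair' || true
}
```

A typical invocation is `crontab -l | find_scheduled_repairs`, run for each user account that might own repair jobs.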
Repair considerations in mixed-version clusters
When you run repairs in a mixed-version Cassandra or DSE cluster, consider the compatibility between SSTable formats and the streaming protocol. Newer nodes usually read SSTables from older nodes and stream data to them, but the system doesn’t guarantee this behavior, especially when you upgrade across major versions.
In cases of large version jumps, differences in the SSTable format and streaming protocol might introduce compatibility and/or performance issues. To avoid data inconsistencies and repair failures, thoroughly test repair operations in mixed-version environments. Consider performing upgrades and repairs in smaller, incremental steps to reduce the likelihood of encountering cross-version issues.
Check your driver version to avoid connection failures
Check driver compatibility. Depending on the driver version, you might need to recompile your client application code.
All DSE-only drivers have reached end-of-life (EOL), and none of these drivers support HCD. You must upgrade to a newer version of a DataStax-compatible driver. For available drivers, see Cassandra drivers supported by DataStax.
During upgrades, clusters with mixed driver versions may experience driver-specific impacts. If your cluster has mixed versions, the driver negotiates the protocol version with the first host to which it connects. Some drivers automatically select a protocol version that works across nodes. To avoid driver version incompatibility during upgrades, use these workarounds:
- 
Protocol version: Set the protocol version explicitly in your application at startup. Switch to the new protocol version only after you complete the upgrade on all nodes in the cluster. 
- 
Initial contact points: Ensure that the list of initial contact points contains only hosts with the oldest DSE version or protocol version. For example, the initial contact points contain only protocol version 2. 
Testing and upgrade restrictions
| Carefully test upgrades in staging environments and on the first production node. You might find edge-case bugs during the upgrade process. Prepare to contact DataStax Support for assistance. | 
Consider the following before you begin the upgrade process:
- 
Ensure that schema versions are consistent across the cluster before starting the upgrade 
- 
Turn off snapshots during the whole upgrade process 
- 
Migrate all COMPACT STORAGE tables to standard CQL format
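Schema agreement can be checked by counting the distinct schema versions that nodetool describecluster reports. The sketch below is illustrative only: the function name count_schema_versions is hypothetical, and it assumes the usual output layout in which each schema version appears as a UUID followed by a bracketed list of node addresses.

```shell
# Read `nodetool describecluster` output on stdin and print the number of
# distinct schema versions listed (lines of the form "<uuid>: [addr, ...]").
count_schema_versions() {
  grep -cE '^[[:space:]]*[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}:[[:space:]]*\[' || true
}
```

For example, `nodetool describecluster | count_schema_versions` should print 1 when all reachable nodes agree on the schema. Unreachable nodes are not counted, so check for them separately.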
Prepare to upgrade
This section contains all the steps you must complete to prepare your infrastructure and clusters before beginning the actual upgrade process. You must complete these preparations for a successful migration.
Ensure that you have the following prerequisites before you begin the upgrade process:
- 
DSE 5.1.x installed and running 
- 
Adequate disk space - DataStax recommends a minimum of 50% free space. HCD requires free disk space matching your x largest STCS tables, where x is concurrent_compactors, typically 2-8.
- 
A backup of your current DSE installation 
- 
Root or sudo access to all cluster nodes 
- 
Network connectivity between all nodes 
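The 50% free space recommendation above can be checked with df. The following is a rough sketch under stated assumptions: the function names free_percent_from_df and has_recommended_headroom are hypothetical, the sketch relies on the POSIX `df -P` layout (capacity in the fifth column), and it does not account for the per-table STCS requirement.

```shell
# Read `df -P <path>` output on stdin and print the free-space percentage,
# computed as 100 minus the "Capacity" (percent used) column.
free_percent_from_df() {
  awk 'NR==2 { sub(/%/, "", $5); print 100 - $5 }'
}

# Succeed only when the filesystem holding the given path is at least 50% free.
has_recommended_headroom() {
  local free
  free=$(df -P "$1" | free_percent_from_df)
  [ "$free" -ge 50 ]
}
```

For example, `has_recommended_headroom /var/lib/cassandra/data || echo "less than 50% free"` reports a node that falls below the recommendation.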
Upgrade JDK to 11
The migration process requires JDK 11. To manually upgrade to JDK 11, follow these steps:
- 
Download JDK 11 from OpenJDK. 
- 
Follow the installation instructions for your operating system. 
- 
Set the JAVA_HOME environment variable to point to the new JDK installation:
export JAVA_HOME=installation_location/openjdk11
export PATH=$JAVA_HOME/bin:$PATH
- 
Verify the installation: java -version
- 
Verify that the output shows JDK 11. 
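If you script the verification, note that Java 8 and earlier report versions as 1.x, while Java 9 and later report the major version first. The sketch below parses the first line of the `java -version` output. The function name java_major_version is hypothetical and is used only for illustration.

```shell
# Extract the major Java version from `java -version` output passed as $1.
# Handles the legacy "1.8.0_222" scheme and the newer "11.0.20" scheme.
java_major_version() {
  local version
  # Take the quoted version string from the first line.
  version=$(printf '%s\n' "$1" | sed -n '1s/.*version "\([^"]*\)".*/\1/p')
  case "$version" in
    1.*) version="${version#1.}" ;;   # "1.8.0_222" -> "8.0_222"
  esac
  # Keep only the leading component before the first ".", "_", or "-".
  echo "${version%%[._-]*}"
}
```

Because `java -version` writes to standard error, call it as `java_major_version "$(java -version 2>&1)"` and compare the result with 11.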
HCD-specific considerations
When migrating to HCD, note the following differences in command structure and configuration:
- 
Command Structure: HCD uses the hcd COMMAND pattern, for example, hcd nodetool and hcd cqlsh
- 
Configuration Location: HCD configuration files are in resources/cassandra/conf/ instead of /etc/dse/cassandra/
- 
Log Location: HCD logs may be in the installation directory rather than /var/log/cassandra/
- 
yaml_diff tool: Not available in HCD, so you must manually compare configurations
- 
Service Management: HCD uses direct binary commands rather than system services 
Configuration compatibility issues
These configuration issues cause most migration failures. You must address them before starting HCD to avoid data loss and startup failures.
For most of these issues, you must set specific values in your cassandra.yaml or cassandra-env.sh when upgrading your nodes to HCD.
Token count compatibility
DSE 5.1 may use 1 token per node, the default for earlier versions, while HCD 1.1 defaults to 16 tokens. You must address this critical compatibility issue before starting HCD.
If token counts don’t match, HCD fails to start with the "Cannot change the number of tokens from X to Y" error.
Update num_tokens in the HCD cassandra.yaml file to match your DSE configuration:
# Check your DSE configuration first:
# grep "num_tokens" /path/to/dse/resources/cassandra/conf/cassandra.yaml
# Then set the same value in HCD:
num_tokens: 1  # or the value DSE uses
Datacenter name compatibility
DSE may use "Cassandra" as the default datacenter name, while HCD uses "datacenter1". If datacenter names don’t match, HCD fails to start with the "Cannot start node if snitch’s data center differs from previous data center" error.
Add the following JVM option to the HCD cassandra-env.sh file to bypass datacenter name validation during startup:
# This flag bypasses datacenter name validation during startup
# Required when migrating from DSE (uses "Cassandra") to HCD (uses "datacenter1")
# Add this line to HCD resources/cassandra/conf/cassandra-env.sh
JVM_OPTS="$JVM_OPTS -Dcassandra.ignore_dc=true"
Data directory configuration
HCD has data directories commented out by default, while DSE explicitly configures them.
If you don’t uncomment and configure data directories in the HCD cassandra.yaml file, HCD may not use the existing DSE data directories, which causes data loss.
Uncomment and configure data directories in the HCD cassandra.yaml file:
# Uncomment and set these in HCD cassandra.yaml:
data_file_directories:
     - /var/lib/cassandra/data
metadata_directory: /var/lib/cassandra/metadata
commitlog_directory: /var/lib/cassandra/commitlog
hints_directory: /var/lib/cassandra/hints
Authentication and authorization
DSE uses DSE-specific authenticator and authorizer classes, while HCD uses DataStax’s unified security classes.
HCD automatically handles this transition, but you must verify that authentication works after migration to ensure that existing users can connect and permissions are preserved.
Disable services that can cause data inconsistencies
You must disable these services during the entire upgrade process:
- 
Backups: Disable all backup services and scheduled backups. 
- 
Nodesync: Disable nodesync on all nodes. 
- 
Snapshots: Turn off all snapshot operations. 
These services may interfere with the upgrade process and cause data inconsistencies.
Pre-upgrade preparation steps
Follow these steps to prepare each node for the upgrade:
- 
Verify that the DSE version is 5.1.x: bin/dse -v
- 
Before upgrading, verify that each node has adequate free disk space. Determine the current DSE data disk space usage:
sudo du -sh /var/lib/cassandra/data/
3.9G /var/lib/cassandra/data/
Determine available disk space:
sudo df -hT /
Filesystem Type Size Used Avail Use% Mounted on
/dev/sda1 ext4 59G 16G 41G 28% /
The required space depends on the compaction strategy. 
- 
Run nodetool repair to ensure that data on each replica is consistent with data on other nodes:
nodetool repair -pr
Run repairs shortly before you start the upgrade. This step ensures data consistency across the cluster before the upgrade process begins. 
Upgrade and backup SSTables to avoid data loss
DataStax recommends running the upgradesstables command on one node at a time or, when using racks, one rack at a time.
Upgrade the SSTables on each node to ensure that all SSTables run on the current version:
nodetool upgradesstables
If you fail to upgrade SSTables when required, you will experience a significant performance impact and increased disk usage.
| Use the  | 
If the SSTables already run on the current version, the command returns immediately and takes no action.
- 
Verify the Java runtime version and upgrade to the recommended version:
java -version
openjdk version "1.8.0_222"
OpenJDK Runtime Environment (build 1.8.0_222-8u222-b10-1ubuntu1~18.04.1-b10)
OpenJDK 64-Bit Server VM (build 25.222-b10, mixed mode)
Recommended: OpenJDK 11 
- 
Back up any customized configuration files since the new version installation may overwrite them with default values. If you backed up your installation using the instructions in Back up your existing installation, the backup includes your original configuration files. 
Migrate COMPACT STORAGE tables
Before upgrading, you must migrate all non-system tables from COMPACT STORAGE format to standard CQL table format, as HCD 1.1 does not support this deprecated format.
- 
Export your current schema: cqlsh -e 'DESCRIBE FULL SCHEMA;' > schema_file
- 
Identify tables using COMPACT STORAGE:
cat schema_file | while read -d $';\n' line ; do
  if echo "$line"|grep 'COMPACT STORAGE' 2>&1 > /dev/null ; then
    TBL="`echo $line|sed -e 's|^CREATE TABLE \([^ ]*\) .*$|\1|'`"
    if echo "$TBL"|egrep -v '^system' 2>&1 > /dev/null; then
      echo "ALTER TABLE $TBL DROP COMPACT STORAGE;" >> schema-drop-list
    fi
  fi
done
- 
Apply the schema changes:
cqlsh -f schema-drop-list
HCD will not start if tables using COMPACT STORAGE are present.
Back up your existing installation
Back up your data prior to any version upgrade.
A backup enables you to revert and restore all the data that you used in the previous version if necessary. For manual backup instructions, see Back up a tarball installation or Back up a package installation.
| Instead of manual processes, automate the management of enterprise-wide backup and restore cluster operations using Mission Control. | 
Upgrade to HCD
After completing all preparation and planning, you are ready to upgrade your clusters from DSE to HCD.
Upgrade order
Upgrade nodes in this order:
- 
Move from node to node within one rack. Advanced users can upgrade nodes in parallel if only using NetworkTopologyStrategy.
- 
Move from rack to rack within one datacenter. 
- 
Move from datacenter to datacenter within one cluster. Advanced users can upgrade datacenters in parallel if only using LOCAL_* consistency levels.
- 
Upgrade the next cluster. 
Upgrade process overview
The migration from DSE to HCD requires a complete installation and configuration of HCD on each node in your cluster. This is not a simple upgrade - it’s a full replacement of the database software while preserving your existing data.
| Each node requires a complete HCD installation and configuration before it can join the cluster. You must perform the full installation and configuration process on each node individually, following the recommended upgrade order. | 
Upgrade steps for each node
Follow these steps on each node in the recommended Upgrade order. Complete all steps for one node before moving to the next node.
- 
Flush the commit log of the current DSE installation: nodetool drain
- 
Stop the DSE service:
- 
Package installations: sudo service dse stop
- 
Tarball installations: installation_location/bin/dse cassandra-stop
- 
Verify the node is running on a supported platform. 
- 
Install HCD on the node. Install HCD using the same installation type as your current system (package or tarball). HCD installation places files alongside your existing DSE installation. HCD reuses your existing DSE data directories. 
- 
Configure HCD for compatibility with your DSE installation. You must complete these configuration changes before starting HCD for successful migration.
- 
Update the token count to match your DSE configuration:
# Check DSE token count (extract only the value, not the whole line)
existing_num_tokens=$(grep -E '^num_tokens:' /path/to/dse/resources/cassandra/conf/cassandra.yaml | awk '{print $2}')
# Update HCD cassandra.yaml to match
sed -i "s/num_tokens: 16/num_tokens: ${existing_num_tokens}/" installation_location/resources/cassandra/conf/cassandra.yaml
- 
Configure data directories to use existing DSE data:
# Uncomment and set in HCD cassandra.yaml:
data_file_directories:
     - /var/lib/cassandra/data
metadata_directory: /var/lib/cassandra/metadata
commitlog_directory: /var/lib/cassandra/commitlog
hints_directory: /var/lib/cassandra/hints
- 
Add the datacenter compatibility flag to JVM options:
# Add to HCD cassandra-env.sh
echo 'JVM_OPTS="$JVM_OPTS -Dcassandra.ignore_dc=true"' >> installation_location/resources/cassandra/conf/cassandra-env.sh
- 
Compare and update other configuration settings as needed:
# Compare DSE and HCD cassandra.yaml files manually
diff /etc/dse/cassandra/cassandra.yaml.backup installation_location/resources/cassandra/conf/cassandra.yaml
Compare the backup YAML files with the upgraded YAML files manually, as the yaml_diff tool may not be available in HCD installations. 
- 
Start the node and verify it’s running correctly:
- 
Package installations: sudo service dse start
- 
Tarball installations: installation_location/bin/dse cassandra
- 
HCD installations: installation_location/bin/hcd cassandra
- 
Verify that the upgraded datacenter names match the datacenter names in the keyspace schema definition:
- 
Get the node’s datacenter name:
installation_location/bin/hcd nodetool status | grep "Datacenter"
Datacenter: datacenter-name
- 
Verify that the node’s datacenter name matches the datacenter name for a keyspace:
installation_location/bin/hcd cqlsh --execute "DESCRIBE KEYSPACE keyspace-name;" | grep "replication"
CREATE KEYSPACE keyspace-name WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter-name': '3'};
- 
Review the logs for warnings, errors, and exceptions:
grep -w 'WARNING\|ERROR\|exception' installation_location/logs/*.log
Warnings, errors, and exceptions frequently appear in the logs when starting an upgraded node. Some of these log entries provide informational help for executing specific upgrade-related steps. If you find unexpected warnings, errors, or exceptions, contact DataStax Support. The default location is /etc/hcd/hcd-env.sh for package installations and installation_location/bin/hcd-env.sh for tarball installations. You can change the location in hcd-env.sh.
- 
After successfully upgrading and verifying one node, repeat the entire process on the next node in the cluster following the recommended Upgrade order. Complete the full installation and configuration process on each node individually. Do not attempt to upgrade multiple nodes simultaneously unless you are an advanced user with specific requirements. 
Post-upgrade
This section contains steps to verify and complete the migration process after you upgrade all nodes.
Complete SSTable upgrade after cluster migration
After you complete the entire cluster upgrade, run upgradesstables on one node at a time, or one rack at a time when using racks.
If you fail to upgrade SSTables, you will experience a significant performance impact, increased disk usage, and possible data loss.
installation_location/bin/hcd nodetool upgradesstables
| Use the  | 
Migration success verification
After completing the migration, verify that HCD is running correctly:
# Check node status
./bin/hcd nodetool status
# Verify all keyspaces are accessible
./bin/hcd cqlsh --execute "DESCRIBE KEYSPACES;"
# Confirm data integrity
./bin/hcd cqlsh --execute "SELECT count(*) FROM system.local;"
# Check that DSE-specific keyspaces are preserved
# (CQL LIKE requires a SASI index, so filter the keyspace list with grep instead)
./bin/hcd cqlsh --execute "SELECT keyspace_name FROM system_schema.keyspaces;" | grep 'dse_'
Use the following checklist to verify migration success:
- 
HCD node status shows UN (Up/Normal)
- 
All original keyspaces are accessible 
- 
Data load matches pre-migration values 
- 
Host ID and tokens are preserved 
- 
You have upgraded SSTables 
- 
Authentication and authorization work correctly 
- 
No critical errors in logs 
If any of these checks fail, see Troubleshoot migration.
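The first checklist item can be checked by scanning the nodetool status output for any node whose status is not UN. This is a minimal sketch: the function name nodes_not_up_normal is hypothetical, and it assumes the standard status table in which each node line starts with a two-letter status and state code followed by the node address.

```shell
# Read `nodetool status` output on stdin and print the address of every node
# whose status/state code is anything other than UN (Up/Normal).
nodes_not_up_normal() {
  awk '$1 ~ /^[UD][NLJM]$/ && $1 != "UN" { print $2 }'
}
```

For example, `./bin/hcd nodetool status | nodes_not_up_normal` prints nothing when every node is Up/Normal. Any address it prints needs attention before you continue.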
Troubleshoot migration
This section contains common issues and their solutions that you may encounter during the migration process.
Cannot change the number of tokens
- 
Cause: Token count mismatch between DSE and HCD configurations 
- 
Solution: Update num_tokens in HCD cassandra.yaml to match DSE configuration
Cannot start node if snitch’s data center differs
- 
Cause: Datacenter name mismatch between DSE and HCD 
- 
Solution: Add -Dcassandra.ignore_dc=true to JVM options in cassandra-env.sh
Connection refused
- 
Cause: HCD failed to start due to configuration issues 
- 
Solution: Check logs for specific error messages and address configuration compatibility issues 
Log analysis
Check HCD logs for detailed error information:
# Check system logs
tail -50 /var/log/cassandra/system.log | grep -E "(ERROR|WARN|Exception|Failed|Cannot start|Fatal)"
# Check debug logs for more details
tail -50 /var/log/cassandra/debug.log | grep -E "(ERROR|WARN|Exception|Failed)"