Migrate Cassandra clusters to Mission Control
This guide shows how to migrate your Apache Cassandra® cluster to Mission Control, ensuring zero downtime during the transition. Benefits include the following:
- Minimal configuration changes.
- Zero downtime.
- Keeping nodes on the cluster in sync during the migration.
- Use of Mission Control for next-generation cluster management.
Prerequisites
Before starting, confirm that you have the following:
- Access to an existing Cassandra cluster
- Access to Mission Control
- Basic understanding of Cassandra and its tools, such as nodetool and cqlsh
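As a quick check of the prerequisites, confirm that both tools respond from one of your existing nodes. The paths below assume a typical tarball install; adjust them for your environment:
# Print the Cassandra release version through nodetool.
bin/nodetool version
# Confirm cqlsh is on the PATH.
cqlsh --version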
Create a new cluster in Mission Control
- In the Mission Control UI, go to Clusters, and then click Create Cluster.
- For Cluster Name, enter the same name as your existing cluster.
- For Server Version, select the Cassandra version that matches your current version.
- Enter the Datacenter Name, and then choose the rack configuration.
- Specify the node RAM, Storage Class, and Storage Amount.
- Create a Superuser Password for the new cluster. If necessary, disable internode encryption to match the current cluster's configuration.
- Set the Heap Amount.
- Enter your External Datacenter Name. This is the name of the datacenter in your existing cluster.
- For Seed Name, specify the routable IP address of a node from your existing cluster as the seed node for the new datacenter.
- Click Create Cluster.
- Verify that the cluster starts and its nodes come online.
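If you also have kubectl access to the Mission Control environment, you can watch the new datacenter's pods come up directly. The namespace here is a placeholder for your project's namespace:
# Watch until all Cassandra pods report Running and Ready.
kubectl get pods -n NAMESPACE -w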
Verify new cluster nodes
After the new cluster starts, run nodetool status on one of your original nodes to confirm that the new cluster's nodes are running. You should see your original datacenter and the new datacenter with their corresponding nodes.
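For example, with an original datacenter named dc1 and a new datacenter named dc2 (names, addresses, and IDs below are illustrative), the output resembles the following:
Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load     Tokens  Owns  Host ID                               Rack
UN  10.0.0.11  1.2 GiB  16      ?     1f8f...                               rack1

Datacenter: dc2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load     Tokens  Owns  Host ID                               Rack
UN  10.0.1.21  1.1 GiB  16      ?     9ab7...                               rack1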
Update replication strategy
To ensure that data is replicated across both datacenters:
- Run nodetool describecluster to check the current cluster configuration.
- Confirm that the system_auth, system_traces, and system_distributed system keyspaces use NetworkTopologyStrategy and have replication configured for both datacenters.
- In your cloud provider, search for one of your nodes, and then expose it on port 9042 using a LoadBalancer service. For one way to do this with kubectl, see the sketch after these steps.
- Connect to the node's load balancer IP address using the cqlsh command. You must use the default Cassandra user to connect to the node.
- Modify the replication strategy for your keyspace to replicate across both datacenters:
ALTER KEYSPACE KEYSPACE WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'ORIGINAL_DATACENTER' : 3, 'NEW_DATACENTER' : 3 };
Replace the following:
- KEYSPACE: The name of your keyspace
- ORIGINAL_DATACENTER: The name of the original datacenter
- NEW_DATACENTER: The name of the new datacenter
- Modify the replication strategy for the system_auth keyspace in the same way:
ALTER KEYSPACE system_auth WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'ORIGINAL_DATACENTER' : 3, 'NEW_DATACENTER' : 3 };
Replace ORIGINAL_DATACENTER and NEW_DATACENTER as in the previous step.
The new cluster is now ready to receive data from the old cluster.
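A minimal sketch of the expose-and-connect steps, assuming the target node's pod is named CLUSTERNAME-DATACENTER-default-sts-0 and that the default superuser is cassandra (both are assumptions; your pod name, service name, and credentials may differ):
# Expose one Cassandra pod on port 9042 through a LoadBalancer service.
kubectl expose pod CLUSTERNAME-DATACENTER-default-sts-0 --type=LoadBalancer --port=9042 --name=cassandra-ext -n NAMESPACE
# Look up the external IP that your cloud provider assigns to the service.
kubectl get service cassandra-ext -n NAMESPACE
# Connect with the default Cassandra superuser.
cqlsh EXTERNAL_IP 9042 -u cassandra -p 'SUPERUSER_PASSWORD'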
Rebuild data across datacenters
To move data from the old cluster to the new one, you must run nodetool rebuild on each node in the new cluster. You can run a task in Kubernetes to trigger the rebuild process.
- In a text editor, create a new YAML file with the following content:
apiVersion: control.k8ssandra.io/v1alpha1
kind: CassandraTask
metadata:
  name: rebuild-DATACENTER
  namespace: NAMESPACE
spec:
  dataCenter:
    name: CLUSTERNAME-DATACENTER
    namespace: NAMESPACE
  jobs:
    - name: rebuild-DATACENTER
      command: rebuild
      args:
        source_datacenter: SOURCE_DATACENTER
Replace the following:
- DATACENTER: The name of the datacenter in the new cluster
- NAMESPACE: The namespace where the new cluster is running
- CLUSTERNAME: The name of the new cluster
- SOURCE_DATACENTER: The name of the datacenter in the old cluster
- Apply the task with kubectl:
kubectl apply -f rebuild-DATACENTER.yaml --namespace NAMESPACE
Replace the following:
- DATACENTER: The name of the datacenter in the new cluster
- NAMESPACE: The namespace where the new cluster is running
- View the task status:
kubectl describe CassandraTask rebuild-DATACENTER --namespace NAMESPACE
Replace the following:
- DATACENTER: The name of the datacenter in the new cluster
- NAMESPACE: The namespace where the new cluster is running
Once the task completes, use cqlsh to sign in with a non-default Cassandra user. You can now verify that the data is replicated across both datacenters. A command-line check for task completion follows this list.
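To check completion from the command line, you can read the task's status; the completionTime field is set when the rebuild finishes (verify the field name against your installed CassandraTask API version):
# Prints a timestamp once the rebuild task completes; empty output means it is still running.
kubectl get cassandratask rebuild-DATACENTER -n NAMESPACE -o jsonpath='{.status.completionTime}'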
Verify data replication
To verify that the data is replicated across both datacenters:
- Use cqlsh to connect to a node in the new datacenter.
- Run a SELECT query on a table in your keyspace to verify that the data is present:
SELECT * FROM KEYSPACE.TABLE LIMIT 10;
Replace the following:
- KEYSPACE: The name of your keyspace
- TABLE: The name of your table
You should see the data from the old cluster in the new cluster.
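For a stricter check than a plain SELECT, set a local consistency level in cqlsh first so that the read must be served by a quorum of replicas in the datacenter you are connected to. The keyspace and table names here are placeholders:
-- Require a quorum of replicas in the local datacenter for subsequent reads.
CONSISTENCY LOCAL_QUORUM;
SELECT * FROM my_keyspace.my_table LIMIT 10;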
Adjust replication factor
Adjust the replication factor for all keyspaces, including system keyspaces, to remove references to your original datacenter and retain the new datacenter as the sole replica target.
- Adjust the replication factor for your keyspace:
ALTER KEYSPACE KEYSPACE WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'NEW_DATACENTER' : 3 };
Replace the following:
- KEYSPACE: The name of your keyspace
- NEW_DATACENTER: The name of the new datacenter
- Adjust the replication factor for Reaper:
ALTER KEYSPACE reaper_db WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'NEW_DATACENTER' : 3 };
Replace NEW_DATACENTER with the name of the new datacenter.
- Adjust the replication factor for the system_distributed keyspace:
ALTER KEYSPACE system_distributed WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'NEW_DATACENTER' : 3 };
Replace NEW_DATACENTER with the name of the new datacenter.
- Adjust the replication factor for the system_traces keyspace:
ALTER KEYSPACE system_traces WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'NEW_DATACENTER' : 3 };
Replace NEW_DATACENTER with the name of the new datacenter.
- Adjust the replication factor for the system_auth keyspace:
ALTER KEYSPACE system_auth WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'NEW_DATACENTER' : 3 };
Replace NEW_DATACENTER with the name of the new datacenter.
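To confirm that each change took effect, describe the keyspaces in cqlsh and check that only the new datacenter appears in each replication map:
DESC KEYSPACE system_auth;
DESC KEYSPACE system_distributed;
DESC KEYSPACE system_traces;
DESC KEYSPACE reaper_db;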
Decommission old nodes
Use the nodetool decommission command to remove old nodes from the old datacenter:
- SSH into one of the old nodes in the original datacenter.
- Run nodetool decommission to remove the node from the old datacenter:
bin/nodetool decommission --force
- Verify that the node is decommissioned:
bin/nodetool status
The node should no longer appear in the list of nodes in the cluster for the old datacenter.
- Kill the Cassandra process on the old node:
kill $(cat cassandra.pid)
- Verify that the Cassandra process is no longer running:
ps -ef | grep cassandra
- Repeat the process for each node in the old datacenter.
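Decommissioning streams the node's data to the remaining replicas and can take a while on large nodes. Although not part of the steps above, the standard nodetool netstats command shows progress; the node reports LEAVING mode while it streams and DECOMMISSIONED when it finishes:
# Shows the node's mode and any active streaming sessions.
bin/nodetool netstats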
View alerts in Mission Control
When you create a new cluster with Mission Control, several users are created automatically, including the superuser, Medusa, and Reaper users. However, because this cluster was migrated from an existing one, you must recreate the users and roles to enable the Reaper service.
- Open the Mission Control UI and go to the Clusters page.
- Click the bell icon, and then click View Alerts. There are active alerts for unavailable replicas and unhealthy pods. This is expected because the Reaper service cannot function with the updated cluster configuration.
Recreate users to enable the Reaper service
To recreate the Medusa and Reaper users in the new cluster, do the following:
- Get the username and password for Reaper from the secrets:
kubectl -n PROJECT_SLUG get secret SECRET_NAME -o jsonpath='{.data.username}' | base64 -d && echo
kubectl -n PROJECT_SLUG get secret SECRET_NAME -o jsonpath='{.data.password}' | base64 -d && echo
Replace the following:
- PROJECT_SLUG: The namespace where the new cluster is running
- SECRET_NAME: The name of the secret containing the Reaper username and password
- Get the username and password for the Medusa user:
kubectl -n PROJECT_SLUG get secret SECRET_NAME -o jsonpath='{.data.username}' | base64 -d && echo
kubectl -n PROJECT_SLUG get secret SECRET_NAME -o jsonpath='{.data.password}' | base64 -d && echo
Replace the following:
- PROJECT_SLUG: The namespace where the new cluster is running
- SECRET_NAME: The name of the secret containing the Medusa username and password
You can now create the Medusa and Reaper roles and permissions in the new cluster.
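As a convenience, you can capture each credential in a shell variable instead of copying it by hand; the secret name and namespace remain placeholders, as above:
# Decode the Reaper credentials for use in the CREATE ROLE statement below.
REAPER_USERNAME=$(kubectl -n PROJECT_SLUG get secret SECRET_NAME -o jsonpath='{.data.username}' | base64 -d)
REAPER_PASSWORD=$(kubectl -n PROJECT_SLUG get secret SECRET_NAME -o jsonpath='{.data.password}' | base64 -d)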
- Use cqlsh to open the system_auth keyspace:
use system_auth;
- View the existing roles:
SELECT * FROM roles;
- Create the Reaper role:
CREATE ROLE 'REAPER_USERNAME' WITH SUPERUSER = true AND LOGIN = true AND PASSWORD = 'REAPER_PASSWORD';
Replace the following:
- REAPER_USERNAME: The base64-decoded Reaper username
- REAPER_PASSWORD: The base64-decoded Reaper password
- Create the Medusa role:
CREATE ROLE 'MEDUSA_USERNAME' WITH SUPERUSER = true AND LOGIN = true AND PASSWORD = 'MEDUSA_PASSWORD';
Replace the following:
- MEDUSA_USERNAME: The base64-decoded Medusa username
- MEDUSA_PASSWORD: The base64-decoded Medusa password
- View the roles to verify that the new roles are created:
SELECT * FROM roles;
- Verify the settings in your reaper_db keyspace to ensure that the Reaper service can function with the updated cluster configuration:
use reaper_db;
desc KEYSPACE;
The output is similar to the following:
CREATE KEYSPACE reaper_db WITH REPLICATION = {'class': 'NetworkTopologyStrategy', 'NEW_DATACENTER': 3} AND DURABLE_WRITES = true;
If the output still includes the old datacenter, update the keyspace to remove the reference to the old datacenter:
ALTER KEYSPACE reaper_db WITH REPLICATION = {'class': 'NetworkTopologyStrategy', 'NEW_DATACENTER': 3} AND DURABLE_WRITES = true;
Replace NEW_DATACENTER with the name of the new datacenter.
- Scale down the Reaper deployment to zero replicas, and then scale it back up to one replica:
kubectl scale deployment cassandra-reaper --replicas=0
kubectl scale deployment cassandra-reaper --replicas=1
You can use kubectl or your cloud provider's UI to scale the pods down to zero replicas and then back up to one replica.
- Verify that the Reaper pod restarts:
kubectl get pods -n NAMESPACE
Replace NAMESPACE with the namespace where the new cluster is running.
- After the Reaper pod restarts, verify that repairs can be run. Check the Repairs tab in Mission Control and confirm that available keyspaces appear in the list.
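If Reaper runs as a Deployment named cassandra-reaper, as in the scale commands above, you can also wait for the restart to finish with a rollout check:
# Blocks until the new Reaper pod is ready.
kubectl rollout status deployment/cassandra-reaper -n NAMESPACE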
Run a repair
To synchronize the cluster and ensure that the data is consistent across all nodes, do the following:
- In the Mission Control UI, go to the Repairs tab, and then click Run repair.
- In the Start a repair dialog, select the Keyspace that you want to repair.
- Keep the default settings, and then click Run. The repair process starts.
- Monitor the repair process in the Repairs tab.
- After the repair completes, verify that the data is consistent across all nodes.
Clean up and verify
After the repair completes, verify that the cluster is functioning correctly by checking Mission Control for alerts related to unavailable replicas or unhealthy pods.