Frequently Asked Questions
If you’re new to the DataStax Zero Downtime Migration features, these FAQs are for you.
What is meant by Zero Downtime Migration?
Zero Downtime Migration (ZDM) means the ability for you to reliably migrate client applications and data between CQL clusters with no interruption of service.
ZDM lets you accomplish migrations without the need to change your client application code, and with only minimal configuration changes. While in some cases you may need to make some minor changes at the client application level, these changes will be minimal and non-invasive, especially if your client application uses an externalized property configuration for contact points.
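As a rough illustration of what an externalized contact-point configuration can look like, the sketch below reads the host list from an environment variable, so that pointing the application at different hosts is a configuration change followed by a restart rather than a code change. The variable name and the default addresses are hypothetical; use whatever your application already reads.

```python
import os

# Hypothetical variable name, chosen only for this illustration.
ENV_VAR = "CQL_CONTACT_POINTS"


def load_contact_points(default="10.0.0.1,10.0.0.2,10.0.0.3"):
    """Return the list of hosts the driver should use as contact points."""
    raw = os.environ.get(ENV_VAR, default)
    # Accept a comma-separated list and ignore stray whitespace or empty items.
    return [host.strip() for host in raw.split(",") if host.strip()]
```

The returned list would then be passed to your driver wherever contact points are configured today.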
The suite of ZDM tools enables you to migrate the real-time activity generated by your client applications, as well as transfer your existing data, always with a simple rollback strategy that does not require any downtime.
It is important to note that the Zero Downtime Migration process requires you to be able to perform rolling restarts of your client applications during the migration.
The examples in this guide sometimes refer to migrating to our cloud-native database environment, DataStax Astra DB. However, the ZDM Proxy can be freely used to migrate without downtime between any combination of CQL clusters of any type, including Astra DB, Apache Cassandra®, and DataStax Enterprise (DSE).
Can you illustrate the overall workflow and phases of a migration?
See the diagrams of the ZDM migration phases.
Do you offer an interactive self-guided lab to help me learn about ZDM migrations at my own pace?
Yes! Here’s a fun way to learn.
Now that you’ve seen a conceptual overview of the process, let’s put what you learned into practice.
We’ve built a complementary learning resource that is a companion to this comprehensive ZDM documentation. It’s the Zero Downtime Migration Interactive Lab.
-
All you need is a browser and a GitHub account.
-
There’s nothing to install for the lab, which opens in a pre-configured GitPod environment.
-
You’ll learn about a full migration without leaving your browser!
The lab supports all major browsers except Safari. For more information, see the lab’s start page.
We encourage you to explore this free hands-on interactive lab from DataStax Academy. It’s an excellent, detailed view of the migration process. The lab describes and demonstrates all the steps and automation performed to prepare for, and complete, a migration from any Cassandra/DSE/Astra DB database to another Cassandra/DSE/Astra DB database across clusters.
The interactive lab spans the pre-migration prerequisites and each of the five key migration phases.
What components are provided with ZDM?
DataStax Zero Downtime Migration includes the following:
-
ZDM Proxy is a service that operates between Origin, which is your existing cluster, and Target, which is the cluster to which you are migrating.
-
ZDM Proxy Automation is an Ansible-based tool that allows you to deploy and manage the ZDM Proxy instances and associated monitoring stack. To simplify its setup, the suite includes the ZDM Utility. This interactive utility creates a Docker container acting as the Ansible Control Host. The Ansible playbooks constitute the ZDM Proxy Automation.
-
Cassandra Data Migrator is designed to:
-
Connect to your clusters and compare the data between Origin and Target.
-
Report differences in a detailed log file.
-
Reconcile any missing records and fix any data inconsistencies between Origin and Target by enabling autocorrect in a configuration file.
-
DSBulk Migrator is provided to migrate smaller amounts of data from Origin to Target.
-
Well-defined steps in this migration documentation, organized as a sequence of phases.
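The compare, report, and reconcile behavior described for Cassandra Data Migrator in the list above can be pictured with a small sketch. This is not the tool's implementation: plain dictionaries keyed by primary key stand in for the Origin and Target tables, and the function name and the autocorrect parameter are hypothetical.

```python
def compare_and_reconcile(origin, target, autocorrect=False):
    """Compare two {primary_key: row} mappings and optionally repair target.

    Returns a list of (kind, key) tuples, where kind is "missing" when the
    row is absent from target and "mismatch" when the row differs.
    """
    differences = []
    for key, origin_row in origin.items():
        if key not in target:
            differences.append(("missing", key))
        elif target[key] != origin_row:
            differences.append(("mismatch", key))
        else:
            continue  # Row is identical on both sides; nothing to report.
        if autocorrect:
            # Reconcile by copying the Origin row over to Target.
            target[key] = origin_row
    return differences
```

With autocorrect left off, the function only reports differences, which mirrors a validation-only run.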
What exactly is ZDM Proxy?
ZDM Proxy is a component designed to seamlessly handle the real-time client application activity while a migration is in progress. See here for an overview.
What are the benefits of Zero Downtime Migration and its use cases?
Migrating client applications between clusters is a need that arises in many scenarios. For example, you may want to:
-
Move to a cloud-native, managed service such as Astra DB.
-
Migrate your client application to a brand new cluster, on a more recent version and perhaps on new infrastructure, or even to a different CQL database entirely, without intermediate upgrade steps and with an easy way to roll back if issues arise.
-
Separate out a client application from a shared cluster to a dedicated one.
-
Consolidate client applications, currently running on separate clusters, into fewer clusters or even a single one.
Bottom line: You want to migrate your critical database infrastructure without risk or concern that your users' experiences will be affected.
Which releases of Cassandra or DSE are supported for migrations?
Overall, you can use ZDM Proxy to migrate:
-
From: Any Cassandra 2.1.6 or higher release, or from any DSE 4.7.1 or higher release.
-
To: Any equivalent or higher release of Cassandra, or to any equivalent or higher release of DSE, or to Astra DB.
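The minimum Origin versions above can be expressed as a simple pre-flight check. The sketch below is only an illustration: the function names are hypothetical, it covers only the two minimums stated above, and it assumes purely numeric version strings such as 3.11.4.

```python
# Minimum Origin releases as stated above.
MINIMUM_ORIGIN_VERSION = {"cassandra": (2, 1, 6), "dse": (4, 7, 1)}


def parse_version(text):
    """Turn '3.11.4' into (3, 11, 4), padding missing parts with zeros."""
    parts = tuple(int(part) for part in text.split(".")[:3])
    return parts + (0,) * (3 - len(parts))


def is_supported_origin(kind, version):
    """Return True if the Origin release meets the stated minimum."""
    try:
        minimum = MINIMUM_ORIGIN_VERSION[kind.lower()]
    except KeyError:
        # Other database types are outside the scope of this sketch.
        raise ValueError(f"no minimum known for {kind!r}")
    return parse_version(version) >= minimum
```

Tuple comparison handles multi-digit components correctly, which a plain string comparison would not (for example, 3.11 versus 3.9).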
There are many reasons why you may decide to migrate your data and client applications from one cluster to another, for example:
-
Moving to a different type of CQL database, for example an on-demand cloud-based proposition such as Astra DB.
-
Upgrading a cluster to a newer version, or newer infrastructure, in as little as one step while leaving your existing cluster untouched throughout the process.
-
Moving one or more client applications out of a shared cluster and onto a dedicated one, in order to manage and configure each cluster independently.
-
Consolidating client applications, which may be currently running on separate clusters, onto a shared one in order to reduce overall database footprint and maintenance overhead.
Here are just a few examples of migration scenarios that are supported when moving from one type of CQL-based database to another:
-
From an existing self-managed Cassandra or DSE cluster to cloud-native Astra DB. For example:
-
Cassandra 2.1.6+, 3.11.x, 4.0.x, or 4.1.x to Astra DB.
-
DSE 4.7.1+, 4.8.x, 5.1.x, 6.7.x or 6.8.x to Astra DB.
-
From an existing Cassandra or DSE cluster to another Cassandra or DSE cluster. For example:
-
Cassandra 2.1.6+ or 3.11.x to Cassandra 4.0.x or 4.1.x.
-
DSE 4.7.1+, 4.8.x, 5.1.x or 6.7.x to DSE 6.8.x.
-
Cassandra 2.1.6+, 3.11.x, 4.0.x, or 4.1.x to DSE 6.8.x.
-
DSE 4.7.1+ or 4.8.x to Cassandra 4.0.x or 4.1.x.
-
From Astra DB Classic to Astra DB Serverless.
-
From any CQL-based database type/version to the equivalent CQL-based database type/version.
Does ZDM migrate clusters?
ZDM does not migrate clusters. With ZDM, we are migrating data and applications between clusters. At the end of the migration, your application will be running on your new cluster, which will have been populated with all the relevant data.
What challenges does ZDM solve?
Before DataStax Zero Downtime Migration was available, migrating client applications between clusters involved granular and intrusive client application code changes, extensive migration preparation, and a window of downtime for the client application’s end users.
ZDM allows you to leverage mature migration tools that have been used with large scale enterprises and applications to make migrations easy and transparent to end users.
What is the pricing model?
The suite of Zero Downtime Migration tools from DataStax is free and open source.
Is there support available if I have questions or issues during our migration?
ZDM Proxy and related software tools in the migration suite include technical assistance by DataStax Support for DSE and Luna subscribers, and Astra DB users who are on an Enterprise plan. Free and Pay As You Go plan users do not have support access and must raise questions in the Astra Portal chat. Luna is a subscription to the Apache Cassandra support and expertise at DataStax.
For any observed problems with the ZDM Proxy, submit a GitHub Issue in the ZDM Proxy GitHub repo.
Additional examples serve as templates, from which you can learn about migrations. DataStax does not assume responsibility for making the templates work for specific use cases.
Where are the public GitHub repos?
All the DataStax Zero Downtime Migration GitHub repos are public and open source. You are welcome to read the code and submit feedback via GitHub Issues per repo. In addition to sending feedback, you may submit Pull Requests (PRs) for potential inclusion.
To submit PRs, you must first agree to the DataStax Contribution License Agreement (CLA).
-
ZDM Proxy repo for ZDM Proxy.
-
ZDM Proxy Automation repo for the Ansible-based ZDM Proxy Automation, which includes the ZDM Utility.
-
cassandra-data-migrator repo for the tool that supports migrating larger data quantities as well as detailed verifications and reconciliation options.
-
dsbulk-migrator repo for the tool that allows simple data migrations without validation and reconciliation capabilities.
Does ZDM Proxy support Transport Layer Security (TLS)?
Yes, and here’s a summary:
-
For application-to-proxy TLS, the application is the TLS client and the ZDM Proxy is the TLS server. One-way TLS and Mutual TLS are both supported.
-
For proxy-to-cluster TLS, the ZDM Proxy acts as the TLS client and the cluster as the TLS server. One-way TLS and Mutual TLS are both supported.
-
When the ZDM Proxy connects to Astra DB clusters, it always implicitly uses Mutual TLS. This is done through the Secure Connect Bundle (SCB) and does not require any extra configuration.
For TLS details, see Configure Transport Layer Security (TLS).
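To make the difference between one-way and mutual TLS concrete from the TLS client's side, here is a minimal sketch using Python's standard ssl module. It is a generic illustration, not ZDM Proxy configuration: the function name and the file parameters are hypothetical, and how the resulting context is handed to a driver depends on the driver you use.

```python
import ssl


def build_client_tls_context(ca_file=None, cert_file=None, key_file=None):
    """Build a TLS client context.

    One-way TLS: pass only ca_file (or nothing, to use the system trust
    store). The client verifies the server's certificate.
    Mutual TLS: also pass cert_file and key_file, so the client presents
    its own certificate for the server to verify.
    """
    context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    if cert_file is not None:
        # Loading a certificate chain is what makes the client able to
        # authenticate itself in a mutual TLS handshake.
        context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return context
```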
How does ZDM Proxy handle Lightweight Transactions (LWTs)?
ZDM Proxy handles LWTs as write operations.
The proxy sends the LWT to Origin and Target clusters concurrently, and waits for a response from both.
ZDM Proxy returns a success status to the client if both Origin and Target send successful acknowledgements; otherwise, it returns a failure status.
What sets LWTs apart from regular writes is that they are conditional. For important details, including the client context for a returned applied flag, see Lightweight Transactions and the applied flag.
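The dual-write rule described above (send to both clusters concurrently, wait for both, and succeed only if both acknowledge) can be sketched as follows. This is an illustration of the rule, not the proxy's code: the two send callables are hypothetical stand-ins for the Origin and Target connections, and the sketch does not model the conditional evaluation or the applied flag.

```python
from concurrent.futures import ThreadPoolExecutor


def dual_write(statement, send_to_origin, send_to_target):
    """Send a write to both clusters concurrently and combine the results.

    send_to_origin and send_to_target are hypothetical callables that
    return True on a successful acknowledgement and False (or raise)
    otherwise.
    """
    def attempt(send):
        try:
            return bool(send(statement))
        except Exception:
            # Treat any error as a missing acknowledgement.
            return False

    with ThreadPoolExecutor(max_workers=2) as pool:
        origin_future = pool.submit(attempt, send_to_origin)
        target_future = pool.submit(attempt, send_to_target)
        # Wait for a response from both clusters before answering the client.
        origin_ok = origin_future.result()
        target_ok = target_future.result()

    return "success" if origin_ok and target_ok else "failure"
```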
Can ZDM Proxy be deployed as a sidecar?
ZDM Proxy should not be deployed as a sidecar.
ZDM Proxy was designed to mimic a Cassandra cluster. For this reason, we recommend deploying multiple ZDM Proxy instances, each running on a dedicated machine, instance, or VM.
For best performance, this deployment should be close to the client applications (ideally on the same local network) but not co-deployed on the same machines as the client applications.
This way, each client application instance can connect to all ZDM Proxy instances, just as it would connect to all nodes in a Cassandra cluster (or datacenter).
This deployment model gives maximum resilience and failure tolerance guarantees and allows the client application driver to continue using the same load balancing and retry mechanisms that it would normally use.
Conversely, deploying a single ZDM Proxy instance would undermine this resilience mechanism and create a single point of failure, which could affect the client applications if one or more nodes of the underlying clusters (Origin or Target) go offline. In a sidecar deployment, each client application instance would be connecting to a single ZDM Proxy instance, and would therefore be exposed to this risk.
For more information, see Choosing where to deploy the proxy.
What are the benefits of using a cloud-native database?
When moving your client applications and data from on-premises Cassandra Query Language (CQL) based data stores (Cassandra or DSE) to a cloud-native database (CNDB) like Astra DB, it’s important to acknowledge the fundamental differences ahead.
With on-premises infrastructure, you have total control of the datacenter’s physical infrastructure, software configurations, and your custom procedures. At the same time, with on-premises clusters you take on the cost of infrastructure resources, maintenance, operations, and personnel.
Ranging from large enterprises to small teams, IT managers, operators, and developers are realizing that the Total Cost of Ownership (TCO) with cloud solutions is much lower than continuing to run on-premises physical datacenters.
A CNDB like Astra DB is a different environment. Running on proven cloud providers like AWS, Google Cloud, and Azure, Astra DB greatly reduces complexity and increases convenience by surfacing a subset of configurable settings, providing a well-designed UI known as Astra Portal, plus a set of APIs and commands to interact with your Astra DB organizations and databases.