Zero Downtime Migration (ZDM) tools
The DataStax Zero Downtime Migration (ZDM) tools include ZDM Proxy, ZDM Utility, ZDM Proxy Automation, and several data migration tools:
- ZDM Proxy: Orchestrates activity-in-transition on your clusters during a live, zero-downtime migration. Your client applications connect to ZDM Proxy in the same way they would connect to a cluster, and then ZDM Proxy routes requests to the appropriate cluster.
- ZDM Utility and ZDM Proxy Automation: Facilitate the deployment and management of ZDM Proxy instances.
- Data migration tools: Used to copy and validate data. You can use these tools alone or with ZDM Proxy.
Benefits of using the ZDM tools
There are many benefits to using the ZDM tools for your migration:
- Minimal client code changes
-
Depending on cluster compatibility, the ZDM tools help you migrate to a new or upgraded database platform with minimal changes to your client application code. In some cases, you only need to change the connection string to point to the new cluster at the end of the migration process. Typically, these changes are minimal and non-invasive, especially if your client application uses an externalized property configuration for contact points.
- Real-time data consistency
-
ZDM Proxy orchestrates real-time activity generated by your client applications, ensuring data consistency while you replicate, validate, and test your existing data on the new cluster. Once you set up ZDM Proxy, the dual-writes feature ensures that new writes are sent to both the origin and target clusters, so you can focus on migrating the data that was present before initializing ZDM Proxy.
- Safely test the new cluster under full production workloads
-
In addition to the dual-writes feature, you can optionally enable asynchronous dual-reads to test the target cluster’s ability to handle a production workload before you permanently switch to the target cluster at the end of the migration process.
Client applications aren’t interrupted by read errors or latency spikes on the new, target cluster. Although these errors and metrics are received by ZDM Proxy for monitoring and performance benchmarking purposes, they aren’t propagated back to the client applications.
From the client side, traffic is seamless and uninterrupted during the entire migration process.
- Seamless rollback without data loss
-
If there is a problem during the migration, you can roll back to the original cluster without any data loss or interruption of service. You can allow ZDM Proxy to continue orchestrating dual-writes, or redirect your client applications back to the origin cluster and disable ZDM Proxy.
- Endless validation and testing time
-
Because your client applications remain fully operational during the migration, and your clusters are kept in sync by ZDM Proxy, you can take as much time as you need to validate and test the target cluster before switching over permanently.
- Supports upgrades and migrations
-
The ZDM tools support migrations between different Cassandra-based platforms, such as open-source Apache Cassandra® to Astra, as well as major version upgrades of the same platform, such as DSE 5.0 to DSE 6.9.
Using the ZDM tools for upgrades reduces the risk of data loss or corruption due to breaking changes between versions, provides a seamless rollback option, and streamlines the upgrade process, eliminating the need for interim upgrades and progressive manual reconfiguration. Whenever possible, DataStax recommends using the ZDM process to orchestrate live migrations between separate clusters.
- Avoids traditional migration challenges
-
Without the ZDM tools, migrating client applications between clusters involves granular and intrusive client application code changes, extensive migration preparation, and a window of downtime for the client application’s end users.
With the ZDM tools, you can migrate your client applications and data between Cassandra-based clusters with minimal code changes and no interruption of service. You can have the confidence that you are using tools designed specifically to handle the complexities of live traffic during large enterprise migrations.
- Independent cluster configuration
-
The ZDM tools don’t replicate your cluster configuration or require your clusters to have the same configuration or database platform. Instead, the ZDM tools orchestrate live traffic between your existing cluster and a new, separate cluster while you use a data migration tool to replicate and validate data on the new cluster.
Each cluster can have its own configuration, as long as they share a common protocol version, and the same read and write requests can be executed on both clusters without error. For more information, see Compatibility requirements for ZDM Proxy.
ZDM Proxy
The main component of the ZDM tools is ZDM Proxy, which is designed to be a lightweight proxy that handles all real-time requests generated by your client applications during the migration process.
Generally speaking, a proxy is a software class functioning as an interface to any other component, connection, or resource, such as a network connection, a server, a large object in memory, or a file. The proxy is a wrapper or agent object that a client calls to access the actual object served through the proxy.
ZDM Proxy is an orchestrator for monitoring application activity and keeping multiple clusters (databases) in sync through dual writes. Your client applications connect to ZDM Proxy in the same way they would connect to a cluster, including encrypted connections. Then, ZDM Proxy routes requests to one or both clusters, depending on the request type and your ZDM Proxy configuration.
ZDM Proxy isn’t linked to the actual migration process. It doesn’t perform data migrations, and it doesn’t have awareness of ongoing migrations. Instead, you use a data migration tool to perform the data migration and validate migrated data.
ZDM Proxy reduces risks to upgrades and migrations by decoupling the origin (source) cluster from the target (destination) cluster and maintaining consistency between both clusters. You decide when you want to switch permanently to the target cluster.
After migrating your data, changes to your application code are usually minimal, depending on your client’s compatibility with the origin and target clusters. Typically, you only need to update the connection string.
This tool is free, open-source software.
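As an illustration of the connection-string change described above, the following is a minimal sketch of externalized contact points in a Python client. The environment variable names `CQL_CONTACT_POINTS` and `CQL_PORT`, the defaults, and the function name are hypothetical and aren't part of the ZDM tools. The idea is that the application code stays the same while the externalized values change.

```python
import os

# Hypothetical variable names; use whatever settings your application already reads.
CONTACT_POINTS_VAR = "CQL_CONTACT_POINTS"
PORT_VAR = "CQL_PORT"


def load_contact_points(env=os.environ):
    """Return (hosts, port) from externalized settings.

    During the migration the values would point at the ZDM Proxy instances,
    and at the end of the migration they would point at the target cluster.
    """
    raw_hosts = env.get(CONTACT_POINTS_VAR, "127.0.0.1")
    # Accept a comma-separated list and ignore stray whitespace or empty items.
    hosts = [host.strip() for host in raw_hosts.split(",") if host.strip()]
    port = int(env.get(PORT_VAR, "9042"))
    return hosts, port
```

With this pattern, repointing the application only requires changing the externalized values and restarting the client, not editing the code that builds the session.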
Don’t deploy ZDM Proxy as a sidecar. For more information, see Choosing where to deploy the proxy.
How ZDM Proxy handles reads and writes
DataStax created ZDM Proxy to orchestrate requests between a client application and both the origin and target clusters. These clusters can be any data store that supports the Cassandra Query Language (CQL), such as DataStax Enterprise (DSE), Hyper-Converged Database (HCD), Astra DB, and open-source Apache Cassandra®.
During the migration process, you designate one cluster as the primary cluster, which serves as the source of truth for reads. For the majority of the migration process, this is typically the origin cluster. Towards the end of the migration process, when you are ready to read exclusively from your target cluster, you set the target cluster as the primary cluster.
The other cluster is referred to as the secondary cluster. While ZDM Proxy is active, write requests are sent to both clusters to ensure data consistency, but only the primary cluster serves read requests.
Writes (dual-write logic)
ZDM Proxy sends every write operation (INSERT, UPDATE, DELETE) synchronously to both clusters at the client application’s requested consistency level:
-
If the write is acknowledged in both clusters at the requested consistency level, then the operation returns a successful write acknowledgement to the client that issued the request.
-
If the write fails in either cluster, then ZDM Proxy passes a write failure, originating from the primary cluster, back to the client. The client can then retry the request, if appropriate, based on the client’s retry policy.
This design ensures that new data is always written to both clusters, and that any failure on either cluster is always made visible to the client application.
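The dual-write decision described above can be modeled with a short Python sketch. This is a conceptual illustration only, not ZDM Proxy's implementation: `WriteResult`, `dual_write`, and the callables passed in are hypothetical names, and the exact error payload that the client receives isn't modeled.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class WriteResult:
    ok: bool
    detail: str = ""


def dual_write(write_primary, write_secondary):
    """Send one write to both clusters and decide what the client sees.

    Each argument is a zero-argument callable that performs the write on one
    cluster at the client's requested consistency level and returns a WriteResult.
    """
    # Issue both writes concurrently, then wait for both outcomes.
    with ThreadPoolExecutor(max_workers=2) as pool:
        primary_future = pool.submit(write_primary)
        secondary_future = pool.submit(write_secondary)
        primary = primary_future.result()
        secondary = secondary_future.result()

    if primary.ok and secondary.ok:
        return WriteResult(True, "acknowledged by both clusters")

    # A failure on either cluster is surfaced to the client, which can retry.
    outcomes = (("primary", primary), ("secondary", secondary))
    failed = [name for name, result in outcomes if not result.ok]
    return WriteResult(False, "write failed on: " + ", ".join(failed))
```

For example, `dual_write(lambda: WriteResult(True), lambda: WriteResult(False))` reports a failure even though the primary cluster accepted the write, which is what prompts the client to retry.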
For information about how ZDM Proxy handles Lightweight Transactions (LWTs), see Lightweight Transactions and the applied flag.
Reads
By default, ZDM Proxy sends all reads to the primary cluster, and then returns the result to the client application.
If you enable asynchronous dual reads, ZDM Proxy sends asynchronous read requests to the secondary cluster (typically the target cluster) in addition to the synchronous read requests that are sent to the primary cluster.
This feature is designed to test the target cluster’s ability to handle a production workload before you permanently switch to the target cluster at the end of the migration process.
With or without asynchronous dual reads, the client application only receives results from synchronous reads on the primary cluster. The results of asynchronous reads aren’t returned to the client because asynchronous reads are for testing purposes only.
For more information, see Phase 3: Enable asynchronous dual reads.
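The read path described above can be sketched in Python with `asyncio`. This is a conceptual model under stated assumptions, not ZDM Proxy's code: `proxy_read`, its parameters, and the `errors` list (which stands in for the proxy's logging and metrics) are hypothetical.

```python
import asyncio

# Keep references to in-flight shadow reads so they aren't garbage-collected.
_background = set()


async def proxy_read(read_primary, read_secondary=None, errors=None):
    """Return the primary cluster's result; optionally shadow the read.

    read_primary and read_secondary are zero-argument coroutine functions.
    Pass read_secondary only when asynchronous dual reads are enabled.
    """
    if read_secondary is not None:
        async def shadow():
            try:
                await read_secondary()  # Result is discarded: testing only.
            except Exception as exc:
                # Recorded for monitoring, never propagated to the client.
                if errors is not None:
                    errors.append(str(exc))

        task = asyncio.create_task(shadow())
        _background.add(task)
        task.add_done_callback(_background.discard)

    # The client only ever receives the synchronous read from the primary.
    return await read_primary()
```

In this sketch, an exception raised by the secondary read ends up in `errors`, while the caller still receives the rows returned by the primary read.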
Consistency levels
ZDM Proxy doesn’t directly manage or track consistency levels. Instead, it passes the requested consistency level from the client application to each cluster (origin and target) when routing requests.
For reads, the consistency level is always passed to the primary cluster, which always receives read requests. The request is then executed within the primary cluster at the requested consistency level.
If asynchronous dual reads are enabled, the consistency level is passed to both clusters, and each cluster executes the read request at the requested consistency level independently. If the request fails to satisfy the requested consistency level on the primary cluster, the failure is returned to the client application as normal. However, failures of asynchronous reads on the secondary cluster are logged but not returned to the client application.
For writes, the consistency level is passed to both clusters, and each cluster executes the write request at the requested consistency level independently. If either request fails to satisfy the requested consistency level, the failure is returned to the client application as normal.
If either cluster is an Astra DB database, be aware that CL.ONE isn’t supported by Astra DB.
Requests sent with CL.ONE to Astra DB databases always fail.
ZDM Proxy doesn’t mute these failures because you need to be aware of them.
You must adapt your client application to use a consistency level that is supported by both clusters to ensure that the migration is seamless and error-free.
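One way to catch this before starting a migration is a small pre-flight check in your own tooling. The following Python sketch is illustrative only: the `UNSUPPORTED` table, the cluster type labels, and the function name are hypothetical, and the table encodes only the single restriction stated above (CL.ONE on Astra DB). Extend it from the documentation for your platforms rather than treating it as complete.

```python
# Assumption: only the restriction stated in this section is encoded here.
UNSUPPORTED = {
    "astra_db": {"ONE"},
}


def unsupported_on(consistency_level, cluster_types):
    """Return the cluster types that would reject the given consistency level."""
    level = consistency_level.upper()
    return [
        cluster
        for cluster in cluster_types
        if level in UNSUPPORTED.get(cluster, set())
    ]
```

An empty list means that, as far as the table knows, both clusters accept the consistency level that the client application uses.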
Timeouts and connection failures
When requests are routed through ZDM Proxy, there is both a proxy-side timeout and an application-side timeout.
If a response isn’t received within the timeout period (zdm_proxy_request_timeout_ms), nothing is returned to the request handling thread, and, by extension, no response is sent to the client.
This inevitably results in a client-side timeout, which is an accurate representation of the fact that at least one cluster failed to respond to the request.
The clusters that are required to respond depend on the type of request and whether asynchronous dual reads are enabled.
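The proxy-side timeout behavior can be modeled with a brief `asyncio` sketch. This is a simplified illustration, not the proxy's implementation: `handle_request` and `cluster_calls` are hypothetical names, and `timeout_ms` stands in for the value of `zdm_proxy_request_timeout_ms`.

```python
import asyncio


async def handle_request(cluster_calls, timeout_ms):
    """Wait for every required cluster; return None if any of them is too slow.

    cluster_calls holds one awaitable per cluster that must respond.
    """
    try:
        return await asyncio.wait_for(
            asyncio.gather(*cluster_calls),
            timeout=timeout_ms / 1000,
        )
    except asyncio.TimeoutError:
        # Nothing is sent back, so the client eventually reaches its own timeout.
        return None
```

Returning `None` here represents the case where no response is sent to the client, which the client then observes as a client-side timeout.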
High availability and multiple ZDM Proxy instances
ZDM Proxy is designed to be highly available and to run in a clustered fashion to avoid a single point of failure.
With the exception of local test environments, DataStax recommends that all ZDM Proxy deployments have multiple ZDM Proxy instances. Deployments typically consist of three or more instances.
Throughout the ZDM documentation, the term ZDM Proxy deployment refers to the entire deployment, and ZDM Proxy instance refers to an individual proxy process in the deployment.
You can scale ZDM Proxy instances horizontally and vertically. To avoid downtime when applying configuration changes, you can perform a rolling restart of your ZDM Proxy instances.
For simplicity, you can use ZDM Utility and ZDM Proxy Automation to set up and run Ansible playbooks that deploy and manage ZDM Proxy and its monitoring stack.
ZDM Utility and ZDM Proxy Automation
You can use ZDM Utility and ZDM Proxy Automation to set up and run Ansible playbooks that deploy and manage multiple ZDM Proxy instances and the associated monitoring stack (Prometheus metrics and Grafana visualizations):
- Ansible
-
Ansible is a suite of software tools that enable infrastructure as code. It is open source, and its capabilities include software provisioning, configuration management, and application deployment.
- Playbooks
-
Ansible playbooks streamline and automate the deployment and management of ZDM Proxy instances and their monitoring components. Playbooks are YAML files that define a series of tasks to be executed on one or more remote machines, including installing software, configuring settings, and managing services. They are repeatable and reusable, and they simplify deployment and configuration management because each playbook focuses on a specific operation, such as rolling restarts.
- Control host
-
You run playbooks from a centralized machine known as the Ansible Control Host. ZDM Utility, which is included with ZDM Proxy Automation, creates the Docker container that acts as the Ansible Control Host.
To use ZDM Utility and ZDM Proxy Automation, you must prepare the recommended infrastructure. For more information about the role of Ansible and Ansible playbooks in the ZDM process, see Set up ZDM Proxy Automation with ZDM Utility and Deploy ZDM Proxy.
DataStax strongly recommends using ZDM Utility and ZDM Proxy Automation, but these components aren’t required to use ZDM Proxy. In general, this documentation assumes you are using ZDM Proxy with ZDM Utility and ZDM Proxy Automation. Although some sections refer to non-automated ZDM Proxy deployments, complete instructions for non-automated deployments aren’t provided.
These tools are free, open-source software.
Data migration tools
You use data migration tools to perform bulk writes, copy data from one cluster to another, and validate migrated data.
You can use data migration tools alone or with ZDM Proxy. For example, if you want to bulk load data from an existing, inactive cluster that is not receiving live application traffic, you don’t need ZDM Proxy.
The appropriate data migration tool depends on your use case, including the schema, format, and data validation requirements. For recommended and alternative tools, see Phase 2: Migrate and validate data.