Prepare to use Astra DB Sideloader

Before you use Astra DB Sideloader, review the requirements and prepare your target database, origin cluster, and administration server.

Due to the nature of the Astra DB Sideloader process and the tools involved, you need to be familiar with using the command line, including the following:

  • Installing and using CLI tools

  • Issuing curl commands

  • Basic scripting

  • Modifying example commands to fit your environment

  • Security best practices

The Astra DB Sideloader process uses authentication credentials to write to the migration directory and your database.

Make sure you understand how to securely store and use sensitive credentials when working on the command line.
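
For example, rather than typing credentials inline in commands, where they can persist in shell history, you can export them as environment variables or read them from a file restricted to your user. A minimal sketch, assuming a hypothetical application token stored in ~/.astra/token:

    # Restrict the token file to your user, then load it into an
    # environment variable for use in later commands.
    chmod 600 ~/.astra/token
    export ASTRA_DB_TOKEN="$(cat ~/.astra/token)"

    # Later commands can then reference ${ASTRA_DB_TOKEN} instead of a
    # pasted token value.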

Target Astra DB database requirements

  • Your Astra organization must be on an Enterprise subscription plan.

    Astra DB Sideloader is a premium feature that incurs costs based on usage:

    • The total amount (GB) of data processed as part of the Astra DB Sideloader workload.

    • The amount of data stored in the migration bucket, which is metered at the standard Astra DB storage rate.

    For more information and specific rates, see the Astra Pricing page.

    Migration directories are automatically cleaned up after one week of idle time.

    To minimize costs, you can manually clean up migration directories when you no longer need them.

  • Your target database must be an Astra DB Serverless database.

    If you don’t already have one, create a database. You can use either a Serverless (Non-Vector) or Serverless (Vector) database.

    Serverless (Vector) databases can store both vector and non-vector data.

  • Your target database must be in a Provisioned Capacity Unit (PCU) group. You can use either a flexible capacity PCU group or a committed capacity PCU group, depending on your long-term needs and other PCU group usage.


    Because Astra DB Sideloader operations are typically short-term, resource-intensive events, you can create a flexible capacity PCU group exclusively to support your target database during the migration.

    DataStax recommends the following flexible capacity PCU group configuration for Astra DB Sideloader migrations. For instructions, see Create a flexible capacity PCU group.

    If your target database is a Serverless (Non-Vector) database:

    • Minimum capacity: One or more units, depending on the scale of the migration.

    • Maximum capacity: Several units greater than the minimum to allow autoscaling during resource-intensive stages of the migration.

      For non-trivial migrations, consider setting the maximum to 10. For extremely large migrations, contact your DataStax account representative or DataStax Support to request more than 10 units to support your migration.

    If your target database is a Serverless (Vector) database: By default, Serverless (Vector) databases can have no more than one unit per PCU group. For any non-trivial migration, contact your DataStax account representative or DataStax Support for assistance configuring a PCU group for your target Serverless (Vector) database.

    After the migration, you can move your target database out of the flexible capacity PCU group, and then park or delete the group. Don’t park the PCU group during the Astra DB Sideloader process because databases in a parked PCU group are hibernated and unavailable for use.

    If you plan to keep your target database in a PCU group after the migration, you can create a committed capacity PCU group for your target database.

    The Astra DB Sideloader process can be extremely resource intensive. If there are any other databases in the same PCU group, the migration process can affect their performance due to resource contention.

    If your PCU groups have multiple databases, consider using a flexible capacity PCU group to temporarily isolate your target database during the migration.

    DataStax recommends the following committed capacity PCU group configuration for Astra DB Sideloader migrations. For instructions, see Create a committed capacity PCU group.

    If your target database is a Serverless (Non-Vector) database:

    • Reserved capacity: One or more units, depending on the PCU group’s normal, long-term workload requirements.

      This is the amount of long-term capacity that you want the group to have after the migration is complete.

    • Minimum capacity: Equal to or greater than the reserved capacity.

      If the minimum is greater than the reserved capacity, the surplus capacity is prepared in advance, and no autoscaling is required to access that capacity.

    • Maximum capacity: Several units greater than the minimum to allow autoscaling during resource-intensive stages of the migration.

      For non-trivial migrations, consider setting the maximum to 10. For extremely large migrations, contact your DataStax account representative or DataStax Support to request more than 10 units to support your migration.

      After the migration, you can reduce the minimum and maximum capacity to the levels required for normal database operations.

    If your target database is a Serverless (Vector) database: By default, Serverless (Vector) databases can have no more than one unit per PCU group. For any non-trivial migration, contact your DataStax account representative or DataStax Support for assistance configuring a PCU group for your target Serverless (Vector) database.

Origin cluster requirements

The following requirements, recommendations, and limitations apply to origin clusters. Review all of these to ensure that your cluster is compatible with Astra DB Sideloader.

Cluster infrastructure

  • Your origin cluster can be hosted on premises or on any cloud provider.

  • Your origin cluster must run a supported database version:

    • Apache Cassandra® 3.11 or later

    • DSE 5.1 or later

    • HCD 1.1 or later

  • Your origin cluster must use the default partitioner, Murmur3Partitioner.

    Older partitioners, such as RandomPartitioner, ByteOrderedPartitioner, and OrderPreservingPartitioner, are not supported.
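
    You can confirm the partitioner on any origin node before you begin. For example, with nodetool, or by checking cassandra.yaml (whose location varies by installation; /etc/cassandra is a common default):

      # Print the cluster's partitioner; it must be Murmur3Partitioner.
      nodetool describecluster | grep -i partitioner

      # Alternatively, check the node's configuration file directly:
      grep '^partitioner:' /etc/cassandra/cassandra.yaml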

Cloud provider CLI

To upload snapshots directly from the origin cluster, you must install your cloud provider’s CLI on each node in the origin cluster.

The tool you install depends on the cloud provider of the region where your target Astra DB database is deployed, such as the AWS CLI, the Google Cloud CLI, or the Azure CLI.

Alternatively, you can upload copies of the snapshots from a separate staging server that has the CLI installed, coordinating the uploads through the administration server. This guide doesn’t cover that approach: the CLI commands in this guide assume that you have installed your cloud provider’s CLI on the nodes in the origin cluster. If you use a staging server instead, modify the commands accordingly for your environment.
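
For illustration, a hedged sketch of an upload from one node, assuming the migration bucket is on AWS. The local snapshot path, bucket name, and prefix are placeholders; Astra DB Sideloader provides the actual bucket location and credentials when you initialize the migration:

    # Hypothetical example: copy one node's snapshot files to the migration
    # bucket with the AWS CLI. Paths and bucket names are placeholders.
    aws s3 sync \
      /var/lib/cassandra/data/my_keyspace/my_table-<table-id>/snapshots/sideloader/ \
      "s3://<migration-bucket>/<migration-id>/node-1/my_keyspace/my_table/"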

Incompatible data

  • Astra DB doesn’t support materialized views: You must replace these with SAI or an alternative data model design.

  • Astra DB Sideloader doesn’t support encrypted data: If your origin cluster uses DSE Transparent Data Encryption, be aware that Astra DB Sideloader can’t migrate these SSTables.

    If you have a mix of encrypted and unencrypted data, you can use Astra DB Sideloader to migrate the unencrypted data. After the initial migration, you can use another strategy to move the encrypted data, such as Cassandra Data Migrator (CDM) or a manual export and reupload.

  • Astra DB Sideloader doesn’t support secondary indexes: If you don’t remove or replace secondary indexes in your origin cluster, you must manually remove the index directories from your snapshots, as explained in Create snapshots.
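
    As a hedged sketch, secondary index data is stored in hidden subdirectories of each table directory, named after the index with a leading dot. The snapshot paths below are illustrative and depend on your data directory layout:

      # Illustrative only: list hidden index subdirectories inside the
      # snapshots for a given tag, then remove them after confirming that
      # they contain secondary index data.
      find /var/lib/cassandra/data/*/*/snapshots/<snapshot-name> \
        -maxdepth 1 -type d -name '.*' -print

      # After reviewing the output, rerun with -exec to delete:
      # find ... -maxdepth 1 -type d -name '.*' -exec rm -r {} +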

Administration server requirements

You need a server where you can run the Astra DB Sideloader commands.

Your administration server must have SSH access to each node in your origin cluster.

DataStax recommends that you install the following additional software on your administration server:

  • Cassandra Data Migrator (CDM) to validate imported data and, in the context of Zero Downtime Migration, reconcile it with the origin cluster.

  • jq to format JSON responses from the Astra DevOps API. The DevOps API commands in this guide use this tool.
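
    For example, piping a DevOps API response through jq (this endpoint lists the databases in your organization and illustrates the pattern; it isn’t specific to Astra DB Sideloader):

      # Pretty-print a DevOps API response with jq.
      curl -sS "https://api.astra.datastax.com/v2/databases" \
        -H "Authorization: Bearer ${ASTRA_DB_TOKEN}" \
        -H "Accept: application/json" \
        | jq .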

Additional preparation for specific migration scenarios

The following information can help you prepare for specific migration scenarios, including multi-region migrations and multiple migrations to the same database.

Multi-region migrations

Multi-region migrations can include one or more of the following scenarios:

  • Your origin cluster is deployed to multiple regions.

  • Your target database is, or will be, deployed to multiple regions.

  • You need to support multiple regions in a live migration scenario.

It is difficult to provide a one-size-fits-all solution for multi-region migrations due to the potential complexity and variability of these scenarios. For assistance planning a multi-region migration, contact your DataStax account representative or DataStax Support.

Multi-node migrations

You can migrate data from any number of nodes in your origin cluster to the same target database or multiple target databases.

When you migrate data with Astra DB Sideloader, the core process is the same whether you migrate from one node or from many. The following steps summarize the process and outline some considerations for migrating multiple nodes.

Migrate multiple nodes to one database

  1. On your origin cluster, make sure your data is valid and ready to migrate, as explained in Origin cluster requirements.

  2. From your origin cluster, create snapshots for all of the nodes that you want to migrate.

    Run nodetool snapshot as many times as necessary to capture all of your nodes. For an example, see the sketch after these steps.

  3. On your target database, replicate the schemas for all tables that you want to migrate.

    This is critical for a successful migration. If the schemas don’t match, the migration fails.

    You don’t need to make any changes based on the number of nodes, as long as the keyspaces and table schemas are replicated in the target database.

  4. Initialize the migration to prompt Astra DB Sideloader to create a migration bucket for your target database.

  5. Upload all of your node snapshots to the migration bucket.

  6. Use Astra DB Sideloader to import the data to your target database.

    Astra DB Sideloader imports snapshots from the migration bucket to your target database based on the matching schemas. The number of node snapshots that you uploaded to the migration bucket doesn’t determine the success of the import. The success of the import depends primarily on the validity of the schemas and the data in the snapshots.

  7. After the import, validate the migrated data to ensure that it matches the data in the origin cluster. For example, you can run Cassandra Data Migrator (CDM) in validation mode.
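
A minimal sketch of steps 2 and 3, assuming a single keyspace named my_keyspace (the keyspace name, snapshot tag, and file name are placeholders). Astra DB manages keyspace creation for you, so you typically recreate the keyspace in the Astra Portal and then apply the exported table definitions, adjusted for any options that Astra DB doesn’t support:

    # Step 2, on each origin node: create a tagged snapshot of the keyspace.
    nodetool snapshot -t sideloader my_keyspace

    # Step 3, on the origin: export the keyspace schema so that you can
    # recreate the matching table definitions in the target database.
    cqlsh -e "DESCRIBE KEYSPACE my_keyspace" > my_keyspace_schema.cql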

Migrate multiple nodes to multiple databases

Orchestrating concurrent migrations from multiple nodes to multiple target databases can be complex.

Consider focusing on one target database at a time, or create a migration plan to track origin nodes, target databases, migration bucket credentials, and timelines for each migration.

  1. On your origin cluster, make sure your data is valid and ready to migrate, as explained in Origin cluster requirements.

  2. From your origin cluster, create snapshots for all of the nodes that you want to migrate.

    Run nodetool snapshot as many times as necessary to capture all of your nodes.

  3. On each of your target databases, replicate the schemas for the tables that you want to migrate to each database.

    This is critical for a successful migration. If the schemas don’t match, the migration fails.

    You don’t need to make any changes based on the number of nodes, as long as the keyspaces and table schemas are replicated in the target databases.

    If you want to migrate the same data to multiple databases, you must recreate the schemas in each of those databases. Astra DB Sideloader requires a schema to be present in the target database in order to migrate data.

  4. For each target database, initialize a migration to prompt Astra DB Sideloader to create migration buckets for each database.

    At minimum, you must initialize one migration for each database.

  5. Upload the node snapshots to their corresponding migration buckets.

  6. Use Astra DB Sideloader to import the data to your target databases.

    You can import data to multiple databases at once, but each import event must be triggered separately using the unique migration ID.

    Astra DB Sideloader imports snapshots from the migration bucket to your target database based on the matching schemas. The number of node snapshots that you uploaded to the migration bucket doesn’t determine the success of the import. The success of the import depends primarily on the validity of the schemas and the data in the snapshots.

  7. After the import, validate the migrated data to ensure that it matches the data in the origin cluster. For example, you can run Cassandra Data Migrator (CDM) in validation mode.

Multiple migrations to the same database

When you initialize a migration with Astra DB Sideloader, a unique migration ID is generated for that specific migration workflow. For each migration ID, there is a unique migration directory and migration directory credentials.

If you initialize multiple migrations for the same database, you generate multiple migration IDs, each with its own migration directory and credentials.

This can be useful for breaking large migrations into smaller batches. For example, if you have 100 snapshots, you could initialize 10 migrations, and then upload 10 different snapshots to each migration directory.

You can upload snapshots to multiple migration directories at once. However, when you reach the import phase of the migration, Astra DB Sideloader can import from only one migration directory at a time per database. For example, if you have 10 migration IDs for the same database, you must run 10 separate import actions. Each import must completely finish before starting the next import.
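
As an illustrative pattern only, you could script sequential imports from the administration server. In the following hypothetical sketch, the endpoint paths and the completion status value are placeholders, not real Astra DevOps API routes; substitute the actual calls from the import instructions in this guide:

    # Hypothetical sketch: trigger imports one at a time, polling each until
    # it finishes before starting the next.
    while read -r MIGRATION_ID; do
      curl -sS -X POST \
        "https://api.astra.datastax.com/v2/<launch-endpoint>/${MIGRATION_ID}" \
        -H "Authorization: Bearer ${ASTRA_DB_TOKEN}"

      # Wait for this import to complete before starting the next one.
      until curl -sS \
          "https://api.astra.datastax.com/v2/<status-endpoint>/${MIGRATION_ID}" \
          -H "Authorization: Bearer ${ASTRA_DB_TOKEN}" \
          | jq -e '.status == "<done-status>"' >/dev/null; do
        sleep 60
      done
    done < migration-ids.txt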

After all of the imports are complete, validate the migrated data in your target database to ensure that it matches the data in the origin cluster. For example, you can run Cassandra Data Migrator (CDM) in validation mode.
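
For example, a hedged sketch of a CDM validation run, assuming CDM 4.x class names (the jar version and the origin and target connection details in cdm.properties depend on your environment):

    # Illustrative CDM validation run: compares origin and target data and
    # reports mismatches without modifying either side.
    spark-submit --properties-file cdm.properties \
      --master "local[*]" \
      --class com.datastax.cdm.job.DiffData \
      cassandra-data-migrator-4.x.x.jar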
