About Astra DB Sideloader

Astra DB Sideloader is a service running in Astra DB that directly imports data from snapshot backups that you’ve uploaded to Astra DB from an existing Apache Cassandra®, DataStax Enterprise (DSE), or Hyper-Converged Database (HCD) cluster.

Because it imports data directly, Astra DB Sideloader can offer several advantages over CQL-based tools like DataStax Bulk Loader (DSBulk) and Cassandra Data Migrator (CDM), including faster, more cost-effective data loading, and minimal performance impacts on your origin cluster and target database.

Astra DB Sideloader concepts

Origin, origin cluster

In the context of Astra DB Sideloader, this refers to your existing Cassandra, DSE, or HCD cluster.

Target, target database

In the context of Astra DB Sideloader, this refers to the Astra DB Serverless database to which you migrate your data.

Administration server

A server where you run the migration commands, including CLI commands and Astra DevOps API calls. It must have SSH access to each node in your origin cluster.

Migration

A workflow that you initiate within Astra DB Sideloader that encompasses the lifecycle of uploading and importing snapshot backups of a specific set of keyspaces or CQL tables.

This process produces artifacts and parameters including migration buckets, migration IDs, migration directories, and upload credentials. You use these components throughout the migration workflow.

The Astra DB Sideloader process

Transferring data with Astra DB Sideloader is a multi-phase process. Before you use Astra DB Sideloader, learn about the events, outcomes, warnings, and requirements of each phase:

Prepare your infrastructure

Astra DB Sideloader has requirements that you must review before you start a migration. You must also prepare your target database, origin cluster, and administration server before you begin.

For more information, see Prepare to use Astra DB Sideloader.

Create snapshot backups

Astra DB Sideloader uses snapshot backup files to import SSTable data from your existing origin cluster. This is an ideal approach for database migrations because creating a snapshot has negligible performance impact on the origin cluster, and it preserves metadata like write timestamps and expiration times (TTLs).

Each snapshot for each node in the origin cluster must include all the keyspaces and individual CQL tables that you want to migrate.
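As a minimal sketch of this step, the following Python helpers build a `nodetool snapshot` command for a keyspace or a set of tables, and wrap it in an `ssh` call for each node so that it can be launched from the administration server. The snapshot tag, host names, and function names are hypothetical. `-t` and `--kt-list` are standard `nodetool snapshot` options, but verify them against your Cassandra, DSE, or HCD version before use.

```python
import shlex
from typing import Dict, List, Optional, Sequence


def snapshot_command(tag: str, keyspace: str,
                     tables: Optional[Sequence[str]] = None) -> List[str]:
    """Build a `nodetool snapshot` command for a keyspace or for specific tables."""
    if tables:
        # --kt-list takes comma-separated keyspace.table names.
        kt_list = ",".join(f"{keyspace}.{table}" for table in tables)
        return ["nodetool", "snapshot", "-t", tag, "--kt-list", kt_list]
    # Without a table list, snapshot the whole keyspace.
    return ["nodetool", "snapshot", "-t", tag, keyspace]


def per_node_commands(nodes: Sequence[str], tag: str, keyspace: str,
                      tables: Optional[Sequence[str]] = None) -> Dict[str, List[str]]:
    """Wrap the same snapshot command in ssh for every origin node.

    Using one tag on every node keeps the snapshot name consistent
    across the cluster (an assumption made for this sketch).
    """
    remote = shlex.join(snapshot_command(tag, keyspace, tables))
    return {node: ["ssh", node, remote] for node in nodes}
```

Each returned command list can be passed to `subprocess.run` on the administration server. The sketch only constructs the commands; it does not execute them.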

Prepare the target database

Because snapshots don’t store schema definitions, you must pre-configure the schema in your target Astra DB database before you import any data.

For the migration to succeed, the schema in your target database must align with the schema in the origin cluster. However, you might need to modify your schema or data model to be compatible with Astra DB.

For specific requirements and more information, see Migrate data with Astra DB Sideloader: Configure the target database.
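To illustrate what a schema alignment check might look like, here is a small Python sketch that compares two schema descriptions and lists tables or columns that are missing or typed differently in the target. The dictionary shape (table name mapped to column names and CQL types) and the function name are assumptions made for this example. The sketch does not encode the Astra DB compatibility rules, which are described in the linked guide.

```python
from typing import Dict, List

# Hypothetical representation: table name -> {column name: CQL type}
Schema = Dict[str, Dict[str, str]]


def schema_differences(origin: Schema, target: Schema) -> List[str]:
    """List origin tables and columns that the target lacks or types differently."""
    problems: List[str] = []
    for table, columns in sorted(origin.items()):
        if table not in target:
            problems.append(f"missing table: {table}")
            continue
        for column, cql_type in sorted(columns.items()):
            target_type = target[table].get(column)
            if target_type is None:
                problems.append(f"{table}: missing column {column}")
            elif target_type != cql_type:
                problems.append(
                    f"{table}.{column}: target type {target_type}, origin type {cql_type}"
                )
    return problems
```

An empty result means that every origin table and column has a same-typed counterpart in the target description. It does not prove that the schema is compatible with Astra DB.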

Initialize a migration

After you create snapshots on the origin cluster and pre-configure the schema on the target database, use the Astra DevOps API to initialize the migration.

[Figure: data importer workflow]

When you initialize a migration, Astra DB Sideloader does the following:

  1. Creates a secure migration bucket.

    The migration bucket is created only during the first initialization. All subsequent migrations use different directories in the same migration bucket.

    DataStax owns the migration bucket, and it is located within the Astra perimeter.

  2. Generates a migration ID that is unique to the new migration.

  3. Creates a migration directory within the migration bucket that is unique to the new migration.

    The migration directory is also referred to as the uploadBucketDir. In the next phase of the migration process, you will upload your snapshots to this migration directory.

  4. Generates upload credentials that grant read/write access to the migration directory.

    The credentials are formatted according to the cloud provider where your target database is deployed.

For instructions and more information, see Migrate data with Astra DB Sideloader: Initialize the migration.
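The four initialization steps above can be pictured with a toy in-memory model. This is not the DevOps API and does not contact Astra DB: the class names, bucket naming, directory layout, and credential fields are all hypothetical, chosen only to show that the bucket is created once while each migration receives its own ID, directory, and credentials.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Migration:
    migration_id: str
    upload_bucket_dir: str               # the migration directory
    upload_credentials: Dict[str, str]   # placeholder, not a real credential format


@dataclass
class SideloaderModel:
    """Toy model of initialization for one target database (illustration only)."""
    database_id: str
    bucket: Optional[str] = None
    migrations: List[Migration] = field(default_factory=list)

    def initialize(self) -> Migration:
        # Step 1: create the migration bucket only on the first initialization.
        if self.bucket is None:
            self.bucket = f"migration-bucket-{self.database_id}"
        # Step 2: generate an ID that is unique to this migration.
        migration_id = str(uuid.uuid4())
        # Step 3: create a directory for this migration inside the shared bucket.
        upload_bucket_dir = f"{self.bucket}/{migration_id}/"
        # Step 4: issue credentials scoped to that directory only.
        credentials = {"scope": upload_bucket_dir, "access": "read/write"}
        migration = Migration(migration_id, upload_bucket_dir, credentials)
        self.migrations.append(migration)
        return migration
```

Calling `initialize()` twice on the same model returns two migrations whose directories differ but share the same bucket prefix, which mirrors the behavior described in step 1.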

Upload snapshots

When initialization is complete, use your cloud provider’s CLI to upload your snapshots to the migration directory.

To upload snapshots directly from the origin cluster, you must install your cloud provider’s CLI on each node in the origin cluster. While it is possible to orchestrate this process through a staging server, the commands given in this documentation assume you are uploading snapshots directly from the origin cluster.
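As a rough sketch of the upload step for a single node, the following Python code locates snapshot directories in the standard Cassandra data layout (`<data_dir>/<keyspace>/<table>-<id>/snapshots/<snapshot_name>/`) and builds one `aws s3 sync` command per directory. It assumes an AWS-hosted target database. The function names and the destination layout (a per-node subdirectory inside the migration directory) are assumptions made for illustration, so follow the migration guide for the exact commands for your cloud provider.

```python
from pathlib import Path
from typing import List


def snapshot_dirs(data_dir: Path, snapshot_name: str) -> List[Path]:
    """Find every directory belonging to the named snapshot under data_dir."""
    pattern = f"*/*/snapshots/{snapshot_name}"
    return sorted(p for p in data_dir.glob(pattern) if p.is_dir())


def upload_commands(data_dir: Path, snapshot_name: str,
                    migration_dir: str, node_name: str) -> List[List[str]]:
    """Build one `aws s3 sync` command per snapshot directory on this node."""
    commands = []
    for src in snapshot_dirs(data_dir, snapshot_name):
        # Keep the keyspace/table/snapshots path under a per-node prefix
        # (hypothetical layout) so uploads from different nodes don't collide.
        relative = src.relative_to(data_dir).as_posix()
        dest = f"{migration_dir.rstrip('/')}/{node_name}/{relative}/"
        commands.append(["aws", "s3", "sync", str(src), dest])
    return commands
```

The upload credentials generated during initialization would need to be available to the AWS CLI, for example through environment variables, before you run the resulting commands.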

The time required to upload the snapshots depends on the size of your dataset and the network throughput between the origin cluster and the migration bucket:

| Speed | Migration type | Description |
|---|---|---|
| Fastest | Intra-datacenter | All else equal, snapshots take the least time to upload when the origin cluster is in the same cloud provider and region as the target database. |
| Fast | Cross-datacenter, co-located | Uploads are slower by default when they must exit the local datacenter. The delay increases relative to the physical distance between the datacenters. For example, all else equal, uploading from AWS us-east-1 (Dulles, VA, USA) to AWS ca-central-1 (Montréal, QC, Canada) is faster than uploading from us-east-1 to us-west-2 (The Dalles, OR, USA) because Oregon is significantly farther from Virginia than Montréal is. |
| Variable | Cross-provider, co-located | If the target database is in a different cloud provider than the origin cluster, the upload can be slower as the data passes from one provider’s infrastructure to another. This is considered a cross-datacenter transfer, and the delay increases relative to the physical distance between the datacenters. |
| Slowest | Transoceanic | The slowest uploads happen when the data must travel over transoceanic cables. If the data must also change cloud providers, there can be additional delays. In this case, consider creating your target database in a co-located datacenter, and then deploy your database to other regions after the migration. |
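For a back-of-the-envelope estimate of upload time, you can divide the dataset size by the sustained throughput. The following Python sketch assumes decimal units (1 GB = 8,000 megabits) and a constant throughput, and it ignores protocol overhead, parallel uploads from multiple nodes, and throttling. The function name is hypothetical.

```python
def estimated_upload_hours(dataset_gb: float, throughput_mbps: float) -> float:
    """Estimate upload time in hours.

    dataset_gb: snapshot size in gigabytes (10**9 bytes).
    throughput_mbps: sustained throughput in megabits per second.
    """
    if throughput_mbps <= 0:
        raise ValueError("throughput_mbps must be positive")
    megabits = dataset_gb * 8 * 1000   # 1 GB = 8,000 megabits
    seconds = megabits / throughput_mbps
    return seconds / 3600
```

For example, 450 GB at a sustained 1,000 Mbps works out to 3,600 seconds, or 1 hour. The same data at 100 Mbps takes 10 hours. If every node uploads its own snapshot in parallel, apply the estimate per node rather than to the whole dataset.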

Import data

After uploading the snapshots to the migration directory, use the DevOps API to start the data import process.

During the import process, Astra DB Sideloader does the following:

  1. Revokes access to the migration directory.

    You cannot read or write to the migration directory after starting the data import process.

  2. Discovers all uploaded SSTables in the migration directory, and then groups them into subsets of approximately equal size.

  3. Runs validation checks on each subset.

  4. Converts all SSTables of each subset.

  5. Disables new compactions on the target database.

    This is the last point at which you can abort the migration.

    Once Astra DB Sideloader begins to import SSTable metadata (the next step), you cannot stop the migration.

  6. Imports metadata from each SSTable.

    If the dataset contains tombstones, any read operations on the target database can return inconsistent results during this step. Since compaction is disabled, there is no risk of permanent inconsistencies. However, in the context of Zero Downtime Migration, it’s important that the ZDM proxy continues to read from the origin cluster.

  7. Re-enables compactions on the Astra DB Serverless database.

Each step must finish successfully. If one step fails, the import operation stops and no data is imported into your target database.

If all steps finish successfully, the migration is complete and you can access the imported data in your target database.
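The fail-fast behavior and the abort window described above can be summarized in a short Python sketch. It models the seven steps as callables that raise an exception on failure. The names are hypothetical and the code does not interact with Astra DB.

```python
from typing import Callable, List, Tuple

# Step names taken from the import process described above.
STEPS = [
    "revoke access to the migration directory",
    "discover and group SSTables",
    "run validation checks",
    "convert SSTables",
    "disable new compactions",
    "import SSTable metadata",
    "re-enable compactions",
]

# Step 5 is the last point at which the migration can be aborted.
LAST_ABORTABLE_STEP = 5


def can_abort(current_step: int) -> bool:
    """Return True if a migration at this 1-based step can still be aborted."""
    return current_step <= LAST_ABORTABLE_STEP


def run_import(actions: List[Callable[[], None]]) -> Tuple[bool, int]:
    """Run the steps in order and stop at the first failure.

    Returns (succeeded, number_of_steps_completed).
    """
    completed = 0
    for action in actions:
        try:
            action()
        except Exception:
            return False, completed
        completed += 1
    return True, completed
```

In this model, `can_abort(5)` is true and `can_abort(6)` is false, which matches the rule that the migration cannot be stopped once the metadata import begins.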

For instructions and more information, see Migrate data with Astra DB Sideloader: Import data.
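Step 2 of the import process groups SSTables into subsets of roughly equal size. The documentation does not say how Astra DB Sideloader does this. As an illustration only, the following Python sketch uses a common greedy heuristic: sort files from largest to smallest, then always add the next file to the subset with the smallest running total. The function name and inputs are hypothetical.

```python
import heapq
from typing import Dict, List


def group_sstables(sizes: Dict[str, int], subset_count: int) -> List[List[str]]:
    """Greedily partition SSTables into subsets with similar total sizes.

    sizes: SSTable name -> size in bytes.
    subset_count: number of subsets to produce.
    """
    if subset_count < 1:
        raise ValueError("subset_count must be at least 1")
    # Min-heap of (running total, subset index); the smallest subset is popped first.
    heap = [(0, index) for index in range(subset_count)]
    subsets: List[List[str]] = [[] for _ in range(subset_count)]
    # Largest files first; ties are broken by name for deterministic output.
    for name, size in sorted(sizes.items(), key=lambda item: (-item[1], item[0])):
        total, index = heapq.heappop(heap)
        subsets[index].append(name)
        heapq.heappush(heap, (total + size, index))
    return subsets
```

With sizes `{"a": 5, "b": 4, "c": 3, "d": 2}` and two subsets, this produces `[["a", "d"], ["b", "c"]]`, where each subset totals 7.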

Validate imported data

After the migration is complete, you can query the migrated data using the CQL shell or Data API.

You can run Cassandra Data Migrator (CDM) in validation mode for more thorough validation. CDM also offers an AutoCorrect mode to reconcile any differences that it detects.
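Conceptually, validation compares rows on the origin and target by primary key and reports rows that are missing or different. The following Python sketch shows that idea on in-memory dictionaries. It is not how CDM is implemented or invoked, and the function name and data shapes are assumptions made for this example.

```python
from typing import Any, Dict, List, Tuple

Key = Tuple[Any, ...]   # primary key values
Row = Dict[str, Any]    # column name -> value


def compare_rows(origin: Dict[Key, Row],
                 target: Dict[Key, Row]) -> Dict[str, List[Key]]:
    """Report origin rows that are missing from, or differ in, the target."""
    missing = sorted(key for key in origin if key not in target)
    mismatched = sorted(
        key for key in origin if key in target and origin[key] != target[key]
    )
    return {"missing": missing, "mismatched": mismatched}
```

A reconciliation pass in the spirit of an auto-correct mode would then rewrite the reported rows on the target using the origin values.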

Use Astra DB Sideloader with ZDM

If you need to migrate a live database, you can use Astra DB Sideloader instead of DSBulk or Cassandra Data Migrator during Phase 2 of Zero Downtime Migration (ZDM).

[Figure: Use Astra DB Sideloader in the context of Zero Downtime Migration.]

