Unload data

Unloading is the process of using the dsbulk unload command to extract data from a database table into a CSV or JSON file. The load and unload commands share many of the same options.

When unloading to a CSV file, column names in the table map to column names in the CSV data, and they are emitted in a header row in the output. When unloading to a JSON file, column names in the table map to field names in each JSON record.
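For example, a table row with columns id and species might be rendered in each format as follows (illustrative values):

```
# CSV output: column names in a header row, one line per row
id,species
1,Iris-setosa

# JSON output: column names as field names in each record
{"id": 1, "species": "Iris-setosa"}
```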

Unload locally without authentication

At minimum, you must specify the keyspace (-k, --schema.keyspace) and table (-t, --schema.table) from which to unload the data to a CSV or JSON file. The minimal command unloads data to stdout from a local cluster (localhost contact point) without authentication:

dsbulk unload -k ks1 -t table1

Unload with authentication

Pass the relevant options with your unload commands if your cluster requires authentication or uses SSL encryption.

You can pass the values directly or use a configuration file.
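For example, shared settings can be kept in a HOCON configuration file and passed with -f instead of being repeated on every command line. The file name and all values below are placeholders:

```shell
# Create a settings file; DSBulk reads HOCON settings under the "dsbulk" section.
# All names and values here are illustrative placeholders.
cat > dsbulk.conf <<'EOF'
dsbulk {
  schema.keyspace = ks1
  schema.table = table1
}
EOF

# The file is then referenced with -f (shown as a comment; requires a running cluster):
# dsbulk unload -f dsbulk.conf
```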

DSE, HCD, and Cassandra

If your database requires authentication, provide the username and password using the -u and -p options, respectively:

dsbulk unload -k ks1 -t table1 \
-u username -p password

If the cluster is remote, set the host and contact points in the driver options.

If your cluster has both authentication and SSL enabled, pass -u, -p, and the SSL options. For example:

dsbulk unload -h '["fe80::f861:3eff:fe1d:9d7a"]' -u myaccount -p mypassword \
      --driver.auth.provider DsePlainTextAuthProvider \
      --driver.ssl.provider JDK \
      --driver.ssl.keystore.path /etc/dse/keystores/client.keystore \
      --driver.ssl.keystore.password mysslkspass \
      --driver.ssl.truststore.path /etc/dse/keystores/client.truststore \
      --driver.ssl.truststore.password myssltrustpass \
      -k mykeyspace -t mytable
Astra DB

Pass the path to your database's Secure Connect Bundle and an application token:

dsbulk unload -k ks1 -t table1 \
-b "path/to/SCB.zip" -u token -p AstraCS:...

For Astra DB, the expected connection credentials are:

  • -b: Provide the path to the database’s Secure Connect Bundle (SCB) zip file. The SCB includes certificates and key files for SSL-encrypted connections as well as information about the database’s contact points.

  • -u: Set to the literal string token.

  • -p: Provide an application token. DataStax recommends using secure references to tokens, such as environment variables, rather than specifying them directly on the command line.
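As a sketch of the environment-variable approach (ASTRA_DB_TOKEN is an assumed name, and the value is a placeholder, not a real token):

```shell
# Keep the token off the command line by exporting it first.
# The variable name and value are illustrative placeholders.
export ASTRA_DB_TOKEN="AstraCS:placeholder"

# DSBulk then reads it from the environment (shown as a comment; requires a cluster):
# dsbulk unload -k ks1 -t table1 -b "path/to/SCB.zip" -u token -p "$ASTRA_DB_TOKEN"
```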

Set the unload destination

Use the -url option to specify the path and filename where you want to unload the data. This can be a path on the local file system or a valid remote URL. You can specify a filename only, a path only, or both. If you omit -url, the data is written to stdout.

This example unloads data to a local CSV file named myData.csv. The file is created if it does not already exist. If it does exist, then it is overwritten.

dsbulk unload -url myData.csv -k ks1 -t table1

If the path includes directories that don’t exist, the command creates the necessary directories:

dsbulk unload -url ~/data-export/myData.csv -k ks1 -t table1

Unload to a compressed file

There are two ways to unload compressed files:

  • For supported compression formats, use the --connector.csv.compression or --connector.json.compression option to have DSBulk write the compressed file directly.

  • Pipe the unload command to your preferred compressor and filename:

    dsbulk unload -k ks1 -t table1 | gzip > table1.gz
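The pipe in the second approach is ordinary shell plumbing, so it can be verified without a cluster by substituting any stream for the dsbulk output. Here printf stands in for dsbulk unload:

```shell
# printf stands in for `dsbulk unload` so the compressed round trip can be checked.
printf 'id,species\n1,Iris-setosa\n' | gzip > table1.gz

# Decompress to confirm the data survived intact.
gunzip -c table1.gz
```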

Unload data matching a CQL query

You can use the -query option to unload only data matching a given SELECT statement.

Faceted queries are not supported.

  • Unload data using a CQL query with a WHERE clause:

    dsbulk unload -query \
      "SELECT id, petal_length, petal_width, sepal_length, sepal_width, species
        FROM dsbulkblog.iris_with_id
        WHERE species='Iris-setosa' AND token(id) > :start AND token(id) <= :end"
  • Unload data using a CQL query with function calls:

    dsbulk unload -query \
      "SELECT id, species, writetime(species) AS TIMESTAMP, ttl(species) AS TTL FROM dsbulkblog.iris_with_id"
  • Unload data with a secondary index query:

    If a secondary index exists for a table, you can use a SELECT statement to unload data based on a given query on the secondary index.

    The following example uses solr_query. With DSE Search, you must disable continuous paging because DSE Search does not support that feature. The double-quotes inside the solr_query JSON string must be double-escaped, as shown.

    dsbulk unload -query \
      "SELECT id, petal_length, petal_width, sepal_length, sepal_width, species
        FROM dsbulkblog.iris_with_id
        WHERE solr_query = '{\\\"q\\\": \\\"species:Iris-setosa\\\"}'" \
      --executor.continuousPaging.enabled false
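The triple escaping works because the shell consumes one level of backslashes, leaving \" inside the single-quoted CQL string. This can be checked without a cluster:

```shell
# The shell strips one escape level: \\\" becomes \" in the string DSBulk receives.
escaped="WHERE solr_query = '{\\\"q\\\": \\\"species:Iris-setosa\\\"}'"

# prints: WHERE solr_query = '{\"q\": \"species:Iris-setosa\"}'
printf '%s\n' "$escaped"
```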

Unload a materialized view (MV)

Specify the materialized view name in the -t (--schema.table) parameter.

The following commands unload, and then count, data from a materialized view:

dsbulk unload -k healthcare -t health_data_mv -url ~/export-mv

dsbulk count -k healthcare -t health_data_mv

Unload graph data from a vertex table to a CSV file

For DSE Graph tables, pass graph and vertex options with your unload command.

The following example unloads graph data from graph graph1 and table vertex_label1 to a CSV file:

dsbulk unload --schema.graph graph1 --schema.vertex vertex_label1 \
-delim '|' > unloaded_person_data.csv
