Unloading data examples

Examples of unloading data using dsbulk.

Unloading is the process of using the dsbulk unload command to extract data from a database table into a CSV or JSON file. Many dsbulk options used in loading and unloading are the same.
Tip: The dsbulk command examples often show a parameter such as -url filename.csv or -url filename.json. Optionally, you can load or unload data from/to compressed CSV or JSON files. For details, refer to the --connector.(csv|json).compression option.

Databases supported by DataStax Bulk Loader

DataStax Bulk Loader for Apache Cassandra® supports the use of the dsbulk load, dsbulk unload, and dsbulk count commands with:
  • DataStax Astra cloud databases
  • DataStax Enterprise (DSE) 4.7 and later databases
  • Open source Apache Cassandra® 2.1 and later databases

dsbulk unload examples

Unload data to stdout from the ks1.table1 table in a cluster with a localhost contact point. Column names in the table map to field names in the data. Field names must be emitted in a header row in the output:
dsbulk unload -k ks1 -t table1
Unload data to stdout from the ks1.table1 table and gzip the result:
dsbulk unload -k ks1 -t table1 | gzip > table1.gz
Unload data to a local directory (which may not yet exist):
dsbulk unload -url ~/data-export -k ks1 -t table1
Unload data on a cluster with authentication and SSL enabled:
dsbulk unload -h '["fe80::f861:3eff:fe1d:9d7a"]' -u myaccount -p mypassword 
      --driver.auth.provider DsePlainTextAuthProvider 
      --driver.ssl.provider JDK 
      --driver.ssl.keystore.path /etc/dse/keystores/client.keystore 
      --driver.ssl.keystore.password mysslkspass 
      --driver.ssl.truststore.path /etc/dse/keystores/client.truststore 
      --driver.ssl.truststore.password myssltrustpass 
      -k mykeyspace -t mytable -url mytable_backup

For more details, refer to SSL options.

Unload and count data from a materialized view named health_data_mv to a local directory:
dsbulk unload count -k healthcare -t health_data_mv -url ~/export-mv
Note: In DataStax Bulk Loader 1.3.2 and later, you can use dsbulk unload count and specify a materialized view on the -t or --table option. Previously, you could only use dsbulk unload count with tables. For related information, refer to Creating a materialized view.
Unload data using a search index:
dsbulk unload -query 
  "SELECT id, petal_length, petal_width, sepal_length, sepal_width, species 
    FROM dsbulkblog.iris_with_id 
    WHERE solr_query = '{\\\"q\\\": \\\"species:Iris-setosa\\\"}'"
 --executor.continuousPaging.enabled false
where a search index exists for the table iris_with_id. Continuous Paging must be disabled because DSE Search does not work with that feature. The solr_query requires double-escaping of the double-quotes in the solr_query string.
Restriction: Faceted queries are not supported.
Unload data using a CQL query using function calls:
dsbulk unload -query 
  "SELECT id, species, writetime(species) AS TIMESTAMP, ttl(species) AS TTL FROM dsbulkblog.iris_with_id"