Unloading data examples
Examples of unloading data using dsbulk.
Unloading is the process of using the
dsbulk unload
command to extract data
from a database table into a CSV or JSON file. Many dsbulk
options used in
loading and unloading are the same. Tip: The dsbulk command examples often show a parameter such
as
-url filename.csv
or -url
filename.json
. Optionally, you can load or unload data
from/to compressed CSV or JSON files. For details, refer to the --connector.(csv|json).compression option.Databases supported by DataStax Bulk Loader
DataStax Bulk Loader for Apache Cassandra® supports the use of the
dsbulk load
, dsbulk unload
, and dsbulk
count
commands with: - DataStax Astra cloud databases
- DataStax Enterprise (DSE) 4.7 and later databases
- Open source Apache Cassandra® 2.1 and later databases
dsbulk unload examples
Unload data to
stdout
from the ks1.table1 table in a cluster with a
localhost
contact point. Column names in the table map to field names in
the data. Field names must be emitted in a header row in the output:
dsbulk unload -k ks1 -t table1
Unload data to
stdout
from the ks1.table1 table and gzip the
result:dsbulk unload -k ks1 -t table1 | gzip > table1.gz
Unload data to a local directory (which may not yet
exist):
dsbulk unload -url ~/data-export -k ks1 -t table1
Unload data on a cluster with authentication and SSL enabled:
dsbulk unload -h '["fe80::f861:3eff:fe1d:9d7a"]' -u myaccount -p mypassword --driver.auth.provider DsePlainTextAuthProvider --driver.ssl.provider JDK --driver.ssl.keystore.path /etc/dse/keystores/client.keystore --driver.ssl.keystore.password mysslkspass --driver.ssl.truststore.path /etc/dse/keystores/client.truststore --driver.ssl.truststore.password myssltrustpass -k mykeyspace -t mytable -url mytable_backup
For more details, refer to SSL options.
Unload and count data from a materialized view named
health_data_mv
to a
local directory:
dsbulk unload count -k healthcare -t health_data_mv -url ~/export-mv
Note: In
DataStax Bulk Loader 1.3.2 and later, you can use
dsbulk unload count
and
specify a materialized view on the -t
or --table
option.
Previously, you could only use dsbulk unload count
with tables. For related
information, refer to Creating a
materialized view.Unload data using a search
index:
dsbulk unload -query "SELECT id, petal_length, petal_width, sepal_length, sepal_width, species FROM dsbulkblog.iris_with_id WHERE solr_query = '{\\\"q\\\": \\\"species:Iris-setosa\\\"}'" --executor.continuousPaging.enabled falsewhere a search index exists for the table iris_with_id. Continuous Paging must be disabled because DSE Search does not work with that feature. The
solr_query
requires
double-escaping of the double-quotes in the solr_query
string. Restriction: Faceted queries are not supported.
Unload data using a CQL query using function
calls:
dsbulk unload -query "SELECT id, species, writetime(species) AS TIMESTAMP, ttl(species) AS TTL FROM dsbulkblog.iris_with_id"