Unloading data examples
Unloading is the process of using the dsbulk unload
command to extract data from a database table into a CSV or JSON file.
Many dsbulk
options used in loading and unloading are the same.
The dsbulk command examples often show a parameter such as |
Databases supported by DataStax Bulk Loader
DataStax Bulk Loader® supports the use of the dsbulk load
, dsbulk unload
, and dsbulk count
commands with:
-
Hyper-Converged Database (HCD) 1.0 databases
-
DataStax Enterprise (DSE) 5.1, 6.8, and 6.9 databases
-
Open source Apache Cassandra® 2.1 and later databases
dsbulk unload examples
Unload data to stdout
from the ks1.table1 table in a cluster with a localhost
contact point.
Column names in the table map to field names in the data.
Field names must be emitted in a header row in the output:
dsbulk unload -k ks1 -t table1
Unload data to stdout
from the ks1.table1 table and gzip the result:
dsbulk unload -k ks1 -t table1 | gzip > table1.gz
Unload data to a local directory (which may not yet exist):
dsbulk unload -url ~/data-export -k ks1 -t table1
Unload data on a cluster with authentication and SSL enabled:
dsbulk unload -h '["fe80::f861:3eff:fe1d:9d7a"]' -u myaccount -p mypassword
--driver.auth.provider DsePlainTextAuthProvider
--driver.ssl.provider JDK
--driver.ssl.keystore.path /etc/dse/keystores/client.keystore
--driver.ssl.keystore.password mysslkspass
--driver.ssl.truststore.path /etc/dse/keystores/client.truststore
--driver.ssl.truststore.password myssltrustpass
-k mykeyspace -t mytable -url mytable_backup
For more details, refer to SSL options.
Unload and count data from a materialized view named health_data_mv
to a local directory:
dsbulk unload count -k healthcare -t health_data_mv -url ~/export-mv
In DataStax Bulk Loader 1.3.2 and later, you can use |
Unload data using a search index:
dsbulk unload -query
"SELECT id, petal_length, petal_width, sepal_length, sepal_width, species
FROM dsbulkblog.iris_with_id
WHERE solr_query = '{\\\"q\\\": \\\"species:Iris-setosa\\\"}'"
--executor.continuousPaging.enabled false
where a search index exists for the table iris_with_id.
Continuous Paging must be disabled because DSE Search does not work with that feature.
The solr_query
requires double-escaping of the double-quotes in the solr_query
string.
Faceted queries are not supported. |
Unload data using a CQL query using function calls:
dsbulk unload -query
"SELECT id, species, writetime(species) AS TIMESTAMP, ttl(species) AS TTL FROM dsbulkblog.iris_with_id"