DataStax Bulk Loader
Use DataStax Bulk Loader (dsbulk) to load and unload data in CSV or JSON format with your DataStax Astra database efficiently and reliably.
You can use dsbulk as a standalone tool to connect remotely to a cluster.
The tool does not need to run locally on an instance, but it can be used in that configuration.
Prerequisites
- Download dsbulk.
- Unpack the distribution to your machine:
  tar -xzvf dsbulk-1.8.0.tar.gz
- Get your Client ID and Client Secret by creating your application token.
- Connect dsbulk to your Astra database by including the path to the secure connect bundle, along with the Client ID and Client Secret. Use the -b option to specify the location of the secure connect bundle. The specified location must be a path on the local filesystem or a valid URL.
If a secure connect bundle is specified, some other options are ignored and a warning is logged.
See the --driver.basic.cloud.secure-connect-bundle parameter for more information.
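The connection options repeat in every command on this page, so it can help to keep them in one place. The following is a minimal sketch, not part of dsbulk itself: the helper name astra_dsbulk_cmd and the variable names are hypothetical, the values are placeholders, and it assumes the Client ID and Client Secret are passed with the -u and -p options used in the examples. It only prints the assembled command so that you can review it before running it.

```shell
# Placeholder values (assumptions): replace with your own bundle path and token.
BUNDLE="path/to/secure-connect-database_name.zip"   # secure connect bundle
CLIENT_ID="database_user"                           # Client ID from the application token
CLIENT_SECRET="database_password"                   # Client Secret from the application token

# Hypothetical helper: print a dsbulk command with the shared Astra
# connection options appended. It does not run dsbulk.
astra_dsbulk_cmd() {
  operation="$1"   # load or unload
  shift
  echo dsbulk "$operation" "$@" -b "$BUNDLE" -u "$CLIENT_ID" -p "$CLIENT_SECRET"
}

# Show the command that would load export.csv into ks1.table1.
astra_dsbulk_cmd load -url export.csv -k ks1 -t table1 -header true
```

Because the helper only echoes the command, quoting is not preserved in the printed line; quote the bundle path yourself if it contains spaces.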
Loading data
Load CSV or JSON data with a dsbulk load command.
Load data from a local file
Load data from a local file export.csv with headers into keyspace ks1 and table table1:
dsbulk load -url export.csv -k ks1 -t table1 -b "path/to/secure-connect-database_name.zip" -u database_user -p database_password -header true
Specify an external data source
dsbulk load -url https://svr/data/export.csv -k ks1 -t table1 -b "path/to/secure-connect-database_name.zip" -u database_user -p database_password
Specify a file with URLs
Specify a file that contains a list of multiple, well-formed URLs for the CSV or JSON data files to load:
dsbulk load --connector.json.urlfile "my/local/multiple-input-data-urls.txt" -k ks1 -t table1 -b "path/to/secure-connect-database_name.zip" -u database_user -p database_password
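As a rough illustration, a URL file could be created as shown below. This sketch assumes the file lists one URL per line; the two URLs and the file name are hypothetical placeholders, not real data sources.

```shell
# Hypothetical file name; pass its path to --connector.json.urlfile.
URLFILE="multiple-input-data-urls.txt"

# Write one URL per line (assumed layout). The URLs are placeholders.
cat > "$URLFILE" <<'EOF'
https://svr/data/export1.json
https://svr/data/export2.json
EOF

# Print the number of entries as a quick sanity check.
wc -l < "$URLFILE"
```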
Load CSV data from stdin
Load CSV data from stdin as it is generated from a loading script generate_data.
The data is loaded to the keyspace ks1 and table table1.
If not specified, the field names are read from a header row in the input file.
generate_data | dsbulk load -url stdin:/ -k ks1 -t table1 -b "path/to/secure-connect-database_name.zip" -u database_user -p database_password
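For illustration, generate_data can be any program that writes CSV to standard output. The shell function below is a hypothetical stand-in; the column names id and name are assumptions and would need to match the columns of table1.

```shell
# Hypothetical stand-in for generate_data: writes a header row followed
# by three CSV records to standard output.
generate_data() {
  echo "id,name"            # header row; column names are illustrative
  i=1
  while [ "$i" -le 3 ]; do
    echo "$i,user$i"        # one record per line
    i=$((i + 1))
  done
}

# Preview the rows that would be piped into dsbulk.
generate_data
```

Piping this function into the dsbulk load command shown above would send four lines to dsbulk: the header row and three records.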
Unloading data
Use the dsbulk unload command to unload data from the specified keyspace and table to a CSV or JSON file.
Unload data example
Specify the keyspace ks1 and table table1 from which to unload the data to a CSV file:
dsbulk unload -url myData.csv -k ks1 -t table1 -b "path/to/secure-connect-database_name.zip" -u database_user -p database_password
The -url value can designate a path on the local filesystem or a valid URL.