Loading data without a configuration file
The dsbulk command examples often show a parameter such as |
Load CSV or JSON data with a dsbulk load command.
To load data into a cloud-based DataStax Astra DB database, specify the path to the secure connect bundle ZIP file.
It contains the security certificates and credentials for your database.
Also specify the username and password entered when creating the database.
For information about downloading the secure connect bundle ZIP via the Astra Portal, in advance of entering the |
Load data from a local file
Load data from a local file export.csv with headers into keyspace ks1 and table table1:
DataStax Astra databases
dsbulk load -url export.csv -k ks1 -t table1 \
-b "path/to/secure-connect-database_name.zip" -u database_user -p database_password -header true
HCD / DSE / open source Cassandra databases
This dsbulk example shows how you can load a previously exported CSV data file into an HCD, DSE, or Cassandra database:
dsbulk load -url export.csv -k ks1 -t table1 -h '10.200.1.3, 10.200.1.4' -header true
The url value can designate the path to a resource, such as a local file, or a web URL from which to read/write data.
Specify an external data source
DataStax Astra databases
dsbulk load -url https://svr/data/export.csv -k ks1 -t table1 \
-b "path/to/secure-connect-database_name.zip" -u database_user -p database_password
HCD / DSE / open source Cassandra databases
When loading data into an HCD, DSE, or Cassandra database, you can specify the port for the cluster hosts:
dsbulk load -url https://svr/data/export.csv -k ks1 -t table1 -h '10.200.1.3, 10.200.1.4' -port 9876
Specify a file with URLs
Specify a file that contains a list of multiple, well-formed URLs for the CSV or JSON data files to load:
DataStax Astra databases
dsbulk load --connector.json.urlfile "my/local/multiple-input-data-urls.txt" -k ks1 -t table1 \
-b "path/to/secure-connect-database_name.zip" -u database_user -p database_password
HCD / DSE / open source Cassandra databases
dsbulk load --connector.json.urlfile "my/local/multiple-input-data-urls.txt" -k ks1 -t table1 -h '10.200.1.3'
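The following is a minimal sketch of how such a URL file might be prepared, assuming the file holds one URL per line. The host svr and the file names export-1.json through export-3.json are placeholders for illustration, not real endpoints.

```shell
# Create a hypothetical URL list for the urlfile option.
# Assumption: one well-formed URL per line.
mkdir -p my/local
cat > my/local/multiple-input-data-urls.txt <<'EOF'
https://svr/data/export-1.json
https://svr/data/export-2.json
https://svr/data/export-3.json
EOF

# List the entries before passing the file to dsbulk.
cat my/local/multiple-input-data-urls.txt
```

Checking the file contents first makes it easier to spot a malformed entry before starting a load.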
Load CSV data from stdin
Load CSV data from stdin as it is generated from a loading script generate_data.
The data is loaded to the keyspace ks1 and table table1.
If no field mapping is specified, the field names are read from the header row of the input.
DataStax Astra databases
generate_data | dsbulk load -url stdin:/ -k ks1 -t table1 \
-b "path/to/secure-connect-database_name.zip" -u database_user -p database_password
HCD / DSE / open source Cassandra databases
generate_data | dsbulk load -url stdin:/ -k ks1 -t table1
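As an illustration only, generate_data could be any program that writes CSV to standard output. The sketch below defines a hypothetical shell function of that name; the column names id and name and the three sample rows are assumptions, not part of the original example.

```shell
# Hypothetical stand-in for the generate_data script: writes a header row
# followed by three CSV records to stdout.
# The column names (id, name) are illustrative assumptions.
generate_data() {
  printf 'id,name\n'
  i=1
  while [ "$i" -le 3 ]; do
    printf '%d,user%d\n' "$i" "$i"
    i=$((i + 1))
  done
}

# Preview the stream. To load it, pipe it into dsbulk instead, for example:
#   generate_data | dsbulk load -url stdin:/ -k ks1 -t table1
generate_data
```

Because the first line of the stream is a header row, the field names can be taken from it when no mapping is given.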
Load CSV data from a CSV file to a graph vertex label table
Load CSV data from person.csv. The data is loaded into the graph graph1 and the vertex label table vertex_label1.
HCD / DSE databases
dsbulk load -url data/vertices/person.csv -g graph1 -v vertex_label1 \
-delim '|' -header true --schema.allowMissingFields true
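To make the -delim '|' and -header true options concrete, here is a sketch of what a pipe-delimited input file could look like. The field names person_id, name, and age and the sample rows are hypothetical; the real columns depend on how vertex_label1 is defined.

```shell
# Create a hypothetical data/vertices/person.csv: a header row followed by
# pipe-delimited records. Field names and values are illustrative assumptions.
mkdir -p data/vertices
cat > data/vertices/person.csv <<'EOF'
person_id|name|age
1|Alice|34
2|Bob|41
EOF

# Print the field names found in the header row, one per line.
head -n 1 data/vertices/person.csv | tr '|' '\n'
```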