Loading data without a configuration file
Load CSV or JSON data with the dsbulk load command.
Load data from a local file
The following examples load data from a local file, export.csv, with headers into keyspace ks1 and table table1.
- Astra DB
- HCD, DSE, and Cassandra
To load data into a cloud-based Astra DB database, specify the path to the secure connect bundle ZIP file.
It contains the security certificates and credentials for your database.
Also specify the username and password entered when creating the database.
Download the secure connect bundle ZIP from the Astra Portal before you run the dsbulk command. For more information, see Manage application tokens in the Astra DB documentation.
dsbulk load -url export.csv -k ks1 -t table1 \
-b "path/to/secure-connect-database_name.zip" -u database_user -p database_password -header true
The following dsbulk example loads a previously exported CSV data file into an HCD, DSE, or Cassandra database:
dsbulk load -url export.csv -k ks1 -t table1 -h '10.200.1.3, 10.200.1.4' -header true
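Before running either command above, it can help to confirm that export.csv starts with a header row, because -header true tells dsbulk to take field names from the first line. The following is a minimal sketch only: the column names id and name are hypothetical and stand in for whatever columns ks1.table1 actually has.

```shell
# Hypothetical sample file; replace id and name with the real columns of ks1.table1.
cat > export.csv <<'EOF'
id,name
1,alice
2,bob
EOF

# Show the header row that dsbulk would use for field names.
head -n 1 export.csv

# Count data rows (total lines minus the header line).
data_rows=$(( $(wc -l < export.csv) - 1 ))
echo "data rows: $data_rows"
```

If the first line is a data row rather than a header, the field names would not match the table columns as intended.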
Specify an external data source
The url value can designate the path to a resource, such as a local file, or a web URL from which to read or write data.
- Astra DB
- HCD, DSE, and Cassandra
To load data into a cloud-based Astra DB database, specify the path to the secure connect bundle ZIP file.
It contains the security certificates and credentials for your database.
Also specify the username and password entered when creating the database.
Download the secure connect bundle ZIP from the Astra Portal before you run the dsbulk command. For more information, see Manage application tokens in the Astra DB documentation.
dsbulk load -url https://svr/data/export.csv -k ks1 -t table1 \
-b "path/to/secure-connect-database_name.zip" -u database_user -p database_password
When loading data into an HCD, DSE, or Cassandra database, you can specify a port for the cluster hosts:
dsbulk load -url https://svr/data/export.csv -k ks1 -t table1 -h '10.200.1.3, 10.200.1.4' -port 9876
Specify a file with URLs
Specify a file that contains a list of multiple, well-formed URLs for the CSV or JSON data files to load.
- Astra DB
- HCD, DSE, and Cassandra
To load data into a cloud-based Astra DB database, specify the path to the secure connect bundle ZIP file.
It contains the security certificates and credentials for your database.
Also specify the username and password entered when creating the database.
Download the secure connect bundle ZIP from the Astra Portal before you run the dsbulk command. For more information, see Manage application tokens in the Astra DB documentation.
dsbulk load --connector.json.urlfile "my/local/multiple-input-data-urls.txt" -k ks1 -t table1 \
-b "path/to/secure-connect-database_name.zip" -u database_user -p database_password
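For an HCD, DSE, or Cassandra database, specify a contact point with -h instead of the secure connect bundle:
dsbulk load --connector.json.urlfile "my/local/multiple-input-data-urls.txt" -k ks1 -t table1 -h '10.200.1.3'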
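The URL list file itself is not shown above. The following sketch builds one under the assumption that the file is plain UTF-8 text with one URL per line. The two URLs are hypothetical placeholders, not real endpoints.

```shell
# Create the directory used in the examples above.
mkdir -p my/local

# Write one (hypothetical) URL per line.
printf '%s\n' \
  'https://svr/data/export1.json' \
  'https://svr/data/export2.json' \
  > my/local/multiple-input-data-urls.txt

# Count the non-empty lines to confirm how many resources are listed.
url_count=$(grep -c . my/local/multiple-input-data-urls.txt)
echo "URLs listed: $url_count"
```

The examples above use the json connector setting. For CSV input, the corresponding setting is expected to be --connector.csv.urlfile.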
Load CSV data from stdin
Load CSV data from stdin as it is generated by a script, generate_data.
The data is loaded into keyspace ks1 and table table1.
If field names are not specified, they are read from a header row in the input.
- Astra DB
- HCD, DSE, and Cassandra
To load data into a cloud-based Astra DB database, specify the path to the secure connect bundle ZIP file.
It contains the security certificates and credentials for your database.
Also specify the username and password entered when creating the database.
Download the secure connect bundle ZIP from the Astra Portal before you run the dsbulk command. For more information, see Manage application tokens in the Astra DB documentation.
generate_data | dsbulk load -url stdin:/ -k ks1 -t table1 \
-b "path/to/secure-connect-database_name.zip" -u database_user -p database_password
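For an HCD, DSE, or Cassandra database, the same pipeline is used without the secure connect bundle options:
generate_data | dsbulk load -url stdin:/ -k ks1 -t table1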
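The generate_data script is not defined in this section. As an illustration only, a stand-in could be a shell function that writes a header row followed by data rows to standard output. The column names and values below are hypothetical.

```shell
# Hypothetical stand-in for generate_data: prints a header row, then three data rows.
generate_data() {
  echo 'id,name'
  for i in 1 2 3; do
    echo "$i,user$i"
  done
}

# Preview the first lines that would be piped into dsbulk load -url stdin:/
generate_data | head -n 2
```

Because the first line is a header row, the field names can be read from the stream itself, as described above.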
Load CSV data from a CSV file to a graph vertex label table
Load CSV data from person.csv.
The data is loaded into graph graph1 and table vertex_label1.
This method applies only to HCD and DSE.
dsbulk load -url data/vertices/person.csv -g graph1 -v vertex_label1 \
-delim '|' -header true --schema.allowMissingFields true
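The command above expects a pipe-delimited file with a header row. The following sketch creates such a file and checks that every row has the same number of fields as the header. The property names id, name, and age are hypothetical and would need to match the properties of vertex_label1.

```shell
# Create the path used in the command above.
mkdir -p data/vertices

# Hypothetical vertex properties, separated by '|' to match -delim '|'.
cat > data/vertices/person.csv <<'EOF'
id|name|age
1|alice|34
2|bob|41
EOF

# Count rows whose field count differs from the header's field count.
malformed=$(awk -F'|' 'NR == 1 { n = NF } NF != n { bad++ } END { print bad + 0 }' data/vertices/person.csv)
echo "malformed rows: $malformed"
```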