DataStax Bulk Loader (dsbulk) Examples

Procedure

  • Load data to person table from a CSV file with a pipe delimiter and allow missing field values with the CSV data file located in data/vertices/person.csv containing:

    person_id|name|gender|nickname|cal_goal|macro_goal|badge|country
    e7cd5752-bc0d-4157-a80f-7523add8dbcd|Julia CHILD|F|'Jay','Julia'||||[['USA', '1912-08-12', '1944-01-01'], ['Ceylon', '1944-01-01', '1945-06-01'], ['France', '1948-01-01', '1960-01-01'], ['USA','1960-01-01', '2004-08-13']]
    adb8744c-d015-4d78-918a-d7f062c59e8f|Simone BECK|F|'Simca','Simone'||||[['France', '1904-07-07', '1991-12-20']]
    888ad970-0efc-4e2c-b234-b6a71c30efb5|Fritz STREIFF|M|||||
    f092107c-0c5c-47e7-917c-c82c7fc2a249|Louisette BERTHOLIE|F|||||
    ef811281-f954-4fd6-ace0-bf67d057771a|Patricia SIMON|F|'Pat'||||
    d45c76bc-6f93-4d0e-9d9f-33298dae0524|Alice WATERS|F|||||
    7f969e16-b81e-4fcd-87c5-1911abbed132|Patricia CURTAN|F|'Pattie'||||
    01e22ca6-da10-4cf7-8903-9b7e30c25805|Kelsie KERR|F|||||
    ad58b8bd-033f-48ee-8f3b-a84f9c24e7de|Emeril LAGASSE|M|||||
    4ce9caf1-25b8-468e-a983-69bad20c017a|James BEARD|M|'Jim', 'Jimmy'||||
    65fd73f7-0e06-49af-abd6-0725dc64737d|John DOE|M|'Johnny'|1750|10,30,60|'silver':'2016-01-01', 'gold':'2017-02-01'|
    3b3e89f4-5ca2-437e-b8e8-fe7c5e580068|John SMITH|M|'Johnnie'|1800|30,20,50||
    9b564212-9544-4f85-af36-57615e927f89|Jane DOE|F|'Janie'|1500|50,15,35||
    ba6a6766-beae-4a35-965b-6f3878581702|Sharon SMITH|F||1600|30,20,50||
    a3e7ab29-2ac9-4abf-9d4a-285bf2714a9f|Betsy JONES|F||1700|10,50,30||

    and run the dsbulk command in a shell prompt:

    dsbulk load --schema.graph food_qs --schema.vertex person \
    -url _data/vertices/person.csv \
    -delim '|' \
    --schema.allowMissingFields true
  • Load data to the book table from a CSV file with a pipe delimiter, allow missing field values, and identify the field->columns with schema mapping with the CSV data file located in data/vertices/book.csv containing:

    book_id|name|publish_year|isbn
    1001|The Art of French Cooking, Vol. 1|1961|
    1002|Simca's Cuisine: 100 Classic French Recipes for Every Occasion|1972|0-394-40152-2
    1003|The French Chef Cookbook|1968|0-394-40135-2
    1004|The Art of Simple Food: Notes, Lessons, and Recipes from a Delicious Revolution|2007|0-307-33679-4

    and run the dsbulk command:

    dsbulk load -g food_qs -v book \
    -url data/vertices/book.csv \ 
    --schema.mapping '0=book_id, 1=name, 2=publish_year 3=isbn' \
    -delim '|' \
    --schema.allowMissingFields true
  • Load data to person_authored_book table from a CSV file with a pipe delimiter, allow missing field values, and map CSV file positions to named columns in the table with the CSV data file located in data/edges/person_authored_book.csv containing:

    person_id|book_id
    bb6d7af7-f674-4de7-8b4c-c0fdcdaa5cca|1001
    adb8744c-d015-4d78-918a-d7f062c59e8f|1001
    f092107c-0c5c-47e7-917c-c82c7fc2a249|1001
    adb8744c-d015-4d78-918a-d7f062c59e8f|1002
    ef811281-f954-4fd6-ace0-bf67d057771a|1002
    bb6d7af7-f674-4de7-8b4c-c0fdcdaa5cca|1003
    d45c76bc-6f93-4d0e-9d9f-33298dae0524|1004
    7f969e16-b81e-4fcd-87c5-1911abbed132|1004
    01e22ca6-da10-4cf7-8903-9b7e30c25805|1004
    888ad970-0efc-4e2c-b234-b6a71c30efb5|1004

    and run the dsbulk command:

    dsbulk load -g food_qs -e person_authored_book \
    -url data/edges/person_authored_book.csv \
    -m '0=person_id, 1=book_id' \
    -delim '|' \
    --schema.allowMissingFields true
  • Unload the graph data for person from the food graph:

    dsbulk unload --schema.graph food --schema.vertex person -delim '|' > unload-person.csv

Was this helpful?

Give Feedback

How can we improve the documentation?

© 2024 DataStax | Privacy policy | Terms of use

Apache, Apache Cassandra, Cassandra, Apache Tomcat, Tomcat, Apache Lucene, Apache Solr, Apache Hadoop, Hadoop, Apache Pulsar, Pulsar, Apache Spark, Spark, Apache TinkerPop, TinkerPop, Apache Kafka and Kafka are either registered trademarks or trademarks of the Apache Software Foundation or its subsidiaries in Canada, the United States and/or other countries. Kubernetes is the registered trademark of the Linux Foundation.

General Inquiries: +1 (650) 389-6000, info@datastax.com