DataStax Bulk Loader (dsbulk) Examples

Using dsbulk to load and unload DataStax Graph data.

Procedure

  • Load data to person table from a CSV file with a pipe delimiter and allow missing field values with the CSV data file located in data/vertices/person.csv containing:
    person_id|name|gender|nickname|cal_goal|macro_goal|badge|country
    e7cd5752-bc0d-4157-a80f-7523add8dbcd|Julia CHILD|F|'Jay','Julia'||||[['USA', '1912-08-12', '1944-01-01'], ['Ceylon', '1944-01-01', '1945-06-01'], ['France', '1948-01-01', '1960-01-01'], ['USA','1960-01-01', '2004-08-13']]
    adb8744c-d015-4d78-918a-d7f062c59e8f|Simone BECK|F|'Simca','Simone'||||[['France', '1904-07-07', '1991-12-20']]
    888ad970-0efc-4e2c-b234-b6a71c30efb5|Fritz STREIFF|M|||||
    f092107c-0c5c-47e7-917c-c82c7fc2a249|Louisette BERTHOLIE|F|||||
    ef811281-f954-4fd6-ace0-bf67d057771a|Patricia SIMON|F|'Pat'||||
    d45c76bc-6f93-4d0e-9d9f-33298dae0524|Alice WATERS|F|||||
    7f969e16-b81e-4fcd-87c5-1911abbed132|Patricia CURTAN|F|'Pattie'||||
    01e22ca6-da10-4cf7-8903-9b7e30c25805|Kelsie KERR|F|||||
    ad58b8bd-033f-48ee-8f3b-a84f9c24e7de|Emeril LAGASSE|M|||||
    4ce9caf1-25b8-468e-a983-69bad20c017a|James BEARD|M|'Jim', 'Jimmy'||||
    65fd73f7-0e06-49af-abd6-0725dc64737d|John DOE|M|'Johnny'|1750|10,30,60|'silver':'2016-01-01', 'gold':'2017-02-01'|
    3b3e89f4-5ca2-437e-b8e8-fe7c5e580068|John SMITH|M|'Johnnie'|1800|30,20,50||
    9b564212-9544-4f85-af36-57615e927f89|Jane DOE|F|'Janie'|1500|50,15,35||
    ba6a6766-beae-4a35-965b-6f3878581702|Sharon SMITH|F||1600|30,20,50||
    a3e7ab29-2ac9-4abf-9d4a-285bf2714a9f|Betsy JONES|F||1700|10,50,30||
    and run the dsbulk command in a shell prompt:
    dsbulk load --schema.graph food_qs --schema.vertex person \
    -url _data/vertices/person.csv \
    -delim '|' \
    --schema.allowMissingFields true
  • Load data to the book table from a CSV file with a pipe delimiter, allow missing field values, and identify the field->columns with schema mapping with the CSV data file located in data/vertices/book.csv containing:
    book_id|name|publish_year|isbn
    1001|The Art of French Cooking, Vol. 1|1961|
    1002|Simca's Cuisine: 100 Classic French Recipes for Every Occasion|1972|0-394-40152-2
    1003|The French Chef Cookbook|1968|0-394-40135-2
    1004|The Art of Simple Food: Notes, Lessons, and Recipes from a Delicious Revolution|2007|0-307-33679-4
    and run the dsbulk command:
    dsbulk load -g food_qs -v book \
    -url data/vertices/book.csv \ 
    --schema.mapping '0=book_id, 1=name, 2=publish_year 3=isbn' \
    -delim '|' \
    --schema.allowMissingFields true
  • Load data to person_authored_book table from a CSV file with a pipe delimiter, allow missing field values, and map CSV file positions to named columns in the table with the CSV data file located in data/edges/person_authored_book.csv containing:
    person_id|book_id
    bb6d7af7-f674-4de7-8b4c-c0fdcdaa5cca|1001
    adb8744c-d015-4d78-918a-d7f062c59e8f|1001
    f092107c-0c5c-47e7-917c-c82c7fc2a249|1001
    adb8744c-d015-4d78-918a-d7f062c59e8f|1002
    ef811281-f954-4fd6-ace0-bf67d057771a|1002
    bb6d7af7-f674-4de7-8b4c-c0fdcdaa5cca|1003
    d45c76bc-6f93-4d0e-9d9f-33298dae0524|1004
    7f969e16-b81e-4fcd-87c5-1911abbed132|1004
    01e22ca6-da10-4cf7-8903-9b7e30c25805|1004
    888ad970-0efc-4e2c-b234-b6a71c30efb5|1004
    and run the dsbulk command:
    dsbulk load -g food_qs -e person_authored_book \
    -url data/edges/person_authored_book.csv \
    -m '0=person_id, 1=book_id' \
    -delim '|' \
    --schema.allowMissingFields true
  • Unload the graph data for person from the food graph:
    dsbulk unload --schema.graph food --schema.vertex person -delim '|' > unload-person.csv