Loading data

How to load data using DSE Graph Loader.

DSE Graph Loader can load data from many different input data formats. Pick the option that most resembles your data source:

| Type | Description | Instructions |
| --- | --- | --- |
| CSV | Strict format, with the first line of the file identifying the property keys used in the graph. | Loading CSV data |
| Text | Delimited text data of any format. | Loading TEXT data |
| Text with regular expressions | Delimited text data parsed using regular expressions (regex). | Loading TEXT data using regular expressions (regex) |
| JSON | Data stored in JSON (JavaScript Object Notation) format. | Loading JSON data |
| JDBC-compatible database | Data stored in a JDBC-compatible database. | Loading data from a JDBC compatible database |
| HDFS file | Data file stored in a Hadoop Distributed File System (HDFS) of any format. | Loading data from Hadoop (HDFS) |
| AWS S3 file | Data file stored in AWS S3 storage of any format. | Loading data from AWS S3 |
| Gryo | Data stored in a binary Gryo format. | Loading Gryo data |
| GraphSON | Data stored in GraphSON format. | Loading GraphSON data |
| GraphML | Data stored in GraphML format. | Loading GraphML data |

Note: In text and CSV files, fields that contain NULL or null, or that are empty, are pruned by DSE Graph Loader. Use a transform if different behavior is needed.
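
To make the effect of pruning concrete, the following Python sketch mimics it on a small CSV sample. It is an illustration only, not DSE Graph Loader code: it assumes that "pruned" means the field is omitted from the resulting record, and the sample data and the names `prune_record` and `read_pruned` are hypothetical.

```python
import csv
import io

# Values treated as missing, following the note above.
PRUNED_VALUES = {"NULL", "null", ""}


def prune_record(record):
    """Return a copy of the record without NULL, null, or empty fields."""
    return {key: value for key, value in record.items()
            if value not in PRUNED_VALUES}


def read_pruned(text):
    """Parse CSV text whose first line names the property keys, pruning each row."""
    reader = csv.DictReader(io.StringIO(text))
    return [prune_record(row) for row in reader]


# Hypothetical sample data; the header line supplies the property keys.
SAMPLE_CSV = "name,gender,city\nJulia Child,F,NULL\nJames Beard,,Portland\n"

records = read_pruned(SAMPLE_CSV)
for record in records:
    print(record)
```

Under this assumption, the first record keeps only `name` and `gender`, and the second keeps only `name` and `city`. Running a check like this on a sample of the input can show which properties will be absent after loading, and therefore where a transform might be needed.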
Warning: When loading vertices with custom ids, DSE Graph Loader bypasses its vertex cache to achieve faster write throughput. The client must ensure that vertices are unique, because no logic validates whether a vertex with a given custom id already exists. For the fastest performance, set the DSE Graph configuration option external_vertex_verify to false.
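
Because uniqueness is the client's responsibility when custom ids are used, it can help to check the input before loading it. The sketch below is one possible pre-load check in Python, not part of DSE Graph Loader. The function `find_duplicate_ids`, the id fields `city_id` and `sensor_id`, and the sample records are all hypothetical.

```python
from collections import Counter


def find_duplicate_ids(records, id_fields):
    """Return the custom id tuples that occur more than once, in sorted order."""
    counts = Counter(
        tuple(record[field] for field in id_fields) for record in records
    )
    return sorted(key for key, count in counts.items() if count > 1)


# Hypothetical vertex records with a two-part custom id.
vertices = [
    {"city_id": "100", "sensor_id": "1", "name": "north"},
    {"city_id": "100", "sensor_id": "2", "name": "south"},
    {"city_id": "100", "sensor_id": "1", "name": "north-duplicate"},
]

duplicates = find_duplicate_ids(vertices, ["city_id", "sensor_id"])
if duplicates:
    print("Duplicate custom ids found:", duplicates)
```

This sketch holds all ids in memory, so it suits modest input sizes. For larger inputs, a deduplication step in the source system may be a better fit.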

DSE Graph Loader also supports loading several files of the same format from a single directory. Example mapping scripts are shown for CSV and JSON, but the same approach works for all formats.