Exporting graphs to DSEFS
Export the graph to any format supported by Spark.
About this task
Use the TinkerPop IoStep to export the graph to any format supported by Spark.
The following formats are auto-detected:
- JSON: .json
- Parquet: .parquet
- Comma-separated values (CSV): .csv
- ORC: .orc
The format is determined by the file extension in the URL of the resource passed to the io method.
To set the format explicitly to any format supported by Spark, use the with("format", "format extension") method, for example with("format", "csv").
Pass any additional format options using the with method.
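The resolution rules above can be sketched in plain Scala. This is only an illustration of the described behavior, not DSE's implementation: `knownExtensions` and `resolveFormat` are hypothetical names, and the assumption that an explicit "format" option takes precedence over the file extension is ours.

```scala
// Hypothetical helper for illustration; not part of the DSE or TinkerPop API.
// Extensions that the text above lists as auto-detected.
val knownExtensions: Map[String, String] = Map(
  "json"    -> "json",
  "parquet" -> "parquet",
  "csv"     -> "csv",
  "orc"     -> "orc"
)

// Returns the format that would be used for a target URL.
// Assumption: an explicit "format" option wins over the URL extension.
def resolveFormat(url: String, options: Map[String, String] = Map.empty): Option[String] =
  options.get("format").orElse {
    val fileName = url.substring(url.lastIndexOf('/') + 1)
    val dot = fileName.lastIndexOf('.')
    if (dot < 0) None // no extension, so nothing to auto-detect
    else knownExtensions.get(fileName.substring(dot + 1))
  }

println(resolveFormat("dsefs:///tmp/data.json"))
println(resolveFormat("dsefs:///tmp/export", Map("format" -> "orc")))
println(resolveFormat("dsefs:///tmp/export"))
```

A URL without a recognized extension and without an explicit format yields `None` in this sketch, which corresponds to the case where the format has to be set with the with method.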
Procedure
- Start the Spark shell.
dse spark
- Export the vertices and edges to a Spark-supported format using the write method.
Export the graph to JSON files in the DSEFS file system.
val g = spark.dseGraph("gods_export")
g.io("dsefs:///tmp/data.json").write().iterate()
This creates a /tmp/data.json/ directory in DSEFS with two subdirectories, vertices and edges, each containing multiple JSON files.
Export the edges and vertices separately.
val g = spark.dseGraph("gods_export")
g.io("dsefs:///tmp/data.json").with("vertices").write().iterate()
g.io("dsefs:///tmp/data.json").with("edges").write().iterate()
Export the edges and vertices in CSV format by explicitly setting the file format.
val g = spark.dseGraph("gods_export")
g.io("dsefs:///tmp/vertices.csv").with("vertices").with("format", "csv").write().iterate()
g.io("dsefs:///tmp/edges.csv").with("edges").with("format", "csv").write().iterate()
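Because the full-graph JSON export earlier in this step writes vertices and edges into separate subdirectories, it can help to derive both paths from the export root before reading the data back. A minimal sketch follows; `ExportLayout` and `exportLayout` are hypothetical names introduced here for illustration and are not part of the DSE API.

```scala
// Hypothetical helper for illustration; not part of the DSE API.
// Holds the two subdirectories created by a full-graph export.
final case class ExportLayout(vertices: String, edges: String)

// Derives the subdirectory URLs from the URL that was passed to io().
def exportLayout(root: String): ExportLayout = {
  val base = root.stripSuffix("/") // tolerate a trailing slash
  ExportLayout(s"$base/vertices", s"$base/edges")
}

val layout = exportLayout("dsefs:///tmp/data.json")
println(layout.vertices)
println(layout.edges)
```

In the DSE Spark shell, a call such as spark.read.json(layout.vertices) should then load the exported vertices as a DataFrame, assuming the export completed and DSEFS is reachable from the session.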