Export graphs to DSEFS
About this task
Use the TinkerPop IoStep to export the graph to any format supported by Apache Spark™.
The following formats are auto-detected:
- JSON: .json
- Parquet: .parquet
- Comma-separated values (CSV): .csv
- ORC: .orc
The format is detected from the file extension of the resource URL passed to the io method.
To set the format explicitly to any format supported by Apache Spark, call with("format", "format extension"), for example with("format", "csv").
Pass any additional format options through further calls to the with method.
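The format rules above can be sketched in plain Scala so they can be checked without a cluster. This is only an illustration: ExportFormat and resolve are hypothetical names, not part of DSE or TinkerPop, and the sketch assumes that an explicitly set format takes precedence over the URL extension.

```scala
// Hypothetical helper (not a DSE or TinkerPop API): mirrors the format rules
// described in the text. Assumption: an explicit format wins over the extension.
object ExportFormat {
  // Extensions the text lists as auto-detected.
  private val autoDetected = Set("json", "parquet", "csv", "orc")

  // `explicit` stands in for the value passed to with("format", ...).
  // Without it, fall back to the extension of the last path segment.
  def resolve(url: String, explicit: Option[String] = None): Option[String] =
    explicit.orElse {
      val name = url.substring(url.lastIndexOf('/') + 1)
      val dot = name.lastIndexOf('.')
      if (dot < 0) None
      else Some(name.substring(dot + 1)).filter(autoDetected)
    }
}
```

Under these assumptions, ExportFormat.resolve("dsefs:///tmp/data.json") yields Some("json"), ExportFormat.resolve("dsefs:///tmp/export", Some("csv")) yields Some("csv"), and ExportFormat.resolve("dsefs:///tmp/export") yields None, which corresponds to the case where the format has to be set explicitly.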
Procedure
- Start the Spark shell.
dse spark
- Export the vertices and edges to a Spark-supported format using the write method.
Export the graph to JSON files in the DSEFS file system.
val g = spark.dseGraph("gods_export")
g.io("dsefs:///tmp/data.json").write().iterate()
This creates a tmp/data.json/ directory in DSEFS with two subdirectories, vertices and edges, each containing multiple JSON files.
Export the edges and vertices separately.
// with is a reserved word in Scala, so it is escaped with backticks.
val g = spark.dseGraph("gods_export")
g.io("dsefs:///tmp/data.json").`with`("vertices").write().iterate()
g.io("dsefs:///tmp/data.json").`with`("edges").write().iterate()
Export the edges and vertices in CSV format by explicitly setting the file format.
// with is a reserved word in Scala, so it is escaped with backticks.
val g = spark.dseGraph("gods_export")
g.io("dsefs:///tmp/vertices.csv").`with`("vertices").`with`("format", "csv").write().iterate()
g.io("dsefs:///tmp/edges.csv").`with`("edges").`with`("format", "csv").write().iterate()
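To check an export, the files can be read back with the standard Spark DataFrame reader. The following is a minimal sketch, assuming the JSON export to dsefs:///tmp/data.json from the first example has completed and that it is run in the same dse spark session. The vertices and edges paths follow the directory layout described for that example.

```scala
// Run inside `dse spark` after the JSON export has completed.
// Assumption: the export produced <target>/vertices and <target>/edges,
// as described for the first example.
val vertices = spark.read.json("dsefs:///tmp/data.json/vertices")
val edges = spark.read.json("dsefs:///tmp/data.json/edges")

// Inspect the inferred schema and compare row counts with the source graph.
vertices.printSchema()
edges.printSchema()
println(s"vertices: ${vertices.count()}, edges: ${edges.count()}")
```

The same approach should apply to the other formats through the matching reader method, such as spark.read.parquet or spark.read.orc. For CSV, reader options such as header would need to match whatever options were used for the export.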