Exporting graphs to DSEFS

Export the graph to any format supported by Spark.

Use the TinkerPop IoStep (the io method) to export the graph to any format supported by Spark.

The following formats are auto-detected:

  • JSON: .json
  • Parquet: .parquet
  • Comma-separated values (CSV): .csv
  • ORC: .orc

The format is determined by the file extension of the resource URL passed to the io method.

You can explicitly set the format to any format supported by Spark using the with("format", "format extension") method. Pass any additional format options using the with method.
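
As a rough illustration of the rule above, the following Scala sketch resolves a format from a URL's extension and lets an explicit "format" option take precedence. The FormatResolution object, the resolveFormat helper, and its options map are hypothetical names used only for this example. This is not the DSE implementation, and the real detection logic may differ.

```scala
// Hypothetical helper, for illustration only: not part of DSE or TinkerPop.
object FormatResolution {
  // Extensions the documentation lists as auto-detected.
  private val knownExtensions = Set("json", "parquet", "csv", "orc")

  // Returns the explicit "format" option if one is given; otherwise returns the
  // URL's extension when it is one of the auto-detected formats.
  def resolveFormat(url: String, options: Map[String, String] = Map.empty): Option[String] = {
    options.get("format").orElse {
      // Look only at the last path segment so dots in parent directories are ignored.
      val name = url.substring(url.lastIndexOf('/') + 1)
      val dot = name.lastIndexOf('.')
      if (dot < 0) None
      else Some(name.substring(dot + 1)).filter(knownExtensions.contains)
    }
  }
}
```

In this sketch, a URL ending in .json resolves to "json" without any options, while a URL with no recognized extension resolves to nothing unless a "format" option is supplied.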

Procedure

  1. Start the Spark shell.
    dse spark
  2. Export the vertices and edges to a Spark-supported format using the write method.

    Export the graph to JSON files in the DSEFS file system.

    val g = spark.dseGraph("gods_export")
    g.io("dsefs:///tmp/data.json").write().iterate()

    This creates a tmp/data.json/ directory in DSEFS with two subdirectories, vertices and edges, each containing multiple JSON files.

    Export the edges and vertices separately.

    val g = spark.dseGraph("gods_export")
    g.io("dsefs:///tmp/data.json").with("vertices").write().iterate()
    g.io("dsefs:///tmp/data.json").with("edges").write().iterate()

    Export the edges and vertices in CSV format by explicitly setting the file format.

    val g = spark.dseGraph("gods_export")
    g.io("dsefs:///tmp/vertices.csv").with("vertices").with("format", "csv").write().iterate()
    g.io("dsefs:///tmp/edges.csv").with("edges").with("format", "csv").write().iterate()
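
To check the result of the whole-graph JSON export, the output can be read back with Spark's standard DataFrame reader from the same shell. This is a minimal sketch. It assumes that the tmp/data.json/vertices and tmp/data.json/edges directories described above exist, and that the spark session provided by the shell can resolve dsefs:// URLs. The names exportedVertices and exportedEdges are illustrative, and the column layout of the exported files is not specified here.

```scala
// Run in the same `dse spark` shell, where `spark` is already defined.
// Paths assume the whole-graph JSON export created these two subdirectories.
val exportedVertices = spark.read.json("dsefs:///tmp/data.json/vertices")
val exportedEdges = spark.read.json("dsefs:///tmp/data.json/edges")

// Inspect the inferred schema and the row counts of the exported data.
exportedVertices.printSchema()
exportedEdges.printSchema()
println(s"vertices: ${exportedVertices.count()}, edges: ${exportedEdges.count()}")
```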

What's next

Import the graphs.