Importing graphs

Import a graph to DataStax Enterprise.

Use TinkerPop's IoStep to import a graph in any format supported by Spark.

The following formats are auto-detected:

  • JSON: .json
  • Parquet: .parquet
  • Comma-separated values (CSV): .csv
  • ORC: .orc

The format is detected from the file extension in the URL of the resource passed to the io method.

You can explicitly set the format to any format supported by Spark using the with("format", "format extension") method, for example with("format", "csv"). Pass any additional format options using the with method.
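
The detection rule described above can be pictured with a small, self-contained sketch. This is not the DSE implementation: FormatResolution, detectFormat, and resolveFormat are hypothetical names used only for illustration, and the sketch assumes that extension matching is case-insensitive and that an explicitly set format takes precedence over the detected one.

```scala
// Illustrative only: a hypothetical model of the rule described above,
// not code taken from DSE or TinkerPop.
object FormatResolution {
  // Extensions listed as auto-detected in this topic.
  private val knownFormats = Set("json", "parquet", "csv", "orc")

  // Returns the format implied by the file extension of the URL, if any.
  def detectFormat(url: String): Option[String] = {
    // Ignore any query string or fragment (assumption).
    val path = url.takeWhile(c => c != '?' && c != '#')
    val fileName = path.substring(path.lastIndexOf('/') + 1)
    val dot = fileName.lastIndexOf('.')
    if (dot < 0) None
    else Some(fileName.substring(dot + 1).toLowerCase).filter(knownFormats.contains)
  }

  // Assumption: an explicit format, when given, wins over the detected one.
  def resolveFormat(url: String, explicitFormat: Option[String]): Option[String] =
    explicitFormat.orElse(detectFormat(url))
}
```

Under these assumptions, FormatResolution.detectFormat("dsefs:///tmp/data.json") yields Some("json"), while a URL without a recognized extension yields None unless a format is supplied explicitly.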

Procedure

  1. Start the Spark shell.
    dse spark
  2. Import the graph in the Spark shell.

    If you exported the graph as described in Exporting graphs to DSEFS, import it in the Spark shell.

    Import the edges and vertices of a graph in JSON format.

    val g = spark.dseGraph("gods_import")
    g.io("dsefs:///tmp/data.json").read()

    Import the edges and vertices separately.

    val g = spark.dseGraph("gods_import")
    g.io("dsefs:///tmp/data.json").with("vertices").read().iterate()
    g.io("dsefs:///tmp/data.json").with("edges").read().iterate()

    Import a graph from CSV data that has a header line and is located at an external URL, setting the format explicitly. Set the labels of the vertices and edges by specifying the column name in the CSV file.

    val g = spark.dseGraph("gods_import")
    val url = "URL of CSV file" // replace with the actual URL of the CSV file
    g.io(url).with("format", "csv").with("outVertexLabel", "god").with("edgeLabel", "self").with("inVertexLabel", "god").with("header").with("nullValue", "null").read()
    g.io(url).with("format", "csv").with("vertexLabel", "god").with("header").with("nullValue", "null").read()