Importing graphs
Import a graph to DataStax Enterprise.
About this task
Use TinkerPop’s IoStep to import a graph in any format supported by Spark.
The following formats are auto-detected:
- JSON: .json
- Parquet: .parquet
- Comma-separated values (CSV): .csv
- ORC: .orc
The format is detected from the file extension in the URL of the resource passed to the io method.
You can explicitly set the format to any format supported by Spark using the with("format", "format extension") method.
Pass any additional format options using the with method.
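The rule described above, an explicit format if one is given and otherwise the file extension of the URL, can be sketched in plain Scala. This is only an illustration: FormatResolution, fromExtension, and resolve are hypothetical names, not part of the DSE or TinkerPop API, and the assumption that an explicit format takes precedence over the extension is inferred from the text rather than confirmed.

```scala
// Hypothetical illustration only; this is not DSE's implementation.
object FormatResolution {
  // Extensions the text lists as auto-detected.
  private val autoDetected: Set[String] = Set("json", "parquet", "csv", "orc")

  // Returns the format implied by the file extension, if it is one of the
  // auto-detected formats.
  def fromExtension(url: String): Option[String] = {
    val fileName = url.substring(url.lastIndexOf('/') + 1)
    val dot = fileName.lastIndexOf('.')
    if (dot < 0) None
    else Some(fileName.substring(dot + 1)).filter(autoDetected.contains)
  }

  // Assumption: an explicitly supplied format wins over the extension.
  def resolve(url: String, explicitFormat: Option[String] = None): Option[String] =
    explicitFormat.orElse(fromExtension(url))
}
```

Under these assumptions, resolve("dsefs:///tmp/data.json") yields the JSON format, while a URL with an unlisted extension yields nothing unless a format is passed explicitly.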
Procedure
- Start the Spark shell.
dse spark
- Import the graph in the Spark shell.
If you exported the graph as described in Exporting graphs to DSEFS, import it in the Spark shell.
Import the edges and vertices of a graph in JSON format.
val g = spark.dseGraph("gods_import")
g.io("dsefs:///tmp/data.json").read()
Import the edges and vertices separately.
val g = spark.dseGraph("gods_import")
// with is a reserved word in Scala, so the method name is escaped with backticks.
g.io("dsefs:///tmp/data.json").`with`("vertices").read().iterate()
g.io("dsefs:///tmp/data.json").`with`("edges").read().iterate()
Import a graph from CSV data that has a header line and is located at an external URL, setting the format explicitly. Set the labels of the vertices and edges by specifying the column name in the CSV file.
val g = spark.dseGraph("gods_import")
val url = "URL of CSV file" // placeholder: replace with the actual URL
// with is a reserved word in Scala, so the method name is escaped with backticks.
// Import the edges.
g.io(url).`with`("format", "csv").`with`("outVertexLabel", "god").`with`("edgeLabel", "self").`with`("inVertexLabel", "god").`with`("header").`with`("nullValue", "null").read()
// Import the vertices.
g.io(url).`with`("format", "csv").`with`("vertexLabel", "god").`with`("header").`with`("nullValue", "null").read()