Importing graphs using DseGraphFrame
Use DseGraphFrame to import a graph into DataStax Enterprise.
The graph schema should be created manually in the Gremlin console or DSE Studio before importing the graph. Import only works with custom ID mapping.
-
Start the Spark shell.
$ dse spark
-
If you exported the graph to JSON using DseGraphFrame, import it in the Spark shell. For example:
val g = spark.dseGraph("gods_import")
g.updateVertices(spark.read.json("/tmp/v.json"))
g.updateEdges(spark.read.json("/tmp/e.json"))
The general form is:
val g = spark.dseGraph("graph name")
g.updateVertices(spark.read.json("path to exported vertices JSON"))
g.updateEdges(spark.read.json("path to exported edges JSON"))
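Before calling updateVertices and updateEdges, it can help to compare the columns in the exported JSON with the columns of the target graph. The following is a minimal sketch that uses only standard Spark DataFrame calls. The object name ImportColumns and the method notInTarget are hypothetical helpers introduced for illustration; they are not part of the DseGraphFrame API.

```scala
import org.apache.spark.sql.DataFrame

// Hypothetical helper for illustration; not part of DseGraphFrame.
object ImportColumns {
  // Returns the source columns that have no same-named column in the target.
  def notInTarget(source: DataFrame, targetColumns: Seq[String]): Seq[String] =
    source.columns.toSeq.filterNot(c => targetColumns.contains(c))
}
```

In the Spark shell this could be called as ImportColumns.notInTarget(spark.read.json("/tmp/v.json"), g.V.columns), assuming g.V exposes the DataFrame columns method in the same way it exposes printSchema. An empty result means every JSON column has a same-named column in the target.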
-
If you have a custom graph:
-
Examine the schema of the graph and note how it maps to the schema that DSE Graph expects.
This example uses the friends graph from the GraphFrames project.
scala> import org.graphframes._
scala> val g: GraphFrame = examples.Graphs.friends
scala> g.vertices.printSchema
root
 |-- id: string (nullable = true)
 |-- name: string (nullable = true)
 |-- age: integer (nullable = false)
scala> g.edges.printSchema
root
 |-- src: string (nullable = true)
 |-- dst: string (nullable = true)
 |-- relationship: string (nullable = true)
-
In the Gremlin console or DSE Studio, create the schema.
system.graph('friends').create()
:remote config alias g friends.g
schema.propertyKey("age").Int().create()
schema.propertyKey("name").Text().create()
schema.propertyKey("id").Text().single().create()
schema.vertexLabel('people').partitionKey("id").properties("name", "age").create();
schema.edgeLabel("friend").create()
schema.edgeLabel("follow").create()
-
In the Spark shell, create an empty DseGraphFrame graph and check the target schemas.
scala> val d = spark.dseGraph("friends")
scala> d.V.printSchema
root
 |-- id: string (nullable = false)
 |-- ~label: string (nullable = false)
 |-- _id: string (nullable = true)
 |-- name: string (nullable = true)
 |-- age: integer (nullable = true)
scala> d.E.printSchema
root
 |-- src: string (nullable = false)
 |-- dst: string (nullable = false)
 |-- ~label: string (nullable = true)
 |-- id: string (nullable = true)
-
Convert the edges and vertices to the target format.
scala> val v = g.vertices.select($"id" as "_id", lit("people") as "~label", $"name", $"age")
scala> val e = g.edges.select(d.idColumn(lit("people"), $"src") as "src", d.idColumn(lit("people"), $"dst") as "dst", $"relationship" as "~label")
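Because the relationship column becomes the edge label, every distinct value in it needs a matching edge label in the schema, and every src and dst value needs a matching vertex id. The following is a minimal sketch of both checks on the source data, using only standard Spark DataFrame calls. The object PreImportCheck and its methods are hypothetical helpers, not part of DseGraphFrame or GraphFrames, and the column names id, src, and dst are assumed to match the GraphFrame schema printed earlier.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col

// Hypothetical helper for illustration; not part of DseGraphFrame or GraphFrames.
object PreImportCheck {
  // Values of labelCol that occur in the edges but are not in the declared edge labels.
  def missingEdgeLabels(edges: DataFrame, labelCol: String, declared: Set[String]): Set[String] =
    edges.select(col(labelCol)).distinct().collect().map(_.getString(0)).toSet -- declared

  // Edges whose src or dst does not match any vertex id.
  // Assumes the vertices have an "id" column and the edges have "src" and "dst" columns.
  def danglingEdges(vertices: DataFrame, edges: DataFrame): DataFrame = {
    val ids = vertices.select(col("id"))
    val badSrc = edges.join(ids, edges("src") === ids("id"), "left_anti")
    val badDst = edges.join(ids, edges("dst") === ids("id"), "left_anti")
    badSrc.union(badDst).distinct()
  }
}
```

For the friends graph, these could be called as PreImportCheck.missingEdgeLabels(g.edges, "relationship", Set("friend", "follow")) and PreImportCheck.danglingEdges(g.vertices, g.edges).show(). An empty set and an empty DataFrame mean the source data is consistent with the schema created in the earlier step.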
-
Append the converted vertices and edges to the target graph.
d.updateVertices(v)
d.updateEdges(e)