Exporting graphs to DSEFS

Export the graph to any format supported by Spark.

About this task

Use TinkerPop IoStep to export the graph to any format supported by Spark.

The following formats are auto-detected:

  • JSON: .json

  • Parquet: .parquet

  • Comma separated value (CSV): .csv

  • ORC: .orc

The format is determined by the file extension in the URL of the resource passed to the io method.

You can explicitly set the format to any format supported by Spark using the with("format", "format extension") method. Pass any additional format options using the with method.
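The format selection rule described above can be sketched in plain Scala. This is only an illustration: ExportFormat, detect, and resolve are hypothetical names that are not part of DSE or TinkerPop, and the sketch assumes that an explicitly set format takes precedence over the file extension.

```scala
// Hypothetical helper, not part of DSE or TinkerPop: it mirrors the
// extension-to-format rule described in this topic.
object ExportFormat {
  // Extensions that this topic lists as auto-detected.
  private val knownExtensions = Set("json", "parquet", "csv", "orc")

  // Returns the format implied by the file extension of the URL, if any.
  def detect(url: String): Option[String] = {
    val fileName = url.substring(url.lastIndexOf('/') + 1)
    val dot = fileName.lastIndexOf('.')
    if (dot < 0) None
    else Some(fileName.substring(dot + 1)).filter(knownExtensions.contains)
  }

  // Assumption: a format set explicitly, as with with("format", ...),
  // wins over the detected one.
  def resolve(url: String, explicit: Option[String] = None): Option[String] =
    explicit.orElse(detect(url))
}
```

For example, ExportFormat.detect("dsefs:///tmp/data.json") yields Some("json"), while a URL without a recognized extension yields None unless a format is passed explicitly.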

Procedure

  1. Start the Spark shell.

    dse spark
  2. Export the vertices and edges to a Spark-supported format using the write method.

    Export the graph to JSON files in the DSEFS file system.

    val g = spark.dseGraph("gods_export")
    g.io("dsefs:///tmp/data.json").write().iterate()

    This creates a tmp/data.json/ directory in DSEFS with two subdirectories, vertices and edges, each containing multiple JSON files.

    Export the edges and vertices separately.

    val g = spark.dseGraph("gods_export")
    g.io("dsefs:///tmp/data.json").with("vertices").write().iterate()
    g.io("dsefs:///tmp/data.json").with("edges").write().iterate()

    Export the edges and vertices in CSV format by explicitly setting the file format.

    val g = spark.dseGraph("gods_export")
    g.io('dsefs:///tmp/vertices.csv').with("vertices").with("format", "csv").write().iterate()
    g.io('dsefs:///tmp/edges.csv').with("edges").with("format", "csv").write().iterate()
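To check the result of the combined JSON export in step 2, the vertices and edges subdirectories can be read back as Spark DataFrames. The following is a minimal sketch: ExportLayout is a hypothetical helper that is not part of DSE, and it assumes the directory layout described for the combined export (a vertices and an edges subdirectory under the export URL). The commented lines use the standard spark.read.json reader and assume the predefined spark session of the DSE Spark shell.

```scala
// Hypothetical helper: derives the subdirectory URLs of a combined export.
object ExportLayout {
  // Joins the export URL and a subdirectory name with exactly one slash.
  private def child(exportUrl: String, name: String): String =
    exportUrl.stripSuffix("/") + "/" + name

  def verticesUrl(exportUrl: String): String = child(exportUrl, "vertices")
  def edgesUrl(exportUrl: String): String = child(exportUrl, "edges")
}

// In the DSE Spark shell, where `spark` is the predefined SparkSession:
// val vertices = spark.read.json(ExportLayout.verticesUrl("dsefs:///tmp/data.json"))
// val edges = spark.read.json(ExportLayout.edgesUrl("dsefs:///tmp/data.json"))
// vertices.printSchema()
```

Inspecting the schema of the loaded DataFrames is a quick way to confirm that the export contains the expected properties before the files are used elsewhere.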
