Querying DSE Graph vertices and edges with Spark SQL

Spark SQL can query DSE Graph vertex and edge tables. The dse_graph database holds the vertex and edge tables for each graph. The naming format for the tables is <graph name>_vertices and <graph name>_edges. For example, if you have a graph named gods, the vertices and edges are accessible in Spark SQL in the dse_graph.gods_vertices and dse_graph.gods_edges tables.

SELECT * FROM dse_graph.gods_vertices;

If you have properties whose names differ only in capitalization (for example, id and Id), start Spark SQL with the --conf spark.sql.caseSensitive=true option, for example dse spark-sql --conf spark.sql.caseSensitive=true.

Prerequisites

Start your cluster with both Graph and Spark enabled.

Procedure

  1. Start the Spark SQL shell.

    dse spark-sql
  2. Query the vertices and edges using SELECT statements.

    USE dse_graph;
    SELECT * FROM gods_vertices WHERE name = 'Zeus';
  3. Join the vertices and edges in a query.

    Vertices are identified by their id columns. Edge tables have src and dst columns that identify the from and to vertices, respectively. Joining the two tables traverses the graph. For example, to find the ids of all vertices reached by out edges:

    SELECT gods_edges.dst FROM gods_vertices JOIN gods_edges ON gods_vertices.id = gods_edges.src;
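To see what the join in step 3 returns without a running cluster, the following sketch rebuilds the id, src, and dst layout in an in-memory SQLite database and runs the same kind of join. SQLite is only a stand-in for Spark SQL here, and the rows and the simple string ids (v1, v2, v3) are hypothetical; real DSE Graph ids will look different. The second query is an assumed extension that joins the edge table to itself to follow two hops.

```python
import sqlite3

# In-memory stand-in for dse_graph.gods_vertices and dse_graph.gods_edges.
# The rows below are hypothetical sample data, not the real gods graph.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE gods_vertices (id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE gods_edges (src TEXT, dst TEXT);
    INSERT INTO gods_vertices VALUES
        ('v1', 'Zeus'), ('v2', 'Hercules'), ('v3', 'Cerberus');
    INSERT INTO gods_edges VALUES ('v1', 'v2'), ('v2', 'v3');
""")

# One hop: ids reached by the out edges of the vertex named Zeus.
one_hop = [row[0] for row in conn.execute("""
    SELECT gods_edges.dst
    FROM gods_vertices
    JOIN gods_edges ON gods_vertices.id = gods_edges.src
    WHERE gods_vertices.name = 'Zeus'
""")]

# Two hops: join the edge table to itself, matching the dst of the
# first edge to the src of the second edge.
two_hop = [row[0] for row in conn.execute("""
    SELECT e2.dst
    FROM gods_vertices v
    JOIN gods_edges e1 ON v.id = e1.src
    JOIN gods_edges e2 ON e1.dst = e2.src
    WHERE v.name = 'Zeus'
""")]

print(one_hop)  # ['v2']
print(two_hop)  # ['v3']
```

The two SELECT statements use only the id, name, src, and dst columns described above, so they should carry over to the Spark SQL shell after USE dse_graph; that has not been verified against a live cluster here.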

What’s next

The same queries also work from the Spark shell, where spark.sql() runs the statements, or through the JDBC/ODBC driver and the Spark SQL Thrift Server.
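As a rough sketch of the spark.sql() route, the helpers below build the table names from the <graph name>_vertices and <graph name>_edges convention and pass the join to spark.sql(). The function names table_names, out_edge_query, and reachable_ids are hypothetical, and the code assumes that spark is an existing SparkSession, such as the one a Spark shell provides.

```python
import re


def table_names(graph_name):
    """Return the (vertices, edges) table names in the dse_graph database."""
    # Illustrative guard: accept only plain identifiers as graph names.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", graph_name):
        raise ValueError(f"unexpected graph name: {graph_name!r}")
    return (f"dse_graph.{graph_name}_vertices",
            f"dse_graph.{graph_name}_edges")


def out_edge_query(graph_name):
    """Build the join that lists vertex ids reached by out edges."""
    vertices, edges = table_names(graph_name)
    return f"SELECT e.dst FROM {vertices} v JOIN {edges} e ON v.id = e.src"


def reachable_ids(spark, graph_name):
    """Run the join through spark.sql(); `spark` is assumed to be a SparkSession."""
    return spark.sql(out_edge_query(graph_name))
```

In a shell where spark is already defined, reachable_ids(spark, "gods").show() would print the resulting rows.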


© 2024 DataStax | Privacy policy | Terms of use
