Getting Started

First, make sure you have the DSE Graph extension properly installed.

You might be interested in reading the DSE Graph Getting Started documentation to understand the basics of creating a graph and its schema.

Graph Traversal Queries

The base DSE driver provides Session.execute_graph, which allows users to execute traversal query strings. Here is a simple example:

session.execute_graph("g.addV('genre').property('genreId', 1).property('name', 'Action').next();")
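When the values come from application code, one way to avoid assembling the query string by hand is to pass them as named parameters. The following is a minimal sketch, assuming `session` is a connected DSE session using the default (Groovy string) graph profile and that the second argument of `execute_graph` is a dict of bindings; `add_genre` is a hypothetical helper name, not part of the driver.

```python
def add_genre(session, genre_id, name):
    """Add a 'genre' vertex, passing the values as bound parameters.

    `add_genre` is a hypothetical helper; `session` is assumed to be a
    connected DSE Session.
    """
    query = ("g.addV('genre')"
             ".property('genreId', genre_id)"
             ".property('name', genre_name)")
    # The dict keys are the variable names referenced inside the query string.
    return session.execute_graph(
        query, {'genre_id': genre_id, 'genre_name': name})
```

Calling `add_genre(session, 1, 'Action')` sends the same traversal as the example above, with the values kept out of the string.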

Since graph queries can be very complex, working with strings is not very convenient and is hard to maintain. This module provides a Python API for specifying graph traversals with TinkerPop. These native traversal queries can be executed explicitly, with a DSE Session object, or implicitly:

g = DseGraph.traversal_source(session=dse_session)
g.addV('genre').property('genreId', 1).property('name', 'Action').next()

# implicit execution caused by iterating over results
for v in g.V().has('genre', 'name', 'Drama').in_('belongsTo').valueMap():
    print(v)

Some Python types, such as datetime values and the DSE geometric types, are also supported transparently:

from datetime import datetime
from dse.util import Polygon
g.addV('person').property('name', 'Mike').property('birthday', datetime(1984, 3, 11)). \
    property('house_yard', Polygon(((30, 10), (40, 40), (20, 40), (10, 20), (30, 10))))

Configuring a Traversal Execution Profile

The DSE Graph extension takes advantage of configuration profiles to allow different execution configurations for the various query handlers. Graph traversal execution requires a custom execution profile that enables Gremlin bytecode as the query language. Here is how to accomplish this configuration:

from dse.cluster import Cluster, EXEC_PROFILE_GRAPH_DEFAULT
from dse_graph import DseGraph

ep = DseGraph.create_execution_profile('graph_name')
cluster = Cluster(execution_profiles={EXEC_PROFILE_GRAPH_DEFAULT: ep})
session = cluster.connect()

g = DseGraph.traversal_source(session)  # Build the GraphTraversalSource
print(g.V().toList())  # Traverse the Graph

Note that the execution profile created with DseGraph.create_execution_profile cannot be used for Groovy string queries.

If you want to change execution property defaults, see the Execution Profile documentation for a more general discussion of the API. Graph traversal queries use the same execution profile defined for DSE Graph; to change its default properties, refer to the DSE Graph query documentation page.

Explicit Graph Traversal Execution with a DSE Session

Traversal queries can be executed explicitly using session.execute_graph or session.execute_graph_async. These functions return results as DSE graph types. If you are familiar with DSE queries or need asynchronous execution, you might prefer this approach. Below is an example of explicit execution. For this example, assume the schema has been generated as above:

from dse.cluster import Cluster, EXEC_PROFILE_GRAPH_DEFAULT
from dse_graph import DseGraph
from pprint import pprint

# create a tinkerpop graphson2 ExecutionProfile
ep = DseGraph.create_execution_profile('graph_name')
cluster = Cluster(execution_profiles={EXEC_PROFILE_GRAPH_DEFAULT: ep})
session = cluster.connect()

g = DseGraph.traversal_source(session=session)
addV_query = DseGraph.query_from_traversal(
    g.addV('genre').property('genreId', 1).property('name', 'Action')
)
v_query = DseGraph.query_from_traversal(g.V())

for result in session.execute_graph(addV_query):
    pprint(result.value)
for result in session.execute_graph(v_query):
    pprint(result.value)
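For the asynchronous variant, here is a rough sketch. It assumes `execute_graph_async` returns a future-like object whose `result()` blocks until the response arrives, and that `queries` is a list of statements built with `DseGraph.query_from_traversal`. `run_concurrently` is a hypothetical helper name, not part of the driver.

```python
def run_concurrently(session, queries):
    """Submit every query without blocking, then gather the results in order.

    `run_concurrently` is a hypothetical helper; `session` is assumed to be
    a connected DSE Session configured with the traversal execution profile.
    """
    # Each call returns immediately with a future for the pending response.
    futures = [session.execute_graph_async(query) for query in queries]
    # result() waits for that query to finish and raises if it failed.
    return [list(future.result()) for future in futures]
```

With the statements from the example above, `run_concurrently(session, [addV_query, v_query])` would return one list of results per query.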

Implicit Graph Traversal Execution with TinkerPop

Using the dse_graph.DseGraph class, you can build a GraphTraversalSource that will execute queries on a DSE session without explicitly passing anything to that session. We call this implicit execution because the Session is not explicitly involved. Everything is managed internally by TinkerPop while traversing the graph and the results are TinkerPop types as well.

For example:

# Build the GraphTraversalSource
g = DseGraph.traversal_source(session)
# implicitly execute the query by traversing the TraversalSource
g.addV('genre').property('genreId', 1).property('name', 'Action').next()
# view the results of the execution
pprint(g.V().toList())
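With implicit execution, the chained calls only build up the traversal; the query is sent when a terminal step such as next() or toList() is called, or when the traversal is iterated. A minimal sketch, assuming `g` is a GraphTraversalSource built as above and that the graph has 'genre' vertices with a 'name' property (`genre_names` is a hypothetical helper):

```python
def genre_names(g):
    """Return the names of all 'genre' vertices as a plain Python list.

    `genre_names` is a hypothetical helper; `g` is assumed to be a
    GraphTraversalSource bound to a DSE session.
    """
    # V(), hasLabel() and values() only describe the traversal;
    # the terminal step toList() triggers the execution.
    return g.V().hasLabel('genre').values('name').toList()
```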

Specify the Execution Profile Explicitly

If you don’t want to change the default graph execution profile (EXEC_PROFILE_GRAPH_DEFAULT), you can register a new one as usual and use it explicitly. Here is an example:

from dse.cluster import Cluster
from dse_graph import DseGraph

cluster = Cluster()
ep = DseGraph.create_execution_profile('graph_name')
cluster.add_execution_profile('graph_traversal', ep)
session = cluster.connect()

g = DseGraph.traversal_source()
query = DseGraph.query_from_traversal(g.V())
session.execute_graph(query, execution_profile='graph_traversal')

You can also create multiple GraphTraversalSources and use them with the same execution profile (for different graphs):

g_movies = DseGraph.traversal_source(session, graph_name='movies', execution_profile=ep)
g_series = DseGraph.traversal_source(session, graph_name='series', execution_profile=ep)

print(g_movies.V().toList())  # Traverse the movies Graph
print(g_series.V().toList())  # Traverse the series Graph

Batch Queries

DSE Graph supports batch queries using a TraversalBatch object instantiated with DseGraph.batch(). A TraversalBatch allows you to execute multiple graph traversals in a single atomic transaction. A traversal batch is executed with Session.execute_graph(), or with TraversalBatch.execute() if it is bound to a DSE session.

Whichever way you choose to execute the traversal batch, you need to configure the execution profile accordingly. Here is an example:

from dse.cluster import Cluster
from dse_graph import DseGraph

ep = DseGraph.create_execution_profile('graph_name')
cluster = Cluster(execution_profiles={'graphson2': ep})
session = cluster.connect()

g = DseGraph.traversal_source()

To execute the batch using Session.execute_graph(), you need to convert the batch to a GraphStatement:

batch = DseGraph.batch()

batch.add(
    g.addV('genre').property('genreId', 1).property('name', 'Action'))
batch.add(
    g.addV('genre').property('genreId', 2).property('name', 'Drama'))  # Don't use `.next()` with a batch

graph_statement = batch.as_graph_statement()
graph_statement.is_idempotent = True  # configure any Statement parameters if needed...
session.execute_graph(graph_statement, execution_profile='graphson2')

To execute the batch using TraversalBatch.execute(), you need to bind the batch to a DSE session:

batch = DseGraph.batch(session, 'graphson2')  # bind the session and execution profile

batch.add(
    g.addV('genre').property('genreId', 1).property('name', 'Action'))
batch.add(
    g.addV('genre').property('genreId', 2).property('name', 'Drama'))  # Don't use `.next()` with a batch

batch.execute()
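When the vertices come from a list rather than being written out one by one, the batch can be filled in a loop. A minimal sketch, assuming `batch` is a TraversalBatch and `g` a traversal source as above; `add_genres` is a hypothetical helper name:

```python
def add_genres(batch, g, genres):
    """Queue one addV traversal per (genre_id, name) pair on `batch`.

    `add_genres` is a hypothetical helper; `batch` is assumed to come from
    DseGraph.batch() and `g` from DseGraph.traversal_source().
    """
    for genre_id, name in genres:
        # No terminal step such as next(): the batch takes the
        # unexecuted traversal and runs it later with the others.
        batch.add(
            g.addV('genre').property('genreId', genre_id).property('name', name))
    return batch
```

For example, `add_genres(batch, g, [(1, 'Action'), (2, 'Drama')])` queues the same two traversals as above; the batch is then executed with either of the two approaches shown.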

DSL (Domain Specific Languages)

DSLs are very useful for writing better domain-specific APIs and avoiding code duplication. Let’s say we have a graph of people and we produce a lot of statistics based on age. All graph traversal queries of our application would look like:

g.V().hasLabel("people").has("age", P.gt(21))...

which is verbose and quite annoying to repeat across a code base. Let’s create a DSL:

from gremlin_python.process.graph_traversal import GraphTraversal, GraphTraversalSource
from gremlin_python.process.traversal import P

class MyAppTraversal(GraphTraversal):

    def younger_than(self, age):
        return self.has("age", P.lt(age))

    def older_than(self, age):
        return self.has("age", P.gt(age))

class MyAppTraversalSource(GraphTraversalSource):

    def __init__(self, *args, **kwargs):
        super(MyAppTraversalSource, self).__init__(*args, **kwargs)
        # Make every traversal started from this source a MyAppTraversal
        self.graph_traversal = MyAppTraversal

    def people(self):
        return self.get_graph_traversal().V().hasLabel("people")

Now we can use our DSL, which is a lot cleaner:

from dse_graph import DseGraph

# ...
g = DseGraph.traversal_source(session=session, traversal_class=MyAppTraversalSource)

g.people().younger_than(21)...
g.people().older_than(30)...
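The chaining works because the source creates its traversals from the class stored in `graph_traversal`, and every step returns the traversal itself. The toy below illustrates only that mechanism. `ToyTraversal`, `ToySource`, `PeopleTraversal` and `PeopleSource` are hypothetical stand-ins written for this sketch, not gremlin_python classes, and they record steps instead of talking to a graph.

```python
class ToyTraversal(object):
    """Stand-in traversal: records steps and returns self so calls chain."""

    def __init__(self):
        self.steps = []

    def hasLabel(self, label):
        self.steps.append(('hasLabel', label))
        return self

    def has(self, key, predicate):
        self.steps.append(('has', key, predicate))
        return self


class ToySource(object):
    """Stand-in source: builds traversals from self.graph_traversal."""

    def __init__(self):
        self.graph_traversal = ToyTraversal

    def V(self):
        traversal = self.graph_traversal()
        traversal.steps.append(('V',))
        return traversal


class PeopleTraversal(ToyTraversal):
    def older_than(self, age):
        # A DSL step is just a named shortcut for built-in steps.
        return self.has('age', ('gt', age))


class PeopleSource(ToySource):
    def __init__(self):
        super(PeopleSource, self).__init__()
        # Swap in the DSL traversal so custom steps are available after V().
        self.graph_traversal = PeopleTraversal

    def people(self):
        return self.V().hasLabel('people')


steps = PeopleSource().people().older_than(30).steps
```

Here `steps` holds the recorded V, hasLabel and has steps in call order, which mirrors how `g.people().older_than(30)` expands into the longer traversal shown at the start of this section.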

To see a more complete example of a DSL, see the Python killrvideo DSL app.