DSE Graph Queries

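Use Session.execute_graph() or Session.execute_graph_async() to execute Gremlin queries in DSE Graph. The DSE driver defines three Execution Profiles suitable for graph execution: EXEC_PROFILE_GRAPH_DEFAULT, EXEC_PROFILE_GRAPH_SYSTEM_DEFAULT, and EXEC_PROFILE_GRAPH_ANALYTICS_DEFAULT.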

See Getting Started and Execution Profile documentation for more detail on working with profiles.

The DSE driver executes graph queries over the Cassandra native protocol:

from dse.cluster import Cluster, GraphExecutionProfile, EXEC_PROFILE_GRAPH_DEFAULT, EXEC_PROFILE_GRAPH_SYSTEM_DEFAULT
from dse.graph import GraphOptions

# create the default execution profile pointing at a specific graph
graph_name = 'test'
ep = GraphExecutionProfile(graph_options=GraphOptions(graph_name=graph_name))
cluster = Cluster(execution_profiles={EXEC_PROFILE_GRAPH_DEFAULT: ep})
session = cluster.connect()

# use the system execution profile (or one with no graph_options.graph_name set) when accessing the system API
session.execute_graph("system.graph(name).ifNotExists().create()", {'name': graph_name},
                      execution_profile=EXEC_PROFILE_GRAPH_SYSTEM_DEFAULT)

# ... set dev mode or configure graph schema ...

result = session.execute_graph('g.addV("name", "John", "age", 35)')  # uses the default execution profile
vertex = result[0]
type(vertex)  # dse.graph.Vertex

session.execute_graph("system.graph(name).drop()", {'name': graph_name},
                      execution_profile=EXEC_PROFILE_GRAPH_SYSTEM_DEFAULT)
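Session.execute_graph_async() takes the same query and parameters but returns a future instead of blocking. The following is a minimal sketch, assuming the returned future exposes result() as the driver's response futures do. The helper name fetch_graph_rows is hypothetical, and session is assumed to be an already-connected Session.

```python
def fetch_graph_rows(session, query, parameters=None):
    """Submit a graph query without blocking, then wait for its rows.

    `session` is assumed to be a connected dse.cluster.Session; this
    helper is illustrative and not part of the driver.
    """
    future = session.execute_graph_async(query, parameters)
    # Other work could happen here while the query is in flight.
    # result() blocks until the response arrives and re-raises any query error.
    return list(future.result())
```

Because the helper only relies on execute_graph_async() and result(), it can be exercised against a stub session in unit tests without a running cluster.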

By default (with GraphExecutionProfile.row_factory set to graph.graph_object_row_factory()), known graph result types are unpacked and returned as specialized types (Vertex, Edge). Any other result is returned as a graph.Result, which holds the graph result parsed from JSON and unwrapped from its outer dict. Result provides convenience accessors for top-level properties by name (for example, type or properties) and for list elements by index:

# dicts with `__getattr__` or `__getitem__`
result = session.execute_graph("[[key_str: 'value', key_int: 3]]", execution_profile=EXEC_PROFILE_GRAPH_SYSTEM_DEFAULT)[0]  # Using system exec just because there is no graph defined
result  # dse.graph.Result({u'key_str': u'value', u'key_int': 3})
result.value  # {u'key_int': 3, u'key_str': u'value'} (dict)
result.key_str  # u'value'
result.key_int  # 3
result['key_str']  # u'value'
result['key_int']  # 3

# lists with `__getitem__`
result = session.execute_graph('[[0, 1, 2]]', execution_profile=EXEC_PROFILE_GRAPH_SYSTEM_DEFAULT)[0]
result  # dse.graph.Result([0, 1, 2])
result.value  # [0, 1, 2] (list)
result[1]  # 1 (list[1])
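The access pattern shown above can be pictured as thin delegation to the parsed value. The class below is a hypothetical stand-in written for illustration only; it is not the driver's implementation of graph.Result, and the name ResultSketch is an assumption.

```python
class ResultSketch(object):
    """Hypothetical stand-in for dse.graph.Result, for illustration only."""

    def __init__(self, value):
        self.value = value  # parsed JSON payload: a dict, list, or scalar

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        if name == 'value':
            # Avoid infinite recursion if `value` was never set.
            raise AttributeError(name)
        try:
            return self.value[name]
        except (KeyError, TypeError, IndexError):
            raise AttributeError(name)

    def __getitem__(self, key):
        # Works for dict keys and list indexes alike.
        return self.value[key]


as_dict = ResultSketch({'key_str': 'value', 'key_int': 3})
as_list = ResultSketch([0, 1, 2])
```

With these objects, as_dict.key_str and as_dict['key_str'] reach the same entry, while as_list[1] indexes the wrapped list.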

You can use a different row factory by setting Session.default_graph_row_factory or passing it to Session.execute_graph(). For example, graph.single_object_row_factory() returns the JSON result string, unparsed. graph.graph_result_row_factory() returns parsed but unmodified results, so all metadata is retained (unlike graph.graph_object_row_factory(), which sheds some metadata as attributes and properties are unpacked). These results also provide convenience methods for converting to known types (as_vertex(), as_edge(), as_path()).
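To make the difference between an unparsed and a parsed result concrete, here is a rough sketch of two row factories. It assumes each raw row is a 1-tuple holding a JSON string whose outer dict wraps the payload under a "result" key, and that a row factory is called with (column_names, rows). The function names and the sample row are hypothetical; the driver's own factories additionally wrap parsed values in Result or specialized types.

```python
import json

# Assumed shape of one raw row: a 1-tuple holding a JSON string whose
# outer dict wraps the payload under a "result" key.
RAW_ROWS = [('{"result": {"key_str": "value", "key_int": 3}}',)]


def raw_string_factory(column_names, rows):
    """Hand back the JSON text of each row untouched (unparsed)."""
    return [row[0] for row in rows]


def parsed_factory(column_names, rows):
    """Decode each row's JSON and drop the outer dict."""
    return [json.loads(row[0])["result"] for row in rows]


unparsed = raw_string_factory(['gremlin'], RAW_ROWS)
parsed = parsed_factory(['gremlin'], RAW_ROWS)
```

Here unparsed holds plain strings that the caller must decode, while parsed holds ordinary dicts ready for key access.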

Vertex and Edge properties are never unpacked, since their types are unknown. If you know your graph schema and want to deserialize properties, use GraphSON1TypeDeserializer. It provides convenience methods that deserialize by type (e.g. deserialize_date, deserialize_uuid, deserialize_polygon). Example:

# ...
from dse.graph import GraphSON1TypeDeserializer

row = session.execute_graph("g.V().toList()")[0]
value = row.properties['my_property_key'][0].value  # accessing the VertexProperty value
value = GraphSON1TypeDeserializer.deserialize_timestamp(value)

print(value)  # 2017-06-26 08:27:05
print(type(value))  # <type 'datetime.datetime'> (<class 'datetime.datetime'> on Python 3)

Named parameters are passed in a dict to cluster.Session.execute_graph():

result_set = session.execute_graph('[a, b]', {'a': 1, 'b': 2}, execution_profile=EXEC_PROFILE_GRAPH_SYSTEM_DEFAULT)
[r.value for r in result_set]  # [1, 2]

The following Python types can be passed as named parameters and are serialized automatically to their graph representation:

DSE Graph    Python
-----------  ------------------------------------------------------
boolean      bool
bigint       long, int (PY3)
int          int
smallint     int
varint       int
float        float
double       double
uuid         uuid.UUID
Decimal      Decimal
inet         str
timestamp    datetime.datetime
date         datetime.date
time         datetime.time
duration     datetime.timedelta
point        Point
linestring   LineString
polygon      Polygon
blob         bytearray, buffer (PY2), memoryview (PY3), bytes (PY3)

Example:

import datetime
from dse.util import Polygon

session.execute_graph("""
  g.addV('all_types').
  property('blob', blob_value).
  property('timestamp', timestamp_value).
  property('polygon', polygon_value).toList()
""", {
  'timestamp_value': datetime.datetime.now(),
  'blob_value': bytearray(b'hello world'),  # bytes literal works on Python 2 and 3
  'polygon_value': Polygon(((30, 10), (40, 40), (20, 40), (10, 20), (30, 10)))
})
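Before sending a parameter dict like the one above, it can help to check that every value has one of the Python types in the table. The helper below is a minimal sketch for Python 3; unlisted_parameters and TABLE_TYPES are hypothetical names, not driver APIs. Geometry classes such as Polygon are passed in through extra_types so the sketch runs without the driver installed. A name reported here is not necessarily rejected by the driver; it only means the value's type is absent from the table.

```python
import datetime
import decimal
import uuid

# Python 3 types listed in the table above (geometry classes are supplied
# by the caller so this sketch does not need the driver installed).
TABLE_TYPES = (
    bool, int, float, str, bytes, bytearray, memoryview,
    uuid.UUID, decimal.Decimal,
    datetime.datetime, datetime.date, datetime.time, datetime.timedelta,
)


def unlisted_parameters(parameters, extra_types=()):
    """Return the sorted names of parameters whose type is not in the table."""
    allowed = TABLE_TYPES + tuple(extra_types)
    return sorted(name for name, value in parameters.items()
                  if not isinstance(value, allowed))
```

For the example above, a call such as unlisted_parameters(params, extra_types=(Polygon,)) would be expected to return an empty list, assuming Polygon is imported from the driver.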

As with all Execution Profile parameters, graph options can be set in the cluster default (as shown in the first example) or specified per execution:

ep = session.execution_profile_clone_update(EXEC_PROFILE_GRAPH_DEFAULT,
                                            graph_options=GraphOptions(graph_name='something-else'))
session.execute_graph(statement, execution_profile=ep)