Using indexes
Using indexes for graph queries.
Indexes can be used in graph traversal queries to trim down the number of vertices or edges
that are initially fetched. Remember that a search index must be used if two or more properties
are needed, as only search indexes can meet multiple conditions. In general, the traversal step
involves a vertex or edge label and can include a property value, including collections, tuples,
and user-defined types (UDTs). In a traversal, the step following g.V() is generally the step in which an index is consulted. If a mid-traversal V() step is called, an additional indexed step can be consulted to narrow the list of vertices that will be traversed.
Indexing a vertex
The following traversal lists the names of the recipes that Emeril LAGASSE created:
g.V().has('person', 'name', 'Emeril LAGASSE').out('created').values('name')
results in:
==>Wild Mushroom Stroganoff
==>Spicy Meatloaf
This graph traversal uses a search index for the traversal step has('person', 'name', 'Emeril LAGASSE'), which identifies the vertex label and the indexed property. After finding the initial vertex to traverse from, the outgoing created edges are walked and the adjacent vertices are listed by name.
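To check ahead of time which index a traversal like this one relies on, the schema can be asked to analyze it. The following is a minimal sketch, assuming that the schema.indexFor(<your_traversal>).analyze() call named in the error output later on this page accepts the traversal as written. The output depends on the schema that is in place, so none is shown here.

```groovy
// Ask the schema which index, if any, this traversal would need.
// Assumption: analyze() only reports its findings and does not modify the schema.
schema.indexFor(g.V().has('person', 'name', 'Emeril LAGASSE')).analyze()
```

Running the analysis before the traversal itself can surface a missing index without waiting for the query to fail.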
The index usage can be verified with the profile() method:
g.V().has('person', 'name', 'Emeril LAGASSE').out('created').values('name').profile()
with the detailed information:
==>Traversal Metrics
Step Count Traversers Time (ms) % Dur
=============================================================================================================
__.V().hasLabel("person").has("name","Emeril LA... 1 1 27.935 78.56
CQL statements ordered by overall duration 24.137
\_1=SELECT * FROM food.person WHERE solr_query = '{"q":"*:*", "fq":["name:Emeril\\ LAGASSE"]}' LIMIT 2147
483647 / Duration: 24 ms / Count: 1
HasStep([~label.eq(person), name.eq(Emeril LAGA... 1 1 0.399 1.12
__.out().hasLabel("created") 2 2 6.161 17.33
CQL statements ordered by overall duration 3.369
\_1=SELECT * FROM food.person__created__recipe WHERE person_person_id = ? / Duration: 2 ms / Count: 1 / I
ndex type: Table: person__created__recipe
\_2=SELECT * FROM food.recipe WHERE recipe_id = ? / Duration: 1 ms / Count: 2 / Index type: Table: recipe
PropertiesStep([name],value) 2 2 0.624 1.76
NoOpBarrierStep(2500) 2 2 0.190 0.53
ReferenceElementStep 2 2 0.248 0.70
>TOTAL - - 35.559 -
The first CQL statement uses a search index, WHERE solr_query = '{"q":"*:*", "fq":["name:Emeril\\ LAGASSE"]}', to find the person. Two more CQL statements then find the recipes that are adjacent to that particular person vertex. Finally, the recipe names are retrieved.
Indexing an edge
The following query finds the reviews that John DOE wrote that have a rating greater than or equal to 3 stars:
g.V().has('person','name','John DOE').outE().has('stars', gte(3))
results in:
==>e[dseg:/person-reviewed-recipe/46ad98ac-f5c9-4411-815a-f81b3b667921/2005][dseg:/person/46ad98ac-f5c9-4411-815a-f81b3b667921-reviewed->dseg:/recipe/2005]
==>e[dseg:/person-reviewed-recipe/46ad98ac-f5c9-4411-815a-f81b3b667921/2001][dseg:/person/46ad98ac-f5c9-4411-815a-f81b3b667921-reviewed->dseg:/recipe/2001]
Running profile() on the query shows that a search index query was used in the initial step, and that in the second step the person__reviewed__recipe_by_person_person_id_stars materialized view index was used to cut the latency of the query:
g.V().has('person','name','John DOE').outE().has('stars', gte(3)).profile()
with the detailed information:
==>Traversal Metrics
Step Count Traversers Time (ms) % Dur
=============================================================================================================
__.V().hasLabel("person").has("name","John DOE") 1 1 13.626 87.00
CQL statements ordered by overall duration 12.376
\_1=SELECT * FROM food.person WHERE solr_query = '{"q":"*:*", "fq":["name:John\\ DOE"]}' LIMIT 2147483647
/ Duration: 12 ms / Count: 1
HasStep([~label.eq(person), name.eq(John DOE)]) 1 1 0.110 0.70
__.outE().has("stars",P.gte((int) 3)) 2 2 1.589 10.15
CQL statements ordered by overall duration 0.573
\_1=SELECT * FROM food.person__reviewed__recipe_by_person_person_id_stars WHERE person_person_id = ? AND
stars >= ? / Duration: < 1 ms / Count: 1 / Index type: Materialized view
HasStep([stars.gte(3)]) 2 2 0.177 1.14
ReferenceElementStep 2 2 0.158 1.01
>TOTAL - - 15.662 -
When indexing seems to be broken
g.V().
  hasLabel('person').has('name', 'John DOE').
  has('badge', containsValue('2016-01-01' as LocalDate)).
  values('name')
results in:
One or more indexes are required to execute the traversal: g.V().hasLabel("person").has("name","John DOE").has("badge",containsValue(java.time.LocalDate.of(2016, 1, 1))).values("name")
Failed step: __.V().hasLabel("person").has("badge",containsValue(java.time.LocalDate.of(2016, 1, 1))).has("name","John DOE")
CQL execution: No table or view could satisfy the query 'SELECT * FROM food.person WHERE badge CONTAINS ? AND name = ?'
'schema.indexFor(<your_traversal>).analyze()' can't suggest any indexes to create as some steps in your traversal are not supported yet.
Alternatively consider using:
g.with('ignore-unindexed') to ignore unindexed traversal. Your results may be incomplete.
g.with('allow-filtering') to allow filtering. This may have performance implications.
Since search indexes cannot index map collections, this query cannot be completed as presented. However, a mid-traversal V() step can identify whether or not John DOE has a badge that meets the requirements:
g.V().has('person','name', 'John DOE').as('a').
  V().has('person','badge', containsValue('2016-01-01' as LocalDate)).as('b').
  select('a','b').
  by('name').by('badge')
results in:
==>{a=John DOE, b={gold=2017-01-01, silver=2016-01-01}}
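The error output shown earlier in this section also names two traversal options. The following hedged sketch applies them to the failing query, using only the g.with('allow-filtering') and g.with('ignore-unindexed') options quoted in that message. Neither variant has been profiled here, and the message itself warns about performance implications and incomplete results.

```groovy
// Option 1: allow filtering instead of requiring an index.
// The error message warns that this may have performance implications.
g.with('allow-filtering').
  V().hasLabel('person').has('name', 'John DOE').
  has('badge', containsValue('2016-01-01' as LocalDate)).
  values('name')

// Option 2: ignore the unindexed part of the traversal.
// The error message warns that the results may be incomplete.
g.with('ignore-unindexed').
  V().hasLabel('person').has('name', 'John DOE').
  has('badge', containsValue('2016-01-01' as LocalDate)).
  values('name')
```

Both options trade correctness or latency for convenience, so the indexed mid-traversal form is likely the safer default for production queries.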
Profiling the query shows the index that is consulted for each V() step:
g.V().has('person','name', 'John DOE').as('a').
  V().has('person','badge', containsValue('2016-01-01' as LocalDate)).as('b').
  select('a','b').
  by('name').by('badge').
  profile()
results in:
==>Traversal Metrics
Step Count Traversers Time (ms) % Dur
=============================================================================================================
__.V().hasLabel("person").has("name","John DOE") 1 1 12.464 73.33
CQL statements ordered by overall duration 10.958
\_1=SELECT * FROM food.person WHERE solr_query = '{"q":"*:*", "fq":["name:John\\ DOE"]}' LIMIT 2147483647
/ Duration: 10 ms / Count: 1
HasStep([~label.eq(person), name.eq(John DOE)])... 1 1 0.174 1.02
__.V().hasLabel("person").has("badge",containsV... 1 1 3.402 20.02
CQL statements ordered by overall duration 2.339
\_1=SELECT * FROM food.person WHERE badge CONTAINS ? / Duration: 2 ms / Count: 1 / Index type: Secondary
index
HasStep([~label.eq(person), badge.containsValue... 1 1 0.573 3.38
SelectStep(last,[a, b],[value(name), value(badg... 1 1 0.100 0.59
ReferenceElementStep 1 1 0.282 1.66
>TOTAL - - 16.997
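The profile output reports that the badge lookup was served by a secondary index. As a rough sketch of how such an index might be declared, assuming the DataStax Graph schema API exposes secondaryIndex() with an indexValues() option for collections (worth checking against the schema reference for the version in use), and using a hypothetical index name:

```groovy
// Hypothetical index name; not taken from this page.
// Assumption: indexValues() indexes the values of the badge map,
// which is what the containsValue() predicate looks up.
schema.vertexLabel('person').
  secondaryIndex('person_2i_by_badge').
  ifNotExists().
  by('badge').
  indexValues().
  create()
```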
Next steps
This page shows a few examples of using indexes for graph querying, but is not exhaustive. Search indexes, in particular, have a variety of predicates that can be used for text, geospatial, and non-text indexing.