Using indexes
Global indexes can be used in graph traversal queries for the first traversal step reached after the V()
step, and are used to trim down the number of vertices that are initially fetched.
Remember that a search index must be used if two or more properties are used for global indexing.
In general, the traversal step involves a vertex label and can include a property key and a particular property value.
In a traversal, the step following g.V() is generally the step in which an index will be consulted.
If a mid-traversal V() step is called, then an additional indexed step can be consulted to narrow the list of vertices that will be traversed.
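As a minimal sketch of a mid-traversal V() step, the query below looks up one vertex, then restarts from V() to look up a second. The recipe name is a hypothetical value, and the sketch assumes name is indexed on both the author and recipe vertex labels.

```groovy
// First indexed lookup: vertex label and property key follow g.V().
g.V().has('author', 'name', 'Julia Child').as('a').
  // Mid-traversal V(): the has() step that follows can consult an index again.
  // 'Beef Bourguignon' is a hypothetical recipe name used for illustration.
  V().has('recipe', 'name', 'Beef Bourguignon').as('r').
  select('a', 'r').by('name')
```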
Graph traversals will only use indexes if both the vertex label and the property key are specified. If either is missing, indexing will not be used and a full graph scan for the property key can result. If full graph scans are disabled, the query fails, as shown in this example where a property is specified but a vertex label is not:
g.V().has('name','Julia Child')
Could not find an index to answer query clause and graph.allow_scan is disabled:
((label = FridgeSensor & name WITHIN [Julia Child]) | (label = author & name WITHIN [Julia Child]) |
(label = book & name WITHIN [Julia Child]) | (label = ingredient & name WITHIN [Julia Child]) |
(label = meal & name WITHIN [Julia Child]) | (label = recipe & name WITHIN [Julia Child]) |
(label = reviewer & name WITHIN [Julia Child]))
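One way to make the failing query above eligible for an index is to name the vertex label together with the property key. The sketch below assumes that name is indexed on the author vertex label. The second variant relies on the DSE Graph schema configuration call for graph.allow_scan; treat it as an assumption to check against your DSE version, and as a development-only setting.

```groovy
// Index-eligible: both the vertex label and the property key are specified.
g.V().has('author', 'name', 'Julia Child')

// Development-only alternative (assumed DSE Graph schema API):
// allow full graph scans so that the label-less query no longer fails.
schema.config().option('graph.allow_scan').set('true')
g.V().has('name', 'Julia Child')
```

With scans enabled, the second query still does not use an index; it only stops failing, at the cost of scanning every vertex label that has a name property.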
Edge indexes and property indexes (vertex-centric indexes) can be used to narrow the query after a global index has found the starting vertex. They allow definition of the edges that will be followed or the meta-properties that will be used to further restrict the query.
Global index
The graph traversal shown uses an index to discover a particular author vertex to start the query.
g.V().has('author', 'name', 'Emeril Lagasse').out('created').values('name')
This graph traversal uses an index, if the index exists, because the traversal step has('author', 'name', 'Emeril Lagasse') identifies the vertex label and the indexed property key. After finding the initial vertex to traverse from, the outgoing created edges are walked and the adjacent vertices are listed by name. This graph traversal shows the importance of using the vertex label in combination with the property key, as two different elements, authors and recipes, use the same property key name.
Checking for the use of indexing can be accomplished with the profile() method:
gremlin> g.V().has('author', 'name', 'Emeril Lagasse').out('created').values('name').profile()
==>Traversal Metrics
Step                                                               Count  Traversers       Time (ms)    % Dur
=============================================================================================================
DsegGraphStep([~label.=(author), name.=(Emeril ...                     1           1           2.196    51.37
  query-optimizer                                                                              0.199
  query-setup                                                                                  0.004
  index-query                                                                                  0.946
DsegVertexStep(OUT,[created],vertex)                                   2           2           0.935    21.88
  query-optimizer                                                                              0.101
  query-setup                                                                                  0.000
  vertex-query                                                                                 0.282
DsegPropertiesStep([name],value)                                       2           2           1.030    24.11
  query-optimizer                                                                              0.044
  query-setup                                                                                  0.005
  vertex-query                                                                                 0.347
  vertex-query                                                                                 0.639
  query-setup                                                                                  0.000
NoOpBarrierStep(2500)                                                  2           2           0.113     2.64
                                            >TOTAL                     -           -           4.276        -
The index-query entry in the first step, DsegGraphStep, identifies the index type as materialized. If an index was not used, index-query would be missing from the profile output.
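For context, a global index of the kind consulted above might be declared as in the following sketch. It assumes the DSE Graph schema API for materialized view indexes. The index name byName is inferred from the author_p_byName table that appears in the profile output later in this section, and should be treated as an assumption.

```groovy
// Assumed schema call: a materialized view index on the name property
// of the author vertex label.
schema.vertexLabel('author').index('byName').materialized().by('name').add()

// The first has() step names both the label and the indexed key,
// so the index can be consulted.
g.V().has('author', 'name', 'Emeril Lagasse').out('created').values('name')
```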
Edge index
An edge index can narrow the query, such as this one that finds all the outgoing edges for reviews that John Doe wrote that have a rating greater than or equal to 3 stars:
g.V().has('person','name','John Doe').outE().has('stars', gte(3))
Using profile() on the query shows that a global index query was used in the initial step and that, in the second step, the ratedByStars edge index was used to cut the latency of the query.
The local() step can be used to affect how an edge index narrows a query.
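The sketch below shows how an edge index such as ratedByStars might be declared and then used inside local(). It assumes the DSE Graph schema API for edge indexes. The edge label rated is hypothetical, since the query above does not name an edge label, and the index is placed on the person vertex label to match that query.

```groovy
// Assumed schema call: index outgoing 'rated' edges (hypothetical label)
// of person vertices by the 'stars' edge property.
schema.vertexLabel('person').index('ratedByStars').outE('rated').by('stars').add()

// local() evaluates the edge filter separately for each incoming vertex,
// so the edge index is applied per starting vertex.
g.V().has('person', 'name', 'John Doe').
  local(outE('rated').has('stars', gte(3)))
```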
Property index
A property index can narrow the query, such as this one that finds the countries that Julia Child lived in, starting in the year 1961 (in this case, only one country):
g.V().has('author', 'name','Julia Child').as('author').
  local(properties('country').has('startYear', 1961).value()).as('country').
  select('author','country').
  by('name').by().profile()
gremlin> g.V().has('author', 'name','Julia Child').as('author').
......1>   local(properties('country').has('startYear', 1961).value()).as('country').
......2>   select('author','country').
......3>   by('name').by().profile()
==>Traversal Metrics
Step                                                               Count  Traversers       Time (ms)    % Dur
=============================================================================================================
DsegGraphStep(vertex,[],(label = author & name ...                     1           1           1.274    37.35
  query-optimizer                                                                              0.253
    _condition=((label = author & name = Julia Child) & (true))
  query-setup                                                                                  0.008
    _isFitted=true
    _isSorted=false
    _isScan=false
  index-query                                                                                  0.557
    _indexType=Materialized
    _usesCache=false
    _statement=SELECT "authorId" FROM "newComp"."author_p_byName" WHERE "name" = ? LIMIT ?;
               with params (java.lang.String) Julia Child, (java.lang.Integer) 50000
    _options=Options{consistency=Optional[ONE], serialConsistency=Optional.empty,
             fallbackConsistency=Optional.empty, pagingState=null, pageSize=-1, user=Optional.empty,
             waitForSchemaAgreement=true, async=true}
DsegHasStep@[person]                                                   1           1           0.060     1.76
LocalStep([DsegPropertiesStep([country],propert...                     1           1           1.300    38.12
  DsegPropertiesStep([country],property,(label ...                     1           1           1.149
    query-optimizer                                                                            0.239
      _condition=((label = country & startYear = 1961) & (true))
    query-setup                                                                                0.001
      _isFitted=true
      _isSorted=false
      _isScan=false
    vertex-query                                                                               0.564
      _usesCache=false
      _statement=SELECT * FROM "newComp"."author_p_OUT_byStartYear_p" WHERE "authorId" = ?
                 AND "~~property_key_id" = ? AND "~startYear" = ? LIMIT ? ALLOW FILTERING;
                 with params (java.lang.Integer) 1, (java.lang.Integer) 32801,
                 (java.lang.Integer) 1961, (java.lang.Integer) 50000
      _options=Options{consistency=Optional[ONE], serialConsistency=Optional.empty,
               fallbackConsistency=Optional.empty, pagingState=null, pageSize=-1, user=Optional.empty,
               waitForSchemaAgreement=true, async=true}
      _usesIndex=true
  DsegHasStep([startYear.eq(1961)])                                    1           1           0.081
  PropertyValueStep                                                    1           1           0.026
SelectStep(last,[author, country],[value(name),...                     1           1           0.720    21.13
NoOpBarrierStep(2500)                                                  1           1           0.032     0.95
DsegPropertyLoadStep                                                   1           1           0.023     0.69
                                            >TOTAL                     -           -           3.411        -
Using profile() on the query shows that a global index query was used in the initial step. The output shown here also shows that, in the second SELECT statement, the byStartYear property index was used to cut the latency of the query.
The local() step can also be handy for use with property indexes.
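A property index like byStartYear might be declared as in the sketch below. It assumes the DSE Graph schema API for property indexes, and that country is a property on author vertices carrying a startYear meta-property, as the query above implies.

```groovy
// Assumed schema call: index the 'country' property of author vertices
// by its 'startYear' meta-property.
schema.vertexLabel('author').index('byStartYear').property('country').by('startYear').add()

// local() scopes the meta-property filter to each author vertex,
// which is where the property index can be applied.
g.V().has('author', 'name', 'Julia Child').
  local(properties('country').has('startYear', 1961).value())
```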