QuickStart Exploring traversals
Explore graph data with query traversals.
About this task
Exploring the graph with graph traversals can lead to interesting conclusions. Here we’ll explore a number of traversals, to show off the power of Gremlin in creating simple queries.
As with all queries in Graph, if you are using Gremlin console, alias the graph traversal g to a graph before running them.
Procedure
-
All queries can be profiled to see what the query path is and how the query performs. The profile() step displays information about how long each portion of the command takes to run, as well as the underlying CQL command that is run to complete the Gremlin command.
g.V().has('person', 'name', 'Julia CHILD').profile()
In Studio:
Clicking on the bars in the graph in Studio will show more detail about underlying CQL commands that Graph uses to execute a query.
In Gremlin console:
==>Traversal Metrics
Step                                 Count  Traversers  Time (ms)  % Dur
=========================================================================
__.V().has("name","Julia CHILD")                           52.140  96.02
HasStep([name.eq(Julia CHILD)])                             1.987   3.66
ReferenceElementStep                                        0.174   0.32
                        >TOTAL           -           -     54.302      -
For any of the following queries, append .profile() to investigate what happens and why some queries are more efficient than others. The output is similar to the information above.
-
Graph queries have lower latency when the query is more specific. The has() step narrows the search. Compare the following queries and their profiles (by adding .profile() to the end):
dev.V().hasLabel('person')
dev.V().hasLabel('person').has('name', 'Julia CHILD')
dev.V().has('person','name', 'Julia CHILD')
Running any of these queries in Studio displays the vertex id, label, and all property values. In Gremlin console, these queries display only the vertex id; the elementMap() step must be appended to get the property values.
-
In this next traversal, has() filters vertex properties by name = Julia CHILD, as seen above. The traversal step outE() discovers the outgoing edges from that vertex with the authored label.
g.V().has('person','name','Julia CHILD').outE('authored')
In Studio, either the listing or the Raw JSON view shows the edge information.
In Gremlin console:
==>e[dseg:/person-authored-book/e7cd5752-bc0d-4157-a80f-7523add8dbcd/1001][dseg:/person/e7cd5752-bc0d-4157-a80f-7523add8dbcd-authored->dseg:/book/1001]
==>e[dseg:/person-authored-book/e7cd5752-bc0d-4157-a80f-7523add8dbcd/1003][dseg:/person/e7cd5752-bc0d-4157-a80f-7523add8dbcd-authored->dseg:/book/1003]
-
If you instead want to query for the books that all people have written, the query must be modified. The previous example retrieved edges, but not the adjacent book vertices. Add the traversal step inV() to find all the vertices that the outgoing edges point to, then print the book titles of those vertices. Notice how the chained traversal steps V().outE().inV() go from the vertices, along outgoing edges, to the adjacent vertices. The outgoing edges are filtered by a particular edge label, authored.
g.V().outE('authored').inV().values('name')
In Studio and in Gremlin console:
==>The Art of Simple Food: Notes, Lessons, and Recipes from a Delicious Revolution
==>The Art of Simple Food: Notes, Lessons, and Recipes from a Delicious Revolution
==>The Art of Simple Food: Notes, Lessons, and Recipes from a Delicious Revolution
==>The Art of Simple Food: Notes, Lessons, and Recipes from a Delicious Revolution
==>The Art of French Cooking, Vol. 1
==>The Art of French Cooking, Vol. 1
==>Simca's Cuisine: 100 Classic French Recipes for Every Occasion
==>Simca's Cuisine: 100 Classic French Recipes for Every Occasion
==>The French Chef Cookbook
-
Notice that the book titles are duplicated in the resulting list, because a listing is returned for each author. If a book has three authors, three listings are returned. The traversal step dedup() can eliminate the duplication.
g.V().outE('authored').inV().values('name').dedup()
In Studio and in Gremlin console:
==>The Art of Simple Food: Notes, Lessons, and Recipes from a Delicious Revolution
==>The Art of French Cooking, Vol. 1
==>Simca's Cuisine: 100 Classic French Recipes for Every Occasion
==>The French Chef Cookbook
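Why the duplicates appear, and what dedup() does about them, can be sketched outside of Graph. The following Python snippet is only a toy in-memory model, not the Gremlin or DataStax Graph API: the edge list, the coauthor_1 placeholder name, and the helper functions out_e_in_v and dedup are all assumptions made for illustration.

```python
# Toy in-memory model; this is NOT the DataStax Graph or TinkerPop API.
# "coauthor_1" is a hypothetical placeholder, not a name from the dataset.
edges = [
    ("Julia CHILD", "authored", "The Art of French Cooking, Vol. 1"),
    ("coauthor_1", "authored", "The Art of French Cooking, Vol. 1"),
    ("Julia CHILD", "authored", "The French Chef Cookbook"),
]


def out_e_in_v(edge_list, label):
    """Yield the target vertex of every edge with the given label.

    One result is produced per edge, which mirrors why
    V().outE('authored').inV() lists a book once per author.
    """
    for _source, edge_label, target in edge_list:
        if edge_label == label:
            yield target


def dedup(items):
    """Drop repeated items while keeping first-seen order."""
    seen = set()
    for item in items:
        if item not in seen:
            seen.add(item)
            yield item


titles = list(out_e_in_v(edges, "authored"))
unique_titles = list(dedup(titles))

print(titles)         # one entry per authored edge
print(unique_titles)  # each title once
```

The book with two authored edges shows up twice in titles and once in unique_titles, which is the same effect the two console listings above show.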
-
Refine the traversal by reinserting the has() step for a particular author. Find all the books authored by Julia Child.
g.V().has('name','Julia CHILD').outE('authored').inV().values('name')
In Studio and in Gremlin console:
==>The Art of French Cooking, Vol. 1
==>The French Chef Cookbook
-
The previous example and this example accomplish the same result. However, the number and type of traversal steps can affect performance. The traversal step outE() should be used only if the edges are explicitly required. In this example, the edges are traversed to get information about connected vertices, but the edge information is not important to the query.
g.V().has('person', 'name','Julia CHILD').out('authored').values('name')
In Studio and in Gremlin console:
==>The Art of French Cooking, Vol. 1
==>The French Chef Cookbook
The traversal step out() retrieves the connected book vertices based on the edge label authored without retrieving the edge information. In a larger graph traversal, this subtle difference can become a latency issue.
-
Additional traversal steps continue to fine-tune the results. Adding another chained has() traversal step finds only books authored by Julia Child that were published after 1967. This example also shows the use of gt, the greater-than function.
g.V().has('person', 'name','Julia CHILD').out('authored').has('publish_year', gt(1967)).values('name', 'publish_year')
In Studio and in Gremlin console:
==>The French Chef Cookbook
==>1968
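The way a chained has() step with a gt predicate narrows the result can be sketched in plain Python. This is a toy in-memory model rather than the Gremlin API: the books_by_author dictionary and the helpers gt, has, and values are hypothetical, and the publish year 1961 for the first book is an assumed value chosen only so that the filter has something to exclude.

```python
# Toy in-memory model; this is NOT the DataStax Graph or TinkerPop API.
books_by_author = {
    "Julia CHILD": [
        # publish_year 1961 is an assumed value for illustration
        {"name": "The Art of French Cooking, Vol. 1", "publish_year": 1961},
        {"name": "The French Chef Cookbook", "publish_year": 1968},
    ],
}


def gt(threshold):
    """Return a predicate that is true for values strictly greater than threshold."""
    return lambda value: value > threshold


def has(vertices, key, predicate):
    """Keep only the vertices whose property `key` satisfies the predicate."""
    return [v for v in vertices if key in v and predicate(v[key])]


def values(vertices, *keys):
    """Return the requested property values of each vertex as a tuple."""
    return [tuple(v[k] for k in keys) for v in vertices]


# The dictionary lookup stands in for has('person', 'name', ...).out('authored').
authored = books_by_author["Julia CHILD"]
results = values(has(authored, "publish_year", gt(1967)), "name", "publish_year")

print(results)
```

Unlike this sketch, which groups each book's values into one tuple, the console output above lists the name and the publish year as separate results.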
-
When developing or testing, checking the number of vertices with each vertex label can often confirm that data was read. To find the number of vertices by vertex label, use the traversal step label() followed by the traversal step groupCount(). The step groupCount() is useful for aggregating results from a previous step. Suppress the warning by starting the traversal with g.with("label-warning", false). instead of g. in the following query:
g.V().label().groupCount()
In Studio and in Gremlin console:
==>{meal=8, meal_item=3, ingredient=31, person=15, book=4, recipe=8, fridge_sensor=9, location=16, store=3, home=3}
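The aggregation that label().groupCount() performs amounts to counting how often each label occurs. A minimal Python sketch of that idea follows; it is a toy model and not the Gremlin API, and the three-vertex list is a hypothetical stand-in for the graph, so its counts do not match the console output above.

```python
from collections import Counter

# Toy in-memory model; this is NOT the DataStax Graph or TinkerPop API.
# The vertex list is a small hypothetical sample, not the full dataset.
vertices = [
    {"label": "person", "name": "Julia CHILD"},
    {"label": "book", "name": "The French Chef Cookbook"},
    {"label": "book", "name": "The Art of French Cooking, Vol. 1"},
]

# Extracting the label of each vertex plays the role of label();
# counting the occurrences of each label plays the role of groupCount().
label_counts = Counter(vertex["label"] for vertex in vertices)

print(dict(label_counts))
```

As in the console result, the outcome is a single map from each vertex label to the number of vertices that carry it.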