QuickStart Indexing
Index graph schema.
About this task
Indexing is an important subject in DSG because most graph queries in production environments cannot complete successfully without indexes.
The dev mode can bypass the need for indexes during development, but it is still important to become familiar with how indexes work.
Indexes can be created:
- manually, as materialized view indexes, secondary indexes, or search indexes
- using the index analyzer indexFor() on desired graph queries
- for the specialized case of bidirectional indexing, using inverse()
Materialized view and secondary indexes are two types of indexes that use Cassandra built-in indexing.
Materialized views are good for queries that do not require predicate-based searches.
Secondary indexes allow indexing of properties stored in collections.
Search indexes use DSE Search, which is Solr-based.
Only one search index per vertex label is allowed, but multiple properties can be included.
Note that indexes are manually added with create() for both vertex labels and edge labels.
See Creating index schema for complete examples of each type of index.
As with all queries in Graph, if you are using Gremlin console, alias the graph traversal g to a graph with |
Procedure
- Materialized view indexes
  - Discover a required materialized view index for the vertex label person using the index analyzer steps indexFor() and analyze():
    schema.indexFor(g.V().has('person', 'name', 'Julia CHILD')).analyze()
    ==>Traversal requires that the following indexes are created: schema.vertexLabel('person').materializedView('person_by_name').ifNotExists().partitionBy('name').clusterBy('person_id', Asc).create()
  - Create the required materialized view index for the vertex label person using the index analyzer steps indexFor() and apply():
    schema.indexFor(g.V().has('person', 'name', 'Julia CHILD')).apply()
    ==>Creating the following indexes: schema.vertexLabel('person').materializedView('person_by_name').ifNotExists().partitionBy('name').clusterBy('person_id', Asc).create() OK
    Note that the only change from the previous step is replacing analyze() with apply().
  - Materialized view indexes for vertex labels can also be made manually:
    schema.vertexLabel('meal').
      materializedView('meal_by_type').
      ifNotExists().
      partitionBy('type').
      waitForIndex().
      create()
    schema.vertexLabel('ingredient').
      materializedView('ingredient_by_name').
      ifNotExists().
      partitionBy('name').
      create()
    schema.vertexLabel('location').
      materializedView('location_by_name').
      ifNotExists().
      partitionBy('name').
      clusterBy('loc_id', Asc).
      create()
    schema.vertexLabel('meal_item').
      materializedView('meal_item_by_name').
      ifNotExists().
      partitionBy('name').
      clusterBy('item_id', Asc).
      create()
    schema.vertexLabel('recipe').
      materializedView('recipe_by_name').
      ifNotExists().
      partitionBy('name').
      clusterBy('recipe_id', Asc).
      create()
  - Discover a required materialized view index for the edge label person->reviewed->recipe, based on the review star rating property stars, using the index analyzer steps indexFor() and analyze():
    schema.indexFor(g.V().hasLabel('person').outE('reviewed').has('stars', 5)).analyze()
    ==>Traversal requires that the following indexes are created: schema.edgeLabel('reviewed').from('person').to('recipe').materializedView('person__reviewed__recipe_by_person_person_id_stars').ifNotExists().partitionBy(OUT, 'person_id').partitionBy('stars').clusterBy(IN, 'recipe_id', Asc).create()
  - Create the required materialized view index for the edge label person->reviewed->recipe by applying the index analyzer:
    schema.indexFor(g.V().hasLabel('person').outE('reviewed').has('stars', 5)).apply()
    ==>Creating the following indexes: schema.edgeLabel('reviewed').from('person').to('recipe').materializedView('person__reviewed__recipe_by_person_person_id_stars').ifNotExists().partitionBy(OUT, 'person_id').partitionBy('stars').clusterBy(IN, 'recipe_id', Asc).create() OK
  - Materialized view indexes for edge labels can be made manually:
    schema.edgeLabel('reviewed').
      from('person').to('recipe').
      materializedView('person__reviewed__recipe_by_person_person_id_year').
      ifNotExists().
      partitionBy(OUT, 'person_id').
      clusterBy('year', Asc).
      clusterBy(IN, 'recipe_id', Asc).
      create()
    In this case, the index clusters on year, so it supports finding recipe reviews that occurred before or after a particular year.
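To check whether a range traversal of this kind is covered by an index, the index analyzer from the earlier steps can be pointed at the traversal. The following is a minimal sketch rather than output from a real cluster: it assumes the Gremlin console is connected with g aliased to the graph, that year is stored as an integer on reviewed edges, and the value 2015 is purely illustrative.

```groovy
// Assumption: 'year' is an integer property on 'reviewed' edges.
// 2015 is a made-up value for illustration only.
// gte() is the standard TinkerPop "greater than or equal" predicate.
schema.indexFor(
    g.V().hasLabel('person').
      outE('reviewed').
      has('year', gte(2015))
).analyze()
```

As in the earlier steps, analyze() only reports; replacing it with apply() creates whatever index the analyzer recommends.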
- Secondary indexes
  - Discover a required secondary index for the vertex label recipe, based on the cuisines stored in a collection property of the recipe, using the index analyzer steps indexFor() and analyze():
    schema.indexFor(g.V().has('recipe', 'cuisine', contains('French')).values('name')).analyze()
    ==>Traversal requires that the following indexes are created: schema.vertexLabel('recipe').secondaryIndex('recipe_2i_by_cuisine').ifNotExists().by('cuisine').indexValues().create()
  - Create the required secondary index for the vertex label recipe by applying the index analyzer:
    schema.indexFor(g.V().has('recipe', 'cuisine', contains('French')).values('name')).apply()
    ==>Creating the following indexes: schema.vertexLabel('recipe').secondaryIndex('recipe_2i_by_cuisine').ifNotExists().by('cuisine').indexValues().create() OK
  - Secondary indexes for vertex labels can be made manually:
    schema.vertexLabel('person').
      secondaryIndex('person_2i_by_nickname').
      ifNotExists().
      by('nickname').
      indexValues().
      create()
    schema.vertexLabel('person').
      secondaryIndex('person_2i_by_country').
      ifNotExists().
      by('country').
      indexValues().
      create()
- Search indexes
  - Discover a search index for the vertex label recipe that requires a tokenized search of the property instructions, using the index analyzer steps indexFor() and analyze():
    schema.indexFor(g.V().has('recipe', 'instructions', token('Saute'))).analyze()
    ==>Traversal requires that the following indexes are created: schema.vertexLabel('recipe').searchIndex().ifNotExists().by('instructions').create()
  - Create the required search index for the vertex label recipe by applying the index analyzer:
    schema.indexFor(g.V().has('recipe', 'instructions', token('Saute'))).apply()
    ==>Creating the following indexes: schema.vertexLabel('recipe').searchIndex().ifNotExists().by('instructions').create() OK
  - Search indexes for vertex labels and edge labels can be made manually:
    schema.vertexLabel('recipe').
      searchIndex().
      ifNotExists().
      by('instructions').asText().
      by('name').
      by('cuisine').
      waitForIndex(30).
      create()
    // schema.indexFor(g.V().has('book', 'publish_year', neq(1960))).analyze()
    // schema.indexFor(g.V().has('book', 'publish_year', eq(1961))).analyze()
    schema.vertexLabel('book').
      searchIndex().
      ifNotExists().
      by('name').
      by('publish_year').
      create()
    schema.vertexLabel('store').
      searchIndex().
      ifNotExists().
      by('name').
      create()
    schema.vertexLabel('home').
      searchIndex().
      ifNotExists().
      by('name').
      create()
    schema.vertexLabel('fridge_sensor').
      searchIndex().
      ifNotExists().
      by('city_id').
      by('sensor_id').
      by('name').
      create()
- Geospatial search indexes
  - Discover a required search index for the vertex label location, based on the geospatial property geo_point, using the index analyzer steps indexFor() and analyze():
    schema.indexFor(g.V().hasLabel('location').has('geo_point', Geo.inside(Geo.point(-110,30),20, Geo.Unit.DEGREES)).values('name')).analyze()
    ==>Traversal requires that the following indexes are created: schema.vertexLabel('location').searchIndex().ifNotExists().by('geo_point').create()
  - Create the required search index for the vertex label location by applying the index analyzer:
    schema.indexFor(g.V().hasLabel('location').has('geo_point', Geo.inside(Geo.point(-110,30),20, Geo.Unit.DEGREES)).values('name')).apply()
    ==>Creating the following indexes: schema.vertexLabel('location').searchIndex().ifNotExists().by('geo_point').create() OK
    As with the other index types, geospatial search indexes can be created manually.
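A manual version can be sketched by entering the same statement that the index analyzer reported for the geo_point traversal. This assumes the location vertex label already defines a geo_point property, as that traversal implies; it introduces no names beyond those in the analyzer output.

```groovy
// Same statement the index analyzer reported, entered by hand.
// Assumption: vertex label 'location' already has a 'geo_point' property.
schema.vertexLabel('location').
  searchIndex().
  ifNotExists().
  by('geo_point').
  create()
```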
- inverse() edge indexes
  - Create a required inverse() edge index for the edge label person->created->recipe:
    schema.edgeLabel('created').
      from('person').to('recipe').
      materializedView('person_recipe').
      ifNotExists().
      inverse().
      create()
    ==> OK