QuickStart Indexing

Index graph schema.

About this task

Indexing is an important subject in DSG because most graph queries in production environments cannot complete successfully without indexes. Dev mode can be used to bypass the need for indexes during development, but it is still important to become familiar with indexes.

Indexes can be created:

  • manually as materialized view indexes, secondary indexes, or search indexes

  • using the index analyzer indexFor() on desired graph queries

  • for the specialized case of bidirectional indexing, using inverse()

Materialized view indexes and secondary indexes both use Cassandra built-in indexing. Materialized views are good for queries that do not require predicate-based searches. Secondary indexes allow indexing of properties stored in collections. Search indexes use DSE Search, which is Solr-based. Only one search index per vertex label is allowed, but it can include multiple properties. Indexes are added manually with create() for both vertex labels and edge labels. See Creating index schema for complete examples of each type of index.

As with all queries in Graph, if you are using Gremlin console, alias the graph traversal g to a graph with :remote config alias g food_qs.g before running any commands.
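
For reference, the alias command named above is entered on its own line in Gremlin console. This sketch assumes the graph is named food_qs, as it is throughout this QuickStart, and that the console is already connected to the server:

```groovy
:remote config alias g food_qs.g
```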

Procedure

  1. Materialized view indexes

  2. Discover a required materialized view index for the vertex label person using the index analyzer steps indexFor() and analyze():

    schema.indexFor(g.V().has('person', 'name', 'Julia CHILD')).analyze()
    ==>Traversal requires that the following indexes are created:
    schema.vertexLabel('person').materializedView('person_by_name').ifNotExists().partitionBy('name').clusterBy('person_id', Asc).create()
  3. Create the required materialized view index for the vertex label person using the index analyzer steps indexFor() and apply():

    schema.indexFor(g.V().has('person', 'name', 'Julia CHILD')).apply()

    Note that the only change from the previous step is that apply() replaces analyze().

    ==>Creating the following indexes:
    schema.vertexLabel('person').materializedView('person_by_name').ifNotExists().partitionBy('name').clusterBy('person_id', Asc).create()
    OK
  4. Materialized view indexes for vertex labels can also be made manually:

    schema.vertexLabel('meal').
      materializedView('meal_by_type').
      ifNotExists().
      partitionBy('type').
      waitForIndex().
      create()
    
    schema.vertexLabel('ingredient').
      materializedView('ingredient_by_name').
      ifNotExists().
      partitionBy('name').
      create()
    
    schema.vertexLabel('location').
      materializedView('location_by_name').
      ifNotExists().
      partitionBy('name').
      clusterBy('loc_id', Asc).
      create()
    
    schema.vertexLabel('meal_item').
      materializedView('meal_item_by_name').
      ifNotExists().
      partitionBy('name').
      clusterBy('item_id', Asc).
      create()
    
    schema.vertexLabel('recipe').
      materializedView('recipe_by_name').
      ifNotExists().
      partitionBy('name').
      clusterBy('recipe_id', Asc).
      create()
  5. Discover a required materialized view index for the edge label person->reviewed->recipe, based on the review star rating property stars, using the index analyzer steps indexFor() and analyze():

    schema.indexFor(g.V().hasLabel('person').outE('reviewed').has('stars', 5)).analyze()
    ==>Traversal requires that the following indexes are created:
    schema.edgeLabel('reviewed').
      from('person').to('recipe').
      materializedView('person__reviewed__recipe_by_person_person_id_stars').
      ifNotExists().
      partitionBy(OUT, 'person_id').
      partitionBy('stars').
      clusterBy(IN, 'recipe_id', Asc).
      create()
  6. Create the required materialized view index for the edge label person->reviewed->recipe by applying the index analyzer:

    schema.indexFor(g.V().hasLabel('person').outE('reviewed').has('stars', 5)).apply()
    ==>Creating the following indexes:
    schema.edgeLabel('reviewed').from('person').to('recipe').materializedView('person__reviewed__recipe_by_person_person_id_stars').ifNotExists().partitionBy(OUT, 'person_id').partitionBy('stars').clusterBy(IN, 'recipe_id', Asc).create()
    OK
  7. Materialized view indexes for edge labels can be made manually:

    schema.edgeLabel('reviewed').
      from('person').to('recipe').
      materializedView('person__reviewed__recipe_by_person_person_id_year').
      ifNotExists().
      partitionBy(OUT, 'person_id').
      clusterBy('year', Asc).
      clusterBy(IN, 'recipe_id', Asc).
      create()

    In this case, the index is created to discover recipe reviews that occur before or after a particular date.
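
    Once these indexes exist, the traversals that were passed to indexFor() in the steps above can be run directly. This is a minimal sketch that assumes the food_qs graph is loaded and g is aliased as described earlier. Results depend on the data, so none are shown:

    ```groovy
    // Vertex lookup by name, the traversal analyzed for person_by_name
    g.V().has('person', 'name', 'Julia CHILD')

    // Five-star reviews, the traversal analyzed for the reviewed edge index
    g.V().hasLabel('person').outE('reviewed').has('stars', 5)
    ```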

  8. Secondary indexes

  9. Discover a required secondary index for the vertex label recipe based on the cuisines stored in a collection of the recipe using the index analyzer steps indexFor() and analyze():

    schema.indexFor(g.V().has('recipe', 'cuisine', contains('French')).values('name')).analyze()
    ==>Traversal requires that the following indexes are created:
    schema.vertexLabel('recipe').secondaryIndex('recipe_2i_by_cuisine').ifNotExists().by('cuisine').indexValues().create()
  10. Create the required secondary index for the vertex label recipe by applying the index analyzer:

    schema.indexFor(g.V().has('recipe', 'cuisine', contains('French')).values('name')).apply()
    ==>Creating the following indexes:
    schema.vertexLabel('recipe').secondaryIndex('recipe_2i_by_cuisine').ifNotExists().by('cuisine').indexValues().create()
    OK
  11. Secondary indexes for vertex labels can be made manually:

    schema.vertexLabel('person').
      secondaryIndex('person_2i_by_nickname').
      ifNotExists().
      by('nickname').
      indexValues().
      create()
    schema.vertexLabel('person').
      secondaryIndex('person_2i_by_country').
      ifNotExists().
      by('country').
      indexValues().
      create()
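
    With recipe_2i_by_cuisine in place, the collection query that was analyzed above can be run as written. This sketch only repeats that traversal. It assumes that cuisine is a collection property and that some recipes hold the value 'French':

    ```groovy
    // Names of recipes whose cuisine collection contains 'French'
    g.V().has('recipe', 'cuisine', contains('French')).values('name')
    ```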
  12. Search indexes

  13. Discover a search index for the vertex label recipe that requires a tokenized search of the property instructions using the index analyzer steps indexFor() and analyze():

    schema.indexFor(g.V().has('recipe', 'instructions', token('Saute'))).analyze()
    ==>Traversal requires that the following indexes are created:
    schema.vertexLabel('recipe').searchIndex().ifNotExists().by('instructions').create()
  14. Create the required search index for the vertex label recipe by applying the index analyzer:

    schema.indexFor(g.V().has('recipe', 'instructions', token('Saute'))).apply()
    ==>Creating the following indexes:
    schema.vertexLabel('recipe').searchIndex().ifNotExists().by('instructions').create()
    OK
  15. Search indexes for vertex labels and edge labels can be made manually:

    schema.vertexLabel('recipe').
      searchIndex().
      ifNotExists().
      by('instructions').asText().
      by('name').
      by('cuisine').
      waitForIndex(30).
      create()
    // schema.indexFor(g.V().has('book', 'publish_year', neq(1960))).analyze()
    // schema.indexFor(g.V().has('book', 'publish_year', eq(1961))).analyze()
    schema.vertexLabel('book').
      searchIndex().
      ifNotExists().
      by('name').
      by('publish_year').
      create()
    schema.vertexLabel('store').
      searchIndex().
      ifNotExists().
      by('name').
      create()
    
    schema.vertexLabel('home').
      searchIndex().
      ifNotExists().
      by('name').
      create()
    
    schema.vertexLabel('fridge_sensor').
      searchIndex().
      ifNotExists().
      by('city_id').
      by('sensor_id').
      by('name').
      create()
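
    The search indexes above are meant to serve traversals like the ones below, which are taken from the analyzer calls in this section (including the two commented-out book examples). This is a sketch that assumes the corresponding vertices and properties exist in the graph; no output is shown because it depends on the data:

    ```groovy
    // Tokenized text search on recipe instructions
    g.V().has('recipe', 'instructions', token('Saute'))

    // Inequality and equality predicates on an indexed book property
    g.V().has('book', 'publish_year', neq(1960))
    g.V().has('book', 'publish_year', eq(1961))
    ```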
  16. Geospatial search indexes

  17. Discover a required search index for the vertex label location based on the geospatial property geo_point using the index analyzer steps indexFor() and analyze():

    schema.indexFor(g.V().hasLabel('location').has('geo_point', Geo.inside(Geo.point(-110,30),20, Geo.Unit.DEGREES)).values('name')).analyze()
    ==>Traversal requires that the following indexes are created:
    schema.vertexLabel('location').searchIndex().ifNotExists().by('geo_point').create()
  18. Create the required search index for the vertex label location by applying the index analyzer:

    schema.indexFor(g.V().hasLabel('location').has('geo_point', Geo.inside(Geo.point(-110,30),20, Geo.Unit.DEGREES)).values('name')).apply()
    ==>Creating the following indexes:
    schema.vertexLabel('location').searchIndex().ifNotExists().by('geo_point').create()
    OK

    As with the other index types, geospatial search indexes can be created manually.
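
    A manual version can be sketched by reformatting the statement that the index analyzer printed above, followed by the traversal it was generated for. Nothing here goes beyond what the analyzer output already shows. Because only one search index per vertex label is allowed, this assumes that no other search index exists on location:

    ```groovy
    // Same statement the analyzer reported, written in the multi-line style used for manual indexes
    schema.vertexLabel('location').
      searchIndex().
      ifNotExists().
      by('geo_point').
      create()

    // Names of locations within 20 degrees of the point (-110, 30)
    g.V().hasLabel('location').
      has('geo_point', Geo.inside(Geo.point(-110,30), 20, Geo.Unit.DEGREES)).
      values('name')
    ```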

  19. inverse() edge indexes

  20. Create a required inverse() edge index for the edge label person->created->recipe:

    schema.edgeLabel('created').
        from('person').to('recipe').
        materializedView('person_recipe').
        ifNotExists().
        inverse().
        create()
    ==> OK
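
    The traversal below illustrates the kind of query an inverse() index is intended for: following created edges from recipe back to person. The traversal is an assumption made for illustration and does not come from the steps above. Passing it to the index analyzer is one way to check whether the existing indexes are sufficient; the analyzer output is not shown because it depends on the schema:

    ```groovy
    // Hypothetical check: ask the analyzer what the reverse traversal requires
    schema.indexFor(g.V().hasLabel('recipe').in('created')).analyze()

    // Hypothetical reverse traversal: names of people who created recipes
    g.V().hasLabel('recipe').in('created').values('name')
    ```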

