Using search indexes

Using search indexes for graph traversals.

DSE Graph can use search indexes that take advantage of DSE Search functionality for efficient traversal queries. DSE Search uses a modified Apache Solr to create the search indexes. Graph search indexes can be created using asText() for full text tokenized search or asString() for exact string matching. Each option has advantages, depending on the type of search that will be required in graph traversals.

These traversal queries fail if the search index has not been created before the query is run. Create search indexes during schema creation, before inserting data and querying the graph. Search indexes are created only if DSE Search is started in conjunction with DSE Graph.

In general, the traversal step will involve a vertex label and can include a property key and a particular property value. In a traversal, the step following g.V() is generally the step in which an index will be consulted. If a mid-traversal V() step is called, then an additional indexed step can be consulted to narrow the list of vertices that will be traversed.

Property key indexes defined with asText() can use the following options for search: Search.token(), Search.tokenPrefix(), and Search.tokenRegex(). Property key indexes defined with asString() can use the following options for search: eq(), neq(), Search.prefix(), and Search.regex().
Note: The eq() search cannot be used with property key indexes created with asText() because they contain tokenized data and are therefore not suitable for exact text matches.

Procedure

Creating a search index

  • The following search index for the vertex label recipe, taken from Creating indexes, is used in the recipe examples below:
    schema.vertexLabel('recipe').index('search').search().
                  by('instructions').asText().
                  by('name').asString().add()

    This search index uses DSE Search to index instructions as full text using tokenization, and name as a string.
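    To make the difference between the two index types concrete, the following plain-Python sketch models an asText() property as a set of tokens and an asString() property as one whole value. It is not DSE or Solr code: the tokenize() helper, its lowercase-and-split rule, and the sample vertex are assumptions made for illustration, and the real analyzer may behave differently.

```python
import re

def tokenize(text):
    """Hypothetical stand-in for full-text analysis: lowercase, split on non-alphanumerics."""
    return re.findall(r"[a-z0-9]+", text.lower())

# Made-up sample vertex, not data from the recipe graph.
recipe = {"name": "Carrot Soup", "instructions": "Saute the onions, then add carrots."}

# Model of by('instructions').asText(): individual tokens are indexed.
text_index = set(tokenize(recipe["instructions"]))

# Model of by('name').asString(): the whole value is indexed unchanged.
string_index = recipe["name"]

def token_match(word):
    """Token-style lookup: the word must equal one indexed token."""
    return word.lower() in text_index

def exact_match(value):
    """eq()-style lookup: the entire string must be identical."""
    return value == string_index

print(token_match("Saute"), exact_match("Carrot"))
```

    In this model a token lookup succeeds for a whole word of the instructions, while an exact lookup on the name succeeds only for the complete string.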

Search using token() methods on full text

  • In a traversal query, use a token search to list the names of all recipes that have the word Saute in the instructions. The method Search.token() is used with a supplied word.
    g.V().has('recipe','instructions', Search.token('Saute')).values('name')

Search using tokenPrefix() methods on full text

  • In a traversal query, use a token prefix search to list the names and instructions of all recipes that have a word beginning with the prefix Sea in the instructions. The method Search.tokenPrefix() is used with a supplied prefix (a set of alphanumeric characters).
    g.V().hasLabel('recipe').has('instructions', Search.tokenPrefix('Sea')).values('name','instructions')

    Two recipes are returned, one with the word Season in the instructions, and one with the word seasonings in the instructions. Matching with Search.tokenPrefix() is case-insensitive.
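    The case-insensitive prefix behavior described above can be modeled in a few lines of plain Python. This is only a sketch: tokenize() and token_prefix_match() are hypothetical helpers, the sample instruction strings are invented, and the real DSE Search analyzer may tokenize differently.

```python
import re

def tokenize(text):
    """Hypothetical analyzer: lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def token_prefix_match(text, prefix):
    """True if any token of the text starts with the prefix, ignoring case."""
    lowered = prefix.lower()
    return any(token.startswith(lowered) for token in tokenize(text))

# Invented instruction strings, not the actual recipe data.
samples = [
    "Season the pork loin with salt.",
    "Top with the seasonings and bake.",
    "Boil the pasta until tender.",
]

matches = [text for text in samples if token_prefix_match(text, "Sea")]
print(matches)
```

    Under these assumptions the first two strings match, because Season and seasonings both begin with sea once case is ignored.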

Search using tokenRegex() methods on full text

  • In a traversal query, use a token regular expression (regex) search to list all recipes that have a word in the instructions matching the specified regex. The regex, .*sea.*in.*, looks for the letters sea preceded by any number of other characters, followed by any number of other characters until the letters in are found, and then followed by any number of further characters. The method Search.tokenRegex() is used with a supplied regex.
    g.V().hasLabel('recipe').has('instructions', Search.tokenRegex('.*sea.*in.*')).values('name','instructions')

    Note that in this query, only the Oysters Rockefeller recipe is returned because the word Season in the Roast Pork Loin recipe does not meet the requirements for the regular expression.
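    The reason Season fails while seasonings succeeds can be checked with an ordinary regex engine. The sketch below uses Python's re module as a stand-in; it assumes that tokens are lowercased and that the regex must match the whole token, which may not reflect every detail of the regex dialect DSE Search uses.

```python
import re

# Same pattern as in the traversal query above.
PATTERN = re.compile(r".*sea.*in.*")

def token_regex_match(token):
    """Hypothetical model: lowercase the token, then require a whole-token match."""
    return PATTERN.fullmatch(token.lower()) is not None

for word in ("seasonings", "Season"):
    print(word, token_regex_match(word))
```

    The word seasonings contains the letters in after sea, so it matches. The word Season has no in after sea, so it does not.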

Search using eq() on non-token methods on strings

  • In a traversal query, use a non-token search to list all recipes whose recipe name is Carrot Soup. The method eq() is used with a supplied name.
    g.V().hasLabel('recipe').has('name', eq('Carrot Soup')).values('name')

    A match is found for the full recipe name listed. Note that neq() can also be used to find all strings that do not match the specified string.

  • In a traversal query, use a non-token search to list all recipes that have Carrot in the recipe name. The method eq() is used with a supplied name.
    g.V().hasLabel('recipe').has('name', eq('Carrot')).valueMap()

    No match is found, because only a partial name was specified. For asString() indexes, the entire string must match.

Search using prefix() on non-token methods on strings

  • In a traversal query, use a non-token search to find all recipes that have a name beginning with the letter R. The method prefix() is used with a supplied string.
    g.V().hasLabel('recipe').has('name', Search.prefix('R')).values('name')

    Matches are found for each recipe name that begins with R, provided the recipe name was designated with asString() in the search index.

Search using regex() on non-token methods on strings

  • In a traversal query, use a non-token search to find all recipes that have a name that includes a specified regular expression. The method regex() is used with a supplied regex.
    g.V().hasLabel('recipe').has('name', Search.regex('.*ee.*')).values('name')

    Matches are found for each recipe name that matches the regex .*ee.*, that is, each name that includes ee preceded and followed by any number of other characters, provided the recipe name was designated with asString() in the search index.

Using two search indexes for a single traversal query

  • Create a second search index, this one for the vertex label author, similar to the example in Creating indexes.
    schema.vertexLabel('author').index('search').search().
                by('name').asString().
                by('nickname').ifNotExists().add()

    This search index will use DSE Search to index nickname as full text using tokenization, and name as a string.

  • This traversal query demonstrates a mid-traversal V() that allows a search index for author as well as a search index for recipe to be used to execute the query. The first index is consulted with Search.tokenRegex() to find recipes whose instructions contain a word beginning with Braise; this part of the query is labeled as r for use later in the query. Then the search index for author is searched for an author name that starts with the letter J. From each matching author, the traversal follows outgoing edges and keeps only the vertices that equal a recipe found in the first part of the query, using where(eq('r')).
    g.V().has('recipe', 'instructions', Search.tokenRegex('Braise.*')).as('r').
      V().has('author', 'name', Search.prefix('J')).out().where(eq('r')).values('name')

    This traversal query finds the recipe Beef Bourguignon authored by Julia Child, and illustrates some of the complexity that can be successfully used with search indexes.

Search using geospatial values

  • Geospatial search is used to discover geospatial relationships. Search indexes are used to make such searches possible. First, a search index must be created.
    schema.vertexLabel('FridgeSensor').index('search').search().
                    by('location').ifNotExists().add()
  • Some sample data will be helpful for understanding the search results. Two vertices are entered for fridge sensor:
    graph.addVertex(label, 'FridgeSensor', 'name', 'jones1', 'city_id', 100, 'sensor_id', '60bcae02-f6e5-11e5-9ce9-5e5517507c66', 'location', Geo.point(-118.359770, 34.171221))
    graph.addVertex(label, 'FridgeSensor', 'name', 'smith1', 'city_id', 100, 'sensor_id', '61deada0-3bb2-4d6d-a606-a44d963f03b5', 'location', Geo.point(-115.655068, 35.163427))
    

    The sensors are named and given a city ID and sensor ID in addition to the location with data type Point.

  • A query can use the method Geo.inside() to find all sensors located inside a described area. Here the area is a Distance, designated as a circle with a center at (-110, 30) and a radius of 20 units.
    Distance d = Geo.distance(-110,30,20)
    g.V().hasLabel('FridgeSensor').has('location', Geo.inside(d)).values('name')
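
    As a rough check of which sensors fall inside that circle, the containment test can be sketched in plain Python. This is not how DSE Search evaluates Geo.inside(): the inside_circle() helper is hypothetical, and it assumes a flat plane where the radius is in the same units as the coordinates and each point is given in the same order as in Geo.point().

```python
import math

def inside_circle(point, center, radius):
    """Hypothetical planar test: straight-line distance from center is within radius."""
    return math.hypot(point[0] - center[0], point[1] - center[1]) <= radius

# Circle from Geo.distance(-110, 30, 20).
center = (-110.0, 30.0)
radius = 20.0

# Sensor locations copied from the two addVertex() calls above.
sensors = {
    "jones1": (-118.359770, 34.171221),
    "smith1": (-115.655068, 35.163427),
}

inside = sorted(name for name, loc in sensors.items()
                if inside_circle(loc, center, radius))
print(inside)
```

    Under this flat-plane assumption, both sample sensors lie within 20 units of the center.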