Indexing
Explain indexes and how they affect DSE Graph performance.
Indexes play a significant role in making DSE Graph queries performant. Graph queries that must traverse the entire graph to find information perform poorly, which is why full-scan queries are disallowed in production environments. Two aspects of querying a graph can be improved with indexing: finding the initial vertex or vertices from which to start a traversal, and narrowing the edges and vertices to traverse from this starting point. DSE Graph implements two types of indexes, global indexes and vertex-centric indexes (VCIs), to address these different aspects of query processing. Global indexes are used to find the starting point for a query and involve finding a matching vertex property value. Vertex-centric indexes are used to narrow down the scope of a query after a starting point is defined.
Global indexing overview
g.V(['~label':'person', 'personId':1])
g.V().has('person', 'name', 'Julia Child')
The first query looks up a vertex directly by its vertex id. In the second query, name is not part of the vertex id, so an index is required to match the search conditions with the correct vertex, and that index is a global index.
Global indexing in DSE Graph can be accomplished with one of three DSE indexing methods: a materialized view (MV), a search index, or a secondary index.
selectivity = ( cardinality / number of rows ) * 100%
Search indexes are used when textual, numeric, or geospatial indexing is required, and rely on DSE Search. Since graph data is stored in DSE database tables, one search core is available per vertex label. For each vertex label that will be indexed with search, all properties must be added to a single search index named search. Because search is implemented with DSE Search, all data types can be indexed. For the two textual indexing options, full text and string, the option must be specified for the property key, because each produces different indexing results. Full text indexing performs tokenization and secondary processing such as case normalization. Full text indexing is useful for queries where a partial match of text is required, and lends itself to regular expression (regex) searching. String indexing is useful for queries where an exact string is sought and no tokenization is required, similar to Solr faceting. This type of index is best for low selectivity, but also lends itself to fuzzy matching for both tokenized and non-tokenized indexing.
Secondary indexing in DSE Graph follows the same rule of thumb as DSE secondary indexing. This type of index is meant for lower cardinality values, or alternatively, for low selectivity values. The number of values for indexing should number in the tens to hundreds at most; for instance, searching by country is a good candidate for secondary indexing. In addition, only equality conditions can be used to match values, and no ordering or range queries on values can be used. If more complex value matching is required, search indexes are the superior choice.
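To make the selectivity formula given earlier in this section concrete, the following minimal Python sketch computes it for two properties. The function name and all of the counts are hypothetical values chosen for illustration; they are not DSE Graph APIs or measured data.

```python
def selectivity(cardinality: int, row_count: int) -> float:
    """Return selectivity as a percentage: (cardinality / number of rows) * 100."""
    if row_count <= 0:
        raise ValueError("row_count must be positive")
    return cardinality / row_count * 100.0


# Hypothetical graph with 1,000,000 person vertices (assumed figure).
ROWS = 1_000_000

# 'country' with about 200 distinct values: low cardinality, low selectivity.
country_selectivity = selectivity(200, ROWS)

# 'name' with 950,000 distinct values: high cardinality, high selectivity.
name_selectivity = selectivity(950_000, ROWS)

print(f"country: {country_selectivity:.2f}%")
print(f"name:    {name_selectivity:.2f}%")
```

Under these assumed counts, country falls far below one percent and is the kind of property the text describes as a secondary index candidate, while name is close to unique per vertex.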
| Index type | Use | 
|---|---|
| Materialized view | Most efficient index for high cardinality, high selectivity vertex properties and equality predicates. | 
| Secondary index | Efficient index for low cardinality, low selectivity vertex properties and equality predicates. | 
| Search index | Efficient and versatile index for vertex properties with a wide range of cardinality and selectivity. A search index supports a variety of predicates for textual, numeric, and geospatial matching. | 
Composite index keys are not currently supported in DSE Graph.
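The guidance in the table above can be condensed into a small decision helper. This is only a sketch in Python: the function name, the predicate labels, and the cut-off of 999 distinct values (one reading of "tens to hundreds at most") are assumptions made for illustration, and DSE Graph exposes no such function.

```python
EQUALITY = "equality"


def suggest_index_type(predicate: str, distinct_values: int,
                       low_cardinality_limit: int = 999) -> str:
    """Suggest a global index type from the predicate kind and cardinality.

    low_cardinality_limit is an assumed threshold, not a documented value.
    """
    if predicate != EQUALITY:
        # Only a search index handles textual, range, or geospatial predicates.
        return "search index"
    if distinct_values <= low_cardinality_limit:
        return "secondary index"
    return "materialized view"


print(suggest_index_type("equality", 150))      # e.g. a country property
print(suggest_index_type("equality", 950_000))  # e.g. a name property
print(suggest_index_type("regex", 950_000))     # e.g. partial text matching
```

A real decision also weighs the operational limits of each index type, so treat the result as a starting point only.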
Vertex-centric indexing (VCI) overview
g.V().has('person', 'name', 'Julia Child').outE('created').has('create_date', gt(1960-01-01))
g.V().has('person', 'name', 'Fritz Streiff').properties('country').has('start_year', order().by(decr))
Vertex-centric indexing in DSE Graph is accomplished with materialized views (MVs) for both edge and property indexes, and has the same characteristics as described above for global indexes.
Indexing best practices
The most important fact to remember is that a search index is the only choice for indexing two or more properties that define the starting point for a query. Multiple materialized view or secondary indexes cannot be used for global indexing. For instance, g.V().has('person', 'gender', 'F').has('person', 'country', 'France') will only use one index, not both, if the indexes are materialized view or secondary indexes. If a search index is defined, both properties, country and gender, are used. Once the starting point is defined, a vertex-centric index can be used to narrow the query.
More than one index can be created on the same property, such as creating both a materialized view (MV) index and a search index on the property amount. The DSE Graph query optimizer automatically uses the appropriate index when processing a query; specifying which index type to use is not supported. To ensure best performance, the order of preference that DSE Graph uses is MV index > secondary index > DSE Search index. However, choosing the optimal type of index is key to good performance. For instance, it is important to understand the limitations of materialized views, and to base the number of MV indexes on that understanding. Different index types may be created on different properties as appropriate, based on the selectivity. In general, secondary indexes in DSE Graph are limited in usefulness, for the same reasons that restrict their general use in DSE. Materialized view indexing should be the first choice, unless textual search is required, in which case a search index should be selected.
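The preference order described above (MV index > secondary index > DSE Search index) can be pictured with a toy model. The Python sketch below is not the DSE Graph optimizer; the function name and the string labels are hypothetical, and the only rules it encodes are the stated preference order and the restriction of MV and secondary indexes to equality predicates.

```python
from typing import Iterable, Optional

# Stated order of preference, most preferred first.
PREFERENCE_ORDER = ("materialized", "secondary", "search")
# Index types that can only serve equality predicates.
EQUALITY_ONLY = {"materialized", "secondary"}


def pick_index(available: Iterable[str],
               equality_predicate: bool = True) -> Optional[str]:
    """Return the preferred usable index type, or None if none applies."""
    candidates = set(available)
    if not equality_predicate:
        # Non-equality predicates rule out MV and secondary indexes.
        candidates -= EQUALITY_ONLY
    for index_type in PREFERENCE_ORDER:
        if index_type in candidates:
            return index_type
    return None


# A property indexed with both an MV index and a search index.
print(pick_index({"materialized", "search"}))
print(pick_index({"materialized", "search"}, equality_predicate=False))
```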
If a search index is created, be aware that building the index can take time, and that until the index is available, queries that depend on the index can fail. Applications that create schema and then immediately insert data that requires search indexes will likely experience errors. Also, queries that use search indexes should be run on DSE Search-enabled nodes in the cluster. Search indexes also require extra resources. Each index allocates a minimum of 256 MB of memory by default, and each index requires two physical cores. For a typical 32 GB node, 16 search indexes would be a reasonable number to create.
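The memory figures above can be checked with a short calculation. This Python sketch assumes 1 GB = 1024 MB and uses a hypothetical helper name; it covers only the minimum default allocation per index, not total search memory use.

```python
MIN_MEMORY_PER_INDEX_MB = 256  # minimum default allocation per search index


def min_search_index_memory_mb(index_count: int) -> int:
    """Minimum memory, in MB, allocated by the given number of search indexes."""
    return index_count * MIN_MEMORY_PER_INDEX_MB


node_memory_mb = 32 * 1024  # a 32 GB node, assuming 1 GB = 1024 MB
needed_mb = min_search_index_memory_mb(16)
share = needed_mb / node_memory_mb * 100

print(f"{needed_mb} MB minimum, {share:.1f}% of node memory")
```

With these inputs, 16 indexes reserve at least 4 GB, one eighth of the node's memory, before any other workload is considered.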
tokenRegex is case-insensitive in queries, whether or not a search index is used.
Textual search indexes are by default indexed in both tokenized (TextField) and non-tokenized (StrField) forms. This means that all textual predicates (token, tokenPrefix, tokenRegex, eq, neq, regex, prefix) are usable with all indexed textual vertex properties. In practice, search indexes should be created using the asString() method only in cases where there is absolutely no use for tokenization and text analysis, such as for inventory categories (silverware, shoes, clothing). The asText() method is used when searching tokenized text, such as long multi-sentence descriptions. The query optimizer chooses whether to use analyzed or non-analyzed indexing based on the textual predicate used.
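One way to picture that choice is as a lookup from predicate to indexed form. The grouping in this Python sketch is an assumption inferred from the predicate names (the token-prefixed predicates mapped to the tokenized form, the rest to the non-tokenized form); the function name is hypothetical and this is not how DSE Graph implements the decision.

```python
# Assumed grouping, inferred from the predicate names listed above.
TOKENIZED_PREDICATES = {"token", "tokenPrefix", "tokenRegex"}
NON_TOKENIZED_PREDICATES = {"eq", "neq", "regex", "prefix"}


def field_form_for(predicate: str) -> str:
    """Return the indexed form a textual predicate would be matched against."""
    if predicate in TOKENIZED_PREDICATES:
        return "TextField"  # tokenized, analyzed text
    if predicate in NON_TOKENIZED_PREDICATES:
        return "StrField"   # non-tokenized, exact string
    raise ValueError(f"not a textual predicate: {predicate}")


for name in ("tokenPrefix", "prefix"):
    print(name, "->", field_form_for(name))
```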
It is possible to modify the search index schema to change search characteristics. Although DSE Graph will not overwrite these out-of-band changes, it is recommended that you do not add or remove fields in this manner; only DSE Graph commands should be used for that. This feature is mainly used to change the behavior of a search, such as adding case sensitivity to a type of search.
