Caching edges and properties

Caching can improve query performance and is configurable. DSE Graph has two types of cache: adjacency list cache and index/property cache. Either edges or properties can be cached using the schema API vertexLabel() method with the cache() option. Caching can be configured for all edges, all properties, or a filtered set of edges. Vertices are not cached directly, but caching properties and edges that define the relationship between vertices essentially accomplishes the same operation.

Property caching is enabled if indexes exist and are used in the course of queries. Full graph scan queries will not be cached. If an index does not exist, then caching does not occur. Adjacency list caching is enabled if caching is configured for edges.

The caches are local to a node and data is loaded into cache when it is read with a query. Both caches are set to a default size of 128 MB in the dse.yaml file. The settings are adjacency_cache_size_in_mb and index_cache_size_in_mb. Both caches utilize off-heap memory implemented as Least Recently Used (LRU) cache.

Caching is intended to help make queries more efficient if the same information is required in a later query. For instance, caching the calories property for meal vertices improves the retrieval of a query asking for all meals with a calorie count less than 850 calories.

Graph cache is local to each node in the cluster, so the cached data can be different between nodes. Thus, a query can use cache on one node, but not on another. The caches are updated only when the data is not found. Graph caching does not have any means of eviction. No flushing occurs, and the cache is not updated if an element is deleted or modified. The cache only evicts data based on the time-to-live (TTL) value set when the cache is configured for an element. Set a low TTL value for elements (property keys, vertex labels, edge labels) that change often to avoid stale data.

Graph cache is useful for rarely changed graph data. The queries that use graph cache effectively are queries that repeatedly run. If the queries run differ even in the sort order, the graph cache will not be used to reduce the query latency. For instance, caching the calories property for meal vertices improves the retrieval of a query asking for all meals with a calorie count of less than 850 calories if this query is repeated.

All properties for all meal vertices are cached along with calories.

  • Cache all properties for author vertices up to an hour (3600 seconds):

    schema.vertexLabel('author').cache().properties().ttl(3600).add()

    Enabling property cache causes index queries to use IndexCache for the specified vertex label.

  • Cache both incoming and outgoing created edges for author vertices up to a minute (60 seconds):

    schema.vertexLabel('author').cache().bothE('created').ttl(60).add()

Was this helpful?

Give Feedback

How can we improve the documentation?

© 2024 DataStax | Privacy policy | Terms of use

Apache, Apache Cassandra, Cassandra, Apache Tomcat, Tomcat, Apache Lucene, Apache Solr, Apache Hadoop, Hadoop, Apache Pulsar, Pulsar, Apache Spark, Spark, Apache TinkerPop, TinkerPop, Apache Kafka and Kafka are either registered trademarks or trademarks of the Apache Software Foundation or its subsidiaries in Canada, the United States and/or other countries. Kubernetes is the registered trademark of the Linux Foundation.

General Inquiries: +1 (650) 389-6000, info@datastax.com