Index types and use cases

The data stored in CQL tables can be queried by a variety of methods. The main method uses the partition key defined for a table, which is called primary indexing. To query non-primary key columns efficiently, you must create secondary indexes of those columns. Secondary indexing uses fast, efficient lookup of data that matches a given condition. You must create an index on a column to efficiently query the data in that column.

Indexing works best when there is a moderate cardinality of the indexed values, meaning there is a good variety of values in the rows but the rows aren’t excessively unique. The more unique values that exist in a particular column, the more overhead, on average, is required to query and maintain the index. For example, indexing a column where almost every value is different is inefficient and resource intensive. In contrast, a Boolean column with only two possible values is not useful for queries.

Think of a secondary index as a filter: You want the filter to be diverse enough to intelligently narrow the scope of the results without being so specific that it isn’t reusable.

Indexing support by product
Indexing type Astra DB Serverless Astra Managed Clusters Apache Cassandra® DataStax Enterprise (DSE) Hyper-Converged Database (HCD)

Primary indexing (primary key)

Supported

Supported

Supported

Supported

Supported

Storage-attached indexing (SAI)

Supported

Supported

Supported (5.0)

Supported (6.8, 6.9)

Supported

Original secondary indexing (2i)

Not supported

Supported

Supported

Supported

Supported

SSTable-attached indexing (SASI)

Not supported

Not supported

Experimental, not recommended

Experimental (5.1, 6.8), not recommended

Not supported

DSE Search indexing

Not supported

Supported

Not supported

Supported

Not supported

Primary indexing

The primary index is the partition key column. All tables have a primary index.

A Cassandra storage engine uses the partition key to store rows of data, and the most efficient and fast lookup of data matches the partition key.

Secondary indexing

There are several types of secondary indexing available, but they aren’t interchangeable. Furthermore, not all secondary index types are supported by every database platform.

Secondary indexes are optional and must be created explicitly. Don’t create secondary indexes for every column; only index columns relevant to the queries expected by your data model.

Expect an increase of 2x latency and throughput on indexed tables compared to non-indexed tables.

Secondary indexing (2i)

2i are the original built-in indexes for CQL. These indexes are a local index, stored in a hidden table on each node of a CQL cluster, separate from the table that contains the values being indexed.

Due to potential performance degradation, this indexing method is only recommended when used in conjunction with a partition key.

Performance degradation with inefficient or excessive use of 2i

Indexing is never free: The more you add, the more you impact write performance. In Cassandra databases, this manifests as write-amplification issues. When a mutation on an indexed column occurs, an indexing operation triggers reindexing data in a separate index file.

More indexes on a CQL table can dramatically increase disk activity during write operations. If a single Cassandra node gets too many writes, there is saturated disk activity. This activity destabilizes individual nodes, creating Cassandra cluster problems.

For this reason, 2i should be used sparingly. Index size is fairly linear, but it can be difficult to plan for the amount of disk space needed in an active cluster for storing and re-indexing indexes.

Don’t use a 2i index in the following situations:

For more information about creating and using 2i, see Create and use secondary indexes (2i).

SSTable-attached indexing (SASI)

SASI is not recommended. It is only available for backward compatibility in certain database platforms and versions.

SASI uses indexes for non-partition columns, and creates an index file for each SSTable that store the rows of data. SASI is the most appropriate indexing method for LIKE full text searches.

SSTable-attached Secondary Indexing (SASI) wasn’t designed as a general indexing method. Instead it was created to address two query types:

  • Finding rows based on partial data matching, such as wildcard or LIKE queries.

  • Range queries on sparse data, specifically timestamps, such as queries for all records within a given time range.

However, it suffers from performance issues due to storage requirements for these indexes. In addition, SASI is marked as experimental and hasn’t received feature improvements given the existence of other generalized indexing methods.

SASI indexes are deprecated in DSE 6.8 and 5.1, but they are enabled by default. For more information about creating and using SASI, see Create SASI index.

DSE Search indexing

DSE Search is only available in Astra Managed Clusters, DSE 5.1, DSE 6.8, and DSE 6.9.

DSE Search isn’t required, but it can be useful for certain full-text search applications. If you are using DSE Search, use DSE Search indexing for your full text search queries.

The DSE Search indexing method is specific to Apache Solr-Lucene searches because DSE Search supports simple keyword searches as well as complex queries on multiple fields with faceted search results. This indexing method features tokenized text search for use with analyzers. For more information, see Search index commands.

Build and maintain secondary indexes

All secondary indexes (2i, SAI, and SASI) are built in the background automatically, without blocking reads or writes.

Client-maintained tables as indexes must be created manually. For example, if an age column was indexed by creating a table named by_age, then your client application must populate the table with supporting data from other tables, such as a name table that uses id as the primary key.

To perform a hot rebuild of an index, use the nodetool rebuild_index command.

Was this helpful?

Give Feedback

How can we improve the documentation?

© Copyright IBM Corporation 2026 | Privacy policy | Terms of use Manage Privacy Choices

Apache, Apache Cassandra, Cassandra, Apache Tomcat, Tomcat, Apache Lucene, Apache Solr, Apache Hadoop, Hadoop, Apache Pulsar, Pulsar, Apache Spark, Spark, Apache TinkerPop, TinkerPop, Apache Kafka and Kafka are either registered trademarks or trademarks of the Apache Software Foundation or its subsidiaries in Canada, the United States and/or other countries. Kubernetes is the registered trademark of the Linux Foundation.

General Inquiries: Contact IBM