Create SAI index
Create indexes on one or more columns after defining a table. Secondary indexes created with SAI can be used to query a table using any column other than the table’s partition key. If a composite partition key is defined, a SAI index can be created on any of the partition key columns.
If you decide to create an index, always create either a secondary index (2i) or a storage-attached index (SAI) on a table column, but not both. CQL will be harder to tune and understanding performance will be more difficult if you create both types of indexes on the same table. See the difference between these index types in the overview. |
To create an SAI index, define the table name and column name for the column to be indexed. Index name is optional. Index names are unique per keyspace. The index name must be a unique identifier for the index for each table within a keyspace. Enclose in quotes to use special characters or preserve capitalization. If you do not specify an index name, CQL generates one for you, with the pattern <table_name>_<column_name>_idx. This requirement is true for all indexes.
Since indexes are created at the keyspace level and not at the table level.
If you use the |
The full syntax for creating an SAI index in CQL is as follows:
CREATE CUSTOM INDEX index_name ON keyspace_name.table_name (column_name) USING 'StorageAttachedIndex' ;
However, some products have added syntactical sugar to make it easier to create SAI indexes. For example, you can create an SAI index using the following syntax in Hyper-Converged Database and Apache Cassandra 5.0:
CREATE INDEX index_name ON keyspace_name.table_name (column_name) USING 'SAI' ;
Apache Cassandra 5.0 has additional syntactical sugar that allows you to create an SAI index using the following syntax:
CREATE INDEX index_name ON keyspace_name.table_name (column_name);
if the default_secondary_index
parameter in the cassandra.yaml
file is set to SAI
.
Prerequisites
To create a simple SAI index:
CREATE CUSTOM INDEX lastname_idx ON cycling.cyclist_semi_pro (lastname)
USING 'StorageAttachedIndex'
WITH OPTIONS = {'case_sensitive': 'false', 'normalize': 'true', 'ascii': 'true'};
CREATE CUSTOM INDEX age_idx ON cycling.cyclist_semi_pro (age)
USING 'StorageAttachedIndex';
CREATE CUSTOM INDEX country_idx ON cycling.cyclist_semi_pro (country)
USING 'StorageAttachedIndex'
WITH OPTIONS = {'case_sensitive': 'false', 'normalize': 'true', 'ascii': 'true'};
CREATE CUSTOM INDEX registration_idx ON cycling.cyclist_semi_pro (registration)
USING 'StorageAttachedIndex';
To create an SAI index:
CREATE CUSTOM INDEX id_idx ON cycling.cyclist_alt_stats (id)
USING 'StorageAttachedIndex';
Vector index options
This example uses the following table:
CREATE TABLE IF NOT EXISTS cycling.comments_vs (
record_id timeuuid,
id uuid,
commenter text,
comment text,
comment_vector VECTOR <FLOAT, 5>,
created_at timestamp,
PRIMARY KEY (id, created_at)
)
WITH CLUSTERING ORDER BY (created_at DESC);
Vector indexes support source_model
and similarity_function
options for the CREATE INDEX
command.
Parameter | Description | Default |
---|---|---|
|
Configures the index for optimal performance for your vectors.
Options are: |
|
|
Specified similarity function for your index.
If you selected a |
|
This example creates an index on the comment_vector
column with the similarity function set to dot_product
:
CREATE CUSTOM INDEX comment_sim_function_idx
ON cycling.comments_vs (comment_vector) USING 'StorageAttachedIndex'
WITH OPTIONS = { 'similarity_function': 'DOT_PRODUCT'};
This example creates an index on the comment_vector
column with the source model set to bert
:
CREATE CUSTOM INDEX comment_source_model_idx
ON cycling.comments_vs (comment_vector) USING 'StorageAttachedIndex'
WITH OPTIONS = { 'source_model': 'bert'};