Create SAI index
To create an SAI index, you must define the index name, table name, and column name for the column to be indexed.
Index names are unique per keyspace. The index name must be a unique identifier for the index for each table within a keyspace. This requirement is true for both vector and non-vector indexes. |
Prerequisites
To create a simple SAI index:
CREATE CUSTOM INDEX lastname_sai_idx ON cycling.cyclist_semi_pro (lastname)
USING 'StorageAttachedIndex'
WITH OPTIONS = {'case_sensitive': 'false', 'normalize': 'true', 'ascii': 'true'};
CREATE CUSTOM INDEX age_sai_idx ON cycling.cyclist_semi_pro (age)
USING 'StorageAttachedIndex';
CREATE CUSTOM INDEX country_sai_idx ON cycling.cyclist_semi_pro (country)
USING 'StorageAttachedIndex'
WITH OPTIONS = {'case_sensitive': 'false', 'normalize': 'true', 'ascii': 'true'};
CREATE CUSTOM INDEX registration_sai_idx ON cycling.cyclist_semi_pro (registration)
USING 'StorageAttachedIndex';
For most SAI indexes, the column name is defined in the CREATE CUSTOM INDEX
statement that also uses USING 'StorageAttachedIndex'
.
The SAI index options are defined in the WITH OPTIONS
clause.
The case_sensitive
option is set to false
to allow case-insensitive searches.
The normalize
option is set to true
to allow searches to be normalized for Unicode characters.
The ascii_only
option is set to true
to allow searches to be limited to ASCII characters.
The map
collection data type is the one exception, as shown in the example below.
Partition key SAI error
SAI indexes cannot be created on the partition key, as a primary index already exists and is used for queries. If you attempt to create an SAI on the partition key column, an error will be returned:
-
CQL
-
Result
CREATE CUSTOM INDEX ON demo2.person_id_name_primarykey (id)
USING 'StorageAttachedIndex';
InvalidRequest: Error from server: code=2200 [Invalid query] message="Cannot create secondary index on the only partition key column id"
map
collection in SAI index
Map collections do have a different format than other SAI indexes:
// Create an index on a map key to find all cyclist/team combos for a year
CREATE INDEX IF NOT EXISTS team_year_keys_idx
ON cycling.cyclist_teams ( KEYS (teams) );
// Create an index on a map key to find all cyclist/team combos for a year
CREATE INDEX IF NOT EXISTS team_year_values_idx
ON cycling.cyclist_teams ( VALUES (teams) );
// Create an index on a map key to find all cyclist/team combos for a year
CREATE INDEX IF NOT EXISTS team_year_entries_idx
ON cycling.cyclist_teams ( ENTRIES (teams) );
similarity-function for vector search
This example uses the following table:
CREATE TABLE IF NOT EXISTS cycling.comments_vs (
record_id timeuuid,
id uuid,
commenter text,
comment text,
comment_vector VECTOR <FLOAT, 5>,
created_at timestamp,
PRIMARY KEY (id, created_at)
)
WITH CLUSTERING ORDER BY (created_at DESC);
To check if comment_vector
has a particular similarity function set, use the similarity-function
option set to one of the supported similarity functions: DOT_PRODUCT, COSINE, or EUCLIDEAN. The default similarity function is COSINE.
This index creates an index on the comment_vector
column with the similarity function set to DOT_PRODUCT:
CREATE CUSTOM INDEX sim_comments_idx
ON cycling.comments_vs (comment_vector)
USING 'StorageAttachedIndex'
WITH OPTIONS = { 'similarity_function': 'DOT_PRODUCT'};
See CREATE CUSTOM INDEX for more information about creating SAI indexes.