CREATE CUSTOM INDEX
Supported databases:
-
DataStax DataStax Astra DB (Vector)
-
DataStax DataStax Astra DB DataStax Astra DB DataStax Astra DB Serverless
-
DataStax DataStax Astra DB DataStax Astra DB DataStax Astra DB Classic
-
HCD 1.0
-
DataStax Enterprise 6.9.0 and later
-
DataStax Enterprise 6.8.3 and later
Creates a Storage-Attached Indexing (SAI) index. You can create multiple secondary indexes on the same database table, with each SAI index based on any column in the table. All column date types except the following are supported for SAI indexes:
-
counter
-
geospatial types:
PointType
,LineStringType
,PolygonType
-
non-frozen user-defined type (UDT)
However, you can define an SAI index on one of the columns in a table’s composite partition key, i.e., a partition key comprised of multiple columns. If you need to query based on one of those columns, an SAI index is a helpful option. In fact, you can define an SAI index on each column in a composite partition key, if needed.
Defining one or more SAI indexes based on any column in a database table (with the rules noted above) subsequently gives you the ability to run performant queries that use the indexed column to filter results.
Since indexes are created at the keyspace level and not at the table level.
If you use the |
See the SAI section.
Synopsis
CREATE CUSTOM INDEX [ IF NOT EXISTS ] [ <index_name> ] ON [ <keyspace_name>.]<table_name> (<column_name>) | [ (KEYS(<map_name>)) ] | [ (VALUES(<map_name>)) ] | [ (ENTRIES(<map_name>)) ] USING 'StorageAttachedIndex' [ WITH OPTIONS = { <option_map> } ] ;
Syntax legend
Syntax conventions | Description |
---|---|
UPPERCASE |
Literal keyword. |
Lowercase |
Not literal. |
|
Variable value. Replace with a user-defined value. |
|
Optional.
Square brackets ( |
|
Group.
Parentheses ( |
|
Or.
A vertical bar ( |
|
Repeatable.
An ellipsis ( |
|
Single quotation ( |
|
Map collection.
Braces ( |
Set, list, map, or tuple.
Angle brackets ( |
|
|
End CQL statement.
A semicolon ( |
|
Separate the command line options from the command arguments with two hyphens ( |
|
Search CQL only: Single quotation marks ( |
|
Search CQL only: Identify the entity and literal value to overwrite the XML element in the schema and solrConfig files. |
- index_name
-
Optional identifier for index. If no name is specified, the default is used,
<table_name>_<column_name>_idx
. Enclose in quotes to use special characters or to preserve capitalization.Index names are unique per keyspace. The index name must be a unique identifier for the index for each table within a keyspace. Enclose in quotes to use special characters or preserve capitalization. If you do not specify an index name, CQL generates one for you, with the pattern <table_name>_<column_name>_idx. This requirement is true for all indexes.
- column_name
-
The name of the table column on which the SAI index is being defined. SAI allows only alphanumeric characters and underscores in names. SAI returns
InvalidRequestException
if you try to define an index on a column name that contains other characters, and does not create the index. - map_name
-
Used with collections, identifier of the
map_name
specified inCREATE TABLE
…map(<map_name>)
. The regular column syntax applies for collection typeslist
andset
. - option_map
-
Define options in JSON simple format.
Option Description case_sensitive
Ignore case in matching string values. Default:
true
.normalize
When set to
true
, perform Unicode normalization on indexed strings. SAI supports Normalization Form C (NFC) Unicode. When set totrue
, SAI normalizes the different versions of a given Unicode character to a single version, retaining all the marks and symbols in the index. For example, SAI would change the character Å (U+212B) to Å (U+00C5).When implementations keep strings in a normalized form, equivalent strings have a unique binary representation. See Unicode Standard Annex #15, Unicode Normalization Forms.
Default:
false
.ascii
When set to
true
, SAI converts alphabetic, numeric, and symbolic characters that are not in the Basic Latin Unicode block (the first 127 ASCII characters) to the ASCII equivalent, if one exists. For example, this option changes à to a. Default:false
.similarity_function
Vector search relies on computing the similarity or distance between vectors to identify relevant matches. The similarity function is used to compute the similarity between two vectors. Valid options are: EUCLIDEAN, DOT_PRODUCT, COSINE Default:
COSINE
Query operators
SAI supports the following query operators for tables with SAI indexes:
-
Numerics:
=
,<
,>
,⇐
,>=
,AND
,IN
,OR
(for vector databases) -
Strings:
=
,AND
,IN
-
Collections:
=
CONTAINS
,CONTAINS KEY
SAI does not supports the following query operators for tables with SAI indexes:
-
Strings or Numerics:
LIKE
,OR
(for non-vector databases)
Examples
These examples define SAI indexes for the cycling.cyclist_semi_pro
table, which is demonstrated in the SAI quickstart.
CREATE CUSTOM INDEX lastname_idx ON cycling.cyclist_semi_pro (lastname)
USING 'StorageAttachedIndex'
WITH OPTIONS = {'case_sensitive': 'false', 'normalize': 'true', 'ascii': 'true'};
CREATE CUSTOM INDEX age_idx ON cycling.cyclist_semi_pro (age)
USING 'StorageAttachedIndex';
CREATE CUSTOM INDEX country_idx ON cycling.cyclist_semi_pro (country)
USING 'StorageAttachedIndex'
WITH OPTIONS = {'case_sensitive': 'false', 'normalize': 'true', 'ascii': 'true'};
CREATE CUSTOM INDEX registration_idx ON cycling.cyclist_semi_pro (registration)
USING 'StorageAttachedIndex';
For sample queries that find data in cycling.cyclist_semi_pro
via these sample SAI indexes, see Submit CQL queries.
Also refer Examine SAI column index and query rules.
SAI collection map examples with keys, values, and entries
The following examples demonstrate using collection maps of multiple types (keys
, values
, entries
) in SAI indexes.
For related information, see Creating collections and Using map type.
Also refer to the SAI collection examples of type list and set in this topic.
If you are logged into the DataStax Astra CQL Console, you already have a
keyspace defined for your cloud database; skip this next step and specify
your keyspace name with the |
Next, use the keyspace:
USE demo3;
Create an audit
table, with a collection map named text_map
:
CREATE TABLE audit ( id int PRIMARY KEY , text_map map<text, text>);
Create multiple SAI indexes on the same map
column, each using KEYS
, VALUES
, and ENTRIES
.
CREATE CUSTOM INDEX ON audit (KEYS(text_map)) USING 'StorageAttachedIndex';
CREATE CUSTOM INDEX ON audit (VALUES(text_map)) USING 'StorageAttachedIndex';
CREATE CUSTOM INDEX ON audit (ENTRIES(text_map)) USING 'StorageAttachedIndex';
Insert some data:
INSERT INTO audit (id, text_map) values (1, {'Carlos':'Perotti', 'Marcel':'Silva'});
INSERT INTO audit (id, text_map) values (2, {'Giovani':'Pasi', 'Frances':'Giardello'});
INSERT INTO audit (id, text_map) values (3, {'Mark':'Pastore', 'Irene':'Cantona'});
Query all data:
-
Query
-
Result
SELECT * FROM audit;
id | text_map
----+---------------------------------------------
1 | {'Carlos': 'Perotti', 'Marcel': 'Silva'}
2 | {'Frances': 'Giardello', 'Giovani': 'Pasi'}
3 | {'Irene': 'Cantona', 'Mark': 'Pastore'}
(3 rows)
Query using the SAI index for specific entries in the map
column:
-
Query
-
Result
SELECT * FROM audit WHERE text_map['Irene'] = 'Cantona' AND text_map['Mark'] = 'Pastore';
id | text_map
----+-----------------------------------------
3 | {'Irene': 'Cantona', 'Mark': 'Pastore'}
(1 rows)
Query using the SAI index for specific keys in the map
column using CONTAINS KEY
:
-
Query
-
Result
SELECT * FROM audit WHERE text_map CONTAINS KEY 'Giovani';
id | text_map
----+---------------------------------------------
2 | {'Frances': 'Giardello', 'Giovani': 'Pasi'}
(1 rows)
Query using the SAI index for specific values in the map
column with CONTAINS
:
-
Query
-
Result
SELECT * FROM audit WHERE text_map CONTAINS 'Silva';
id | text_map
----+------------------------------------------
1 | {'Carlos': 'Perotti', 'Marcel': 'Silva'}
(1 rows)
Remember that in CQL queries using SAI indexes, the CONTAINS
clauses are supported with, and specific to:
-
SAI collection maps with
keys
,values
, andentries
-
SAI collections with
list
andset
types
SAI collection examples with list and set types
These examples demonstrate using collections with the list
and set
types in SAI indexes.
For related information, see:
If you have not already, create the keyspace.
CREATE KEYSPACE IF NOT EXISTS demo3 WITH REPLICATION =
{'class': 'SimpleStrategy', 'replication_factor': '1'};
USE demo3;
Using the list type
Create a calendar
table with a collection of type list
.
CREATE TABLE calendar (key int PRIMARY KEY, years list<int>);
Create an SAI index using the collection’s years
column.
CREATE CUSTOM INDEX ON calendar(years) USING 'StorageAttachedIndex';
Insert some random int
list data for years
, just for demo purposes.
Notice the
|
Query with CONTAINS
example:
-
Query
-
Result
SELECT * FROM calendar WHERE years CONTAINS 1990;
key | years
-----+--------------
0 | [1990, 1996]
2 | [2001, 1990]
(2 rows)
This example created the calendar
table with years list<int>
.
Of course, you could have created the table with years list<text>
, for example, inserted 'string' values, and queried on the strings.
Using the set type
Now create a calendar2
table with a collection of type set
.
CREATE TABLE calendar2 (key int PRIMARY KEY, years set<int>);
Create an SAI index using the collection’s years
column — this time for the calendar2
table.
CREATE CUSTOM INDEX ON calendar2(years) USING 'StorageAttachedIndex';
Insert some random int
set data for years
, again just for demo purposes.
Notice the
|
Query with CONTAINS
example from the list:
-
Query
-
Result
SELECT * FROM calendar2 WHERE years CONTAINS 1990;
key | years
-----+--------------------
0 | {1990, 1996}
2 | {1990, 2001, 2020}
(2 rows)
Remember that in CQL queries using SAI indexes, the CONTAINS
clauses are supported with, and specific to:
-
SAI collection maps with
keys
,values
, andentries
-
SAI collections with
list
andset
types