CREATE CUSTOM INDEX

Creates a storage-attached index.

Creates a Storage-Attached Indexing (SAI) index. You can create multiple secondary indexes on the same database table, with each SAI index based on any column in the table. Exception: there is no need to define an SAI index based on the partition key when it's comprised of only one column; in this case, SAI issues an invalid query message.

Supported databases:
  • DataStax Astra
  • DataStax Enterprise 6.8.3 and later

You can also define an SAI index using one of the columns in the table's composite partition key. A composite partition key means that the partition is based on multiple columns. In this case with an SAI index, you would specify just one of the columns that comprises the composite partition key. You could define a separate SAI index that specifies another single column from the composite partition key.

Defining one or more SAI indexes based on any column in a database table (with the rules noted above) subsequently gives you the ability to run performant queries that specify the indexed column. See the SAI guide.

Synopsis

CREATE CUSTOM INDEX [ IF NOT EXISTS ] [ index_name ]
  ON [ keyspace_name.]table_name (column_name)
     [ (keys(map_name)) ] 
     [ (values(map_name)) ] 
     [ (entries(map_name)) ]
  USING 'StorageAttachedIndex' 
  [ WITH OPTIONS = { option_map } ] ;
Table 1. Legend
Syntax conventions Description
UPPERCASE Literal keyword.
Lowercase Not literal.
Italics Variable value. Replace with a user-defined value.
[] Optional. Square brackets ( [] ) surround optional command arguments. Do not type the square brackets.
( ) Group. Parentheses ( ( ) ) identify a group to choose from. Do not type the parentheses.
| Or. A vertical bar ( | ) separates alternative elements. Type any one of the elements. Do not type the vertical bar.
... Repeatable. An ellipsis ( ... ) indicates that you can repeat the syntax element as often as required.
'Literal string' Single quotation ( ' ) marks must surround literal strings in CQL statements. Use single quotation marks to preserve upper case.
{ key : value } Map collection. Braces ( { } ) enclose map collections or key value pairs. A colon separates the key and the value.
<datatype1,datatype2> Set, list, map, or tuple. Angle brackets ( < > ) enclose data types in a set, list, map, or tuple. Separate the data types with a comma.
cql_statement; End CQL statement. A semicolon ( ; ) terminates all CQL statements.
[--] Separate the command line options from the command arguments with two hyphens ( -- ). This syntax is useful when arguments might be mistaken for command line options.
' <schema> ... </schema> ' Search CQL only: Single quotation marks ( ' ) surround an entire XML schema declaration.
@xml_entity='xml_entity_type' Search CQL only: Identify the entity and literal value to overwrite the XML element in the schema and solrConfig files.
index_name
Optional identifier for index. If no name is specified, the default is used, table_name_column_name_idx. Enclose in quotes to use special characters or to preserve capitalization.
column_name

The name of the table column on which the SAI index is being defined. SAI allows alphanumeric characters and underscores only. SAI returns InvalidRequestException if you try to define an index on a column name that contains other characters, and does not create the index.

map_name
Used with collections, identifier of the map_name specified in CREATE TABLE ... map(map_name). See SAI collection map examples with keys, values, and entries.
Tip: Different syntax applies for collections of type list and set, which are also supported with SAI indexes. In this topic, see SAI collection examples with list and set types.
option_map

Define options in JSON simple format.

Option Description
case_sensitive Ignore case in matching string values.

Default: true.

normalize When set to true, perform Unicode normalization on indexed strings.

SAI supports Normalization Form C (NFC) Unicode. When set to true, SAI normalizes the different versions of a given Unicode character to a single version, retaining all the marks and symbols in the index. For example, SAI would change the character Å (U+212B) to Å (U+00C5).

When implementations keep strings in a normalized form, equivalent strings have a unique binary representation. See Unicode Standard Annex #15, Unicode Normalization Forms.

Default: false.

ascii When set to true, SAI converts alphabetic, numeric, and symbolic characters that are not in the Basic Latin Unicode block (the first 127 ASCII characters) to the ASCII equivalent, if one exists. For example, this option changes à to a.

Default: false.

About query operators

Tip: SAI supports the following query operators for tables with SAI indexes:
  • Numerics: =, <, >, <=, >=, AND
  • Strings: =, CONTAINS, CONTAINS key, CONTAINS VALUES, AND

Not supported: LIKE, IN, OR

About collections

Tip: SAI supports collections of type map, list, and set. See the examples in this topic:

Examples

These examples define SAI indexes for the cycling.cyclist_semi_pro table, which is demonstrated in the SAI quick start.
CREATE CUSTOM INDEX lastname_sai_idx ON cycling.cyclist_semi_pro (lastname) 
USING 'StorageAttachedIndex' 
WITH OPTIONS = {'case_sensitive': 'false', 'normalize': 'true', 'ascii': 'true'}; 

CREATE CUSTOM INDEX age_sai_idx ON cycling.cyclist_semi_pro (age) 
USING 'StorageAttachedIndex';

CREATE CUSTOM INDEX country_sai_idx ON cycling.cyclist_semi_pro (country) 
USING 'StorageAttachedIndex'
WITH OPTIONS = {'case_sensitive': 'false', 'normalize': 'true', 'ascii': 'true'}; 

CREATE CUSTOM INDEX registration_sai_idx ON cycling.cyclist_semi_pro (registration) 
USING 'StorageAttachedIndex'; 

For sample queries that find data in cycling.cyclist_semi_pro via these sample SAI indexes, see Submit CQL queries. Also refer to Examine SAI column index and query rules.

SAI collection map examples with keys, values, and entries

The following examples demonstrate using collection maps of multiple types (keys, values, entries) in SAI indexes. For related information, see Creating collections and Using map type.

Also refer to the SAI collection examples of type list and set in this topic.

First, create the keyspace.
Note: If you are logged into the DataStax Astra CQL Console, you already have a keyspace defined for your cloud database; skip this next step and specify your keyspace name in the USE command.
CREATE KEYSPACE demo3 WITH replication = 
       {'class': 'SimpleStrategy', 'replication_factor': '1'};
USE demo3;
Create an audit table, with a collection map named text_map.
CREATE TABLE audit ( id int PRIMARY KEY , text_map map<text, text>);
Create multiple SAI collection maps on the same column, each using keys, values, and entries.
Note: Creating multiple SAI collection indexes with different map types on the same column requires DSE 6.8.4 or later. If you're using DSE 6.8.3, submit a DROP INDEX index-name; command before adding the next map type on the same column.
CREATE CUSTOM INDEX ON audit (keys(text_map)) USING 'StorageAttachedIndex';
CREATE CUSTOM INDEX ON audit (values(text_map)) USING 'StorageAttachedIndex';
CREATE CUSTOM INDEX ON audit (entries(text_map)) USING 'StorageAttachedIndex';
Insert some data:
INSERT INTO audit (id, text_map) values (1, {'Carlos':'Perotti', 'Marcel':'Silva'});
INSERT INTO audit (id, text_map) values (2, {'Giovani':'Pasi', 'Frances':'Giardello'});
INSERT INTO audit (id, text_map) values (3, {'Mark':'Pastore', 'Irene':'Cantona'});
Query all example:
SELECT * FROM audit;
 id | text_map
----+---------------------------------------------
  1 | {'Carlos': 'Perotti', 'Marcel': 'Silva'}
  2 | {'Frances': 'Giardello', 'Giovani': 'Pasi'}
  3 | {'Irene': 'Cantona', 'Mark': 'Pastore'}

(3 rows)
Query example that uses the SAI index with collection map type entries:
SELECT * FROM audit WHERE text_map['Irene'] = 'Cantona' AND text_map['Mark'] = 'Pastore';
 id | text_map
----+-----------------------------------------
  3 | {'Irene': 'Cantona', 'Mark': 'Pastore'}

(1 rows)
Query example with CONTAINS KEY that uses the SAI with collection map type keys:
SELECT * FROM audit WHERE text_map CONTAINS KEY 'Giovani';
 id | text_map
----+---------------------------------------------
  2 | {'Frances': 'Giardello', 'Giovani': 'Pasi'}

(1 rows)
Query example with CONTAINS that uses the SAI index with map type values:
SELECT * FROM audit WHERE text_map CONTAINS 'Silva';
 id | text_map
----+------------------------------------------
  1 | {'Carlos': 'Perotti', 'Marcel': 'Silva'}

(1 rows)
Tip: In CQL queries of database tables with SAI indexes, the CONTAINS clauses are supported with, and specific to:
  • SAI collection maps with keys, values, and entries
  • SAI collections with list and set types

SAI collection examples with list and set types

These examples demonstrate using collections with the list and set types in SAI indexes. For related information, see:
If you have not already, create the keyspace.
Note: If you are logged into the DataStax Astra CQL Console, you already have a keyspace defined for your cloud database; skip this next step and specify your keyspace name in the USE command.
CREATE KEYSPACE IF NOT EXISTS demo3 WITH replication = 
       {'class': 'SimpleStrategy', 'replication_factor': '1'};
USE demo3;

Using the list type

Create a calendar table with a collection of type list.
CREATE TABLE calendar (key int PRIMARY KEY, years list<int>);
Create an SAI index using the collection's years column.
CREATE CUSTOM INDEX ON calendar(years) USING 'StorageAttachedIndex';
Insert some random int list data for years, just for demo purposes.
Tip: Notice the INSERT command's square brackets syntax for list values.
INSERT INTO calendar (key, years) VALUES (0, [1990,1996]);
INSERT INTO calendar (key, years) VALUES (1, [2000,2010]);
INSERT INTO calendar (key, years) VALUES (2, [2001,1990]);
Query with CONTAINS example:
SELECT * FROM calendar WHERE years CONTAINS 1990;
 key | years
-----+--------------
   0 | [1990, 1996]
   2 | [2001, 1990]
(2 rows)

This example created the calendar table with years list<int>. Of course, you could have created the table with years list<text>, for example, inserted 'string' values, and queried on the strings.

Using the set type

Now create a calendar2 table with a collection of type set.
CREATE TABLE calendar2 (key int PRIMARY KEY, years set<int>);
Create an SAI index using the collection's years column – this time for the calendar2 table.
CREATE CUSTOM INDEX ON calendar2(years) USING 'StorageAttachedIndex';
Insert some random int set data for years, again just for demo purposes.
Tip: Notice the INSERT command's curly braces syntax for set values.
INSERT INTO calendar2 (key, years) VALUES (0, {1990,1996});
INSERT INTO calendar2 (key, years) VALUES (1, {2000,2010});
INSERT INTO calendar2 (key, years) VALUES (2, {2001,1990,2020});
Query with CONTAINS example from the list:
SELECT * FROM calendar2  WHERE years CONTAINS 1990;
 key | years
-----+--------------------
   0 |       {1990, 1996}
   2 | {1990, 2001, 2020}

(2 rows)
Tip: In CQL queries of database tables with SAI indexes, the CONTAINS clauses are supported with, and specific to:
  • SAI collection maps with keys, values, and entries
  • SAI collections with list and set types
.