• Glossary
  • Support
  • Downloads
  • DataStax Home
Get Live Help
Expand All
Collapse All

CQL for Astra DB Serverless

  • Overview
    • Cassandra structure
    • About CQL
    • CQL quick reference
    • Known issues
  • Getting started
    • Vector Search Quickstart
    • CQL Quickstart
    • SAI Quickstart
  • Planning
    • Data modeling
      • Data modeling concepts
      • Data modeling analysis
  • Connecting
    • cqlsh (startup options)
      • Accessing data using CQL
        • Starting the standalone cqlsh tool
        • Starting cqlsh
    • Developing
      • Introduction to CQL
      • Managing keyspaces
        • Keyspace concepts
        • Check a keyspace
      • Managing tables
        • Table concepts
        • Data types
        • Create a table
        • Check a table
        • Alter a table
        • Truncate a table
        • Drop a table
      • Managing columns
        • Managing columns
        • Create a table column
        • Check column existence
        • Alter a table column
        • Drop a column
        • User-defined type (UDT) overview
          • Create a user-defined type (UDT)
          • Check for user-defined type (UDT) existence
          • Alter a user-defined type (UDT)
          • Drop a user-defined type (UDT)
      • Indexing Tables
        • Indexing concepts
        • Storage-Attached Indexing (SAI) Overview
          • SAI concepts
          • SAI Quickstart
          • SAI FAQ
          • Create SAI index
          • Alter SAI index
          • Drop SAI index
          • Querying with SAI
          • SAI write path and read path
        • Secondary indexes (2i) overview
          • Secondary index (2i) concepts
          • When to use an index
          • Building and maintaining secondary indexes (2i)
          • Create a secondary index (2i)
            • Create a 2i on a collection column
          • Check secondary index (2i) existence
          • Alter a secondary index (2i)
          • Drop a secondary index (2i)
      • Inserting and updating data
        • Insert simple data
        • Set column
        • List fields
        • Map fields
        • Tuple
        • User-defined type (UDT) column
        • Insert static data
        • UUID and timeuuid column
        • Date column
        • Time column
        • Timestamp column
        • Duration column
        • Update counter
        • Lightweight transactions (LWTs)
        • Inserting JSON formatted values
        • Batching inserts and updates
          • Batching inserts, updates, and deletes
          • Good use of BATCH statement
          • Misuse of BATCH statement
        • Importing data from a CSV file
        • Setting the time-to-live (TTL) for value
          • Inserting, updating, and deleting data
      • Deleting data
        • Deleting values from a column or entire row
        • Expiring data with time-to-live
        • Expiring data with TTL example
      • Querying and retrieving data
        • Using a SELECT statement
        • Restricting queries using WHERE clauses
          • Partition keys
          • Clustering columns
          • Retrieval using the IN keyword
          • Sorting and limiting results
        • Returning data from collection columns
        • Aggregating results
        • Using user-defined functions (UDF)
        • Applying user-defined aggregate (UDA) functions
          • Testing performance impact using tracing
        • Formatting query results as JSON
        • Retrieving the write timestamp
        • Using tracing
          • Tracing performance
        • DSE Search index querying
        • Search index filtering best practices
          • Search index syntax
          • search-index:si-query-syntax.adoc#siIndexedColumns
          • search-index:si-query-syntax.adoc#basicSolrSyntax
          • search-index:si-query-syntax.adoc#queriesJSON
          • search-index:si-query-syntax.adoc#siEscapeCharacters
        • Native CQL search queries
        • Identifying the partition key
        • Filtering on terms
          • Filtering on words, phrases, or substrings
          • Fuzzy search
        • UDT query examples
        • Querying a CQL collection set
        • Using date ranges in solr_query
        • Geospatial queries for Point and LineString
        • Spatial queries with polygons require JTS
        • Using dynamic fields
        • Limiting results and paging
        • Restricted query routing
    • Securing overview
      • Security quickstart
      • Using cqlsh with authentication
      • System tables for authentication and authorization
  • Cycling Examples
    • Setting up the cycling keyspace
    • User-defined types (UDTs)
      • basic_info
      • basic_info_withTTL
      • fullname
      • race
    • User-defined functions (UDFs) and user-defined aggregates (UDAs)
      • flog
      • left
      • average_state
      • average_final
      • Cycling user-defined aggregate team_average
    • Tables
      • birthday_list
      • calendar
      • comments
      • country_flag
      • cyclist_alt_stats
      • cyclist_base
      • cyclist_career_teams
      • cyclist_category
      • cyclist_emails
      • cyclist_expenses
      • cyclist_id
      • cyclist_name
      • cyclist_points
      • cyclist_races
      • cyclist_sponsors
      • cyclist_stats
      • cyclist_teams
      • events
      • popular_count
      • race_sponsors
      • race_starts
      • race_times
      • race_winners
      • rank_by_year_and_name
      • team_average
      • upcoming_calendar
    • Materialized views (MVs)
      • cyclist_by_age-mv
    • Search indexes
      • Search index examples
      • comments search index
    • Access control
      • Cycling internal
    • Cycling queries
      • Aggregating cycling data
    • Reference
      • CQL quick reference
      • Understanding the CQL command syntax
      • Data types
        • Creating a blob column
        • Date, time, and timestamp format
      • Operators
      • CQL native functions
      • CQL native aggregates
      • System and virtual tables
        • System tables
          • Keyspace tables
          • Querying system tables
          • Functions, aggregates, and user types
        • Virtual tables
          • Virtual keyspaces and tables
          • Virtual tables for SAI indexes and SSTables
      • cqlsh (startup options)
        • Configuring a cqlshrc file
        • Starting cqlsh on a search node
      • CQL shell (cqlsh) reference
        • CAPTURE
        • CLEAR
        • CONSISTENCY
        • COPY TO
        • COPY FROM
        • DESCRIBE AGGREGATE
        • DESCRIBE CLUSTER
        • DESCRIBE SCHEMA
        • DESCRIBE FUNCTION
        • DESCRIBE INDEX
        • DESCRIBE KEYSPACE
        • DESCRIBE MATERIALIZED VIEW
        • DESCRIBE SEARCH INDEX
        • DESCRIBE TABLE
        • DESCRIBE TYPE
        • EXECUTE AS
        • EXPAND
        • EXIT
        • LOGIN
        • PAGING
        • SERIAL CONSISTENCY
        • SHOW
        • SOURCE
        • TIMING
        • TRACING
      • CQL commands
        • ALTER TABLE
        • ALTER TYPE
        • ALTER USER (Deprecated)
        • BATCH
        • CREATE AGGREGATE
        • CREATE CUSTOM INDEX
        • CREATE INDEX
        • CREATE TABLE
        • CREATE TYPE
        • CREATE USER (Deprecated)
        • DELETE
        • DROP INDEX
        • DROP TABLE
        • DROP TYPE
        • DROP USER (Deprecated)
        • GRANT
        • INSERT
        • RESTRICT
        • REVOKE
        • SELECT
        • TRUNCATE
        • UNRESTRICT
        • UNRESTRICT ROWS
        • UPDATE
        • USE
  • CQL for Astra DB Serverless
  • Reference
  • CQL commands
  • CREATE CUSTOM INDEX
Edit this Page

CREATE CUSTOM INDEX

Supported databases:

  • DataStax Astra DB serverless and DataStax Astra DB classic

  • DataStax Enterprise (DSE) 6.8.3 and later

Creates a Storage-Attached Indexing (SAI) index. You can create multiple secondary indexes on the same database table, with each SAI index based on any column in the table. All column date types except the following are supported for SAI indexes:

  • counter

  • geospatial types: PointType, LineStringType, PolygonType

  • non-frozen user-defined type (UDT)

One exception

You cannot define an SAI index based on the partition key when it’s comprised of only one column. If you attempt to create an SAI index in this case, SAI issues an error message.

However, you can define an SAI index on one of the columns in a table’s composite partition key, i.e., a partition key comprised of multiple columns. If you need to query based on one of those columns, an SAI index is a helpful option. In fact, you can define an SAI index on each column in a composite partition key, if needed.

Defining one or more SAI indexes based on any column in a database table (with the rules noted above) subsequently gives you the ability to run performant queries that use the indexed column to filter results.

See the SAI section.

Synopsis

CREATE CUSTOM INDEX [ IF NOT EXISTS ] [ <index_name> ]
  ON [ <keyspace_name>.]<table_name> (<column_name>)
    | [ (KEYS(<map_name>)) ]
    | [ (VALUES(<map_name>)) ]
    | [ (ENTRIES(<map_name>)) ]
  USING 'StorageAttachedIndex'
  [ WITH OPTIONS = { <option_map> } ] ;
Syntax legend
Table 1. Legend
Syntax conventions Description

UPPERCASE

Literal keyword.

Lowercase

Not literal.

< >

Variable value. Replace with a user-defined value.

[]

Optional. Square brackets ([]) surround optional command arguments. Do not type the square brackets.

( )

Group. Parentheses ( ( ) ) identify a group to choose from. Do not type the parentheses.

|

Or. A vertical bar (|) separates alternative elements. Type any one of the elements. Do not type the vertical bar.

...

Repeatable. An ellipsis ( ... ) indicates that you can repeat the syntax element as often as required.

'<Literal string>'

Single quotation (') marks must surround literal strings in CQL statements. Use single quotation marks to preserve upper case.

{ <key> : <value> }

Map collection. Braces ({ }) enclose map collections or key value pairs. A colon separates the key and the value.

<datatype2

Set, list, map, or tuple. Angle brackets ( < > ) enclose data types in a set, list, map, or tuple. Separate the data types with a comma.

<cql_statement>;

End CQL statement. A semicolon (;) terminates all CQL statements.

[--]

Separate the command line options from the command arguments with two hyphens ( -- ). This syntax is useful when arguments might be mistaken for command line options.

' <<schema\> ... </schema\>> '

Search CQL only: Single quotation marks (') surround an entire XML schema declaration.

@<xml_entity>='<xml_entity_type>'

Search CQL only: Identify the entity and literal value to overwrite the XML element in the schema and solrConfig files.

index_name

Optional identifier for index. If no name is specified, the default is used, <table_name>_<column_name>_idx. Enclose in quotes to use special characters or to preserve capitalization.

Index names are unique per keyspace. The index name must be a unique identifier for the index for each table within a keyspace. This requirement is true for both vector and non-vector indexes.

column_name

The name of the table column on which the SAI index is being defined. SAI allows only alphanumeric characters and underscores in names. SAI returns InvalidRequestException if you try to define an index on a column name that contains other characters, and does not create the index.

map_name

Used with collections, identifier of the map_name specified in CREATE TABLE …​ map(<map_name>). The regular column syntax applies for collection types list and set.

option_map

Define options in JSON simple format.

Option Description

case_sensitive

Ignore case in matching string values. Default: true.

normalize

When set to true, perform Unicode normalization on indexed strings. SAI supports Normalization Form C (NFC) Unicode. When set to true, SAI normalizes the different versions of a given Unicode character to a single version, retaining all the marks and symbols in the index. For example, SAI would change the character Å (U+212B) to Å (U+00C5).

When implementations keep strings in a normalized form, equivalent strings have a unique binary representation. See Unicode Standard Annex #15, Unicode Normalization Forms.

Default: false.

ascii

When set to true, SAI converts alphabetic, numeric, and symbolic characters that are not in the Basic Latin Unicode block (the first 127 ASCII characters) to the ASCII equivalent, if one exists. For example, this option changes à to a. Default: false.

similarity_function

Vector search relies on computing the similarity or distance between vectors to identify relevant matches. The similarity function is used to compute the similarity between two vectors. Valid options are: EUCLIDEAN, DOT_PRODUCT, COSINE Default: COSINE

Query operators

SAI supports the following query operators for tables with SAI indexes:

  • Numerics: =, <, >, ⇐, >=, AND, IN, OR (for vector databases)

  • Strings: = CONTAINS, CONTAINS KEY, AND, IN

SAI does not supports the following query operators for tables with SAI indexes:

  • Strings or Numerics: LIKE, OR (for non-vector databases)

Examples

These examples define SAI indexes for the cycling.cyclist_semi_pro table, which is demonstrated in the SAI quickstart.

CREATE CUSTOM INDEX lastname_sai_idx ON cycling.cyclist_semi_pro (lastname) 
USING 'StorageAttachedIndex' 
WITH OPTIONS = {'case_sensitive': 'false', 'normalize': 'true', 'ascii': 'true'}; 

CREATE CUSTOM INDEX age_sai_idx ON cycling.cyclist_semi_pro (age) 
USING 'StorageAttachedIndex';

CREATE CUSTOM INDEX country_sai_idx ON cycling.cyclist_semi_pro (country) 
USING 'StorageAttachedIndex'
WITH OPTIONS = {'case_sensitive': 'false', 'normalize': 'true', 'ascii': 'true'}; 

CREATE CUSTOM INDEX registration_sai_idx ON cycling.cyclist_semi_pro (registration) 
USING 'StorageAttachedIndex'; 

For sample queries that find data in cycling.cyclist_semi_pro via these sample SAI indexes, see Submit CQL queries. Also refer Examine SAI column index and query rules.

SAI collection map examples with keys, values, and entries

The following examples demonstrate using collection maps of multiple types (keys, values, entries) in SAI indexes. For related information, see Creating collections and Using map type.

Also refer to the SAI collection examples of type list and set in this topic.

If you are logged into the DataStax Astra CQL Console, you already have a keyspace defined for your cloud database; skip this next step and specify your keyspace name with the USE command.

Next, use the keyspace:

USE demo3;

Create an audit table, with a collection map named text_map:

CREATE TABLE audit ( id int PRIMARY KEY , text_map map<text, text>);

Create multiple SAI indexes on the same map column, each using KEYS, VALUES, and ENTRIES.

CREATE CUSTOM INDEX ON audit (KEYS(text_map)) USING 'StorageAttachedIndex';
CREATE CUSTOM INDEX ON audit (VALUES(text_map)) USING 'StorageAttachedIndex';
CREATE CUSTOM INDEX ON audit (ENTRIES(text_map)) USING 'StorageAttachedIndex';

Insert some data:

INSERT INTO audit (id, text_map) values (1, {'Carlos':'Perotti', 'Marcel':'Silva'});
INSERT INTO audit (id, text_map) values (2, {'Giovani':'Pasi', 'Frances':'Giardello'});
INSERT INTO audit (id, text_map) values (3, {'Mark':'Pastore', 'Irene':'Cantona'});

Query all data:

  • Query

  • Result

SELECT * FROM audit;
 id | text_map
----+---------------------------------------------
  1 | {'Carlos': 'Perotti', 'Marcel': 'Silva'}
  2 | {'Frances': 'Giardello', 'Giovani': 'Pasi'}
  3 | {'Irene': 'Cantona', 'Mark': 'Pastore'}

(3 rows)

Query using the SAI index for specific entries in the map column:

  • Query

  • Result

SELECT * FROM audit WHERE text_map['Irene'] = 'Cantona' AND text_map['Mark'] = 'Pastore';
 id | text_map
----+-----------------------------------------
  3 | {'Irene': 'Cantona', 'Mark': 'Pastore'}

(1 rows)

Query using the SAI index for specific keys in the map column using CONTAINS KEY:

  • Query

  • Result

SELECT * FROM audit WHERE text_map CONTAINS KEY 'Giovani';
 id | text_map
----+---------------------------------------------
  2 | {'Frances': 'Giardello', 'Giovani': 'Pasi'}

(1 rows)

Query using the SAI index for specific values in the map column with CONTAINS:

  • Query

  • Result

SELECT * FROM audit WHERE text_map CONTAINS 'Silva';
 id | text_map
----+------------------------------------------
  1 | {'Carlos': 'Perotti', 'Marcel': 'Silva'}

(1 rows)

Remember that in CQL queries using SAI indexes, the CONTAINS clauses are supported with, and specific to:

  • SAI collection maps with keys, values, and entries

  • SAI collections with list and set types

SAI collection examples with list and set types

These examples demonstrate using collections with the list and set types in SAI indexes. For related information, see:

  • Creating collections

  • Using list type

  • Using set type

If you have not already, create the keyspace.

CREATE KEYSPACE IF NOT EXISTS demo3 WITH REPLICATION =
       {'class': 'SimpleStrategy', 'replication_factor': '1'};
USE demo3;

Using the list type

Create a calendar table with a collection of type list.

CREATE TABLE calendar (key int PRIMARY KEY, years list<int>);

Create an SAI index using the collection’s years column.

CREATE CUSTOM INDEX ON calendar(years) USING 'StorageAttachedIndex';

Insert some random int list data for years, just for demo purposes.

Notice the INSERT command’s square brackets syntax for list values.

INSERT INTO calendar (key, years) VALUES (0, [1990,1996]);
INSERT INTO calendar (key, years) VALUES (1, [2000,2010]);
INSERT INTO calendar (key, years) VALUES (2, [2001,1990]);

Query with CONTAINS example:

  • Query

  • Result

SELECT * FROM calendar WHERE years CONTAINS 1990;
 key | years
-----+--------------
   0 | [1990, 1996]
   2 | [2001, 1990]
(2 rows)

This example created the calendar table with years list<int>. Of course, you could have created the table with years list<text>, for example, inserted 'string' values, and queried on the strings.

Using the set type

Now create a calendar2 table with a collection of type set.

CREATE TABLE calendar2 (key int PRIMARY KEY, years set<int>);

Create an SAI index using the collection’s years column — this time for the calendar2 table.

CREATE CUSTOM INDEX ON calendar2(years) USING 'StorageAttachedIndex';

Insert some random int set data for years, again just for demo purposes.

Notice the INSERT command’s curly braces syntax for set values.

INSERT INTO calendar2 (key, years) VALUES (0, {1990,1996});
INSERT INTO calendar2 (key, years) VALUES (1, {2000,2010});
INSERT INTO calendar2 (key, years) VALUES (2, {2001,1990,2020});

Query with CONTAINS example from the list:

  • Query

  • Result

SELECT * FROM calendar2  WHERE years CONTAINS 1990;
 key | years
-----+--------------------
   0 |       {1990, 1996}
   2 | {1990, 2001, 2020}

(2 rows)

Remember that in CQL queries using SAI indexes, the CONTAINS clauses are supported with, and specific to:

  • SAI collection maps with keys, values, and entries

  • SAI collections with list and set types

CREATE AGGREGATE CREATE INDEX

General Inquiries: +1 (650) 389-6000 info@datastax.com

© DataStax | Privacy policy | Terms of use

DataStax, Titan, and TitanDB are registered trademarks of DataStax, Inc. and its subsidiaries in the United States and/or other countries.

Apache, Apache Cassandra, Cassandra, Apache Tomcat, Tomcat, Apache Lucene, Apache Solr, Apache Hadoop, Hadoop, Apache Pulsar, Pulsar, Apache Spark, Spark, Apache TinkerPop, TinkerPop, Apache Kafka and Kafka are either registered trademarks or trademarks of the Apache Software Foundation or its subsidiaries in Canada, the United States and/or other countries.

Kubernetes is the registered trademark of the Linux Foundation.

landing_page landingpage