Query with SAI

This page demonstrates how to run queries on tables with SAI indexes.

To query with SAI, you need a keyspace with a table and one or more SAI indexes.

If you haven’t created an SAI index yet, see the SAI quickstart to learn how to create multiple indexes based on non-primary key columns.

Query operators supported for SAI

Query operators for tables with SAI indexes:

The OR operator is only supported in Astra DB Serverless (vector) databases.

Numerics: =, <, >, <=, >=, AND , OR
Strings: =, AND , OR
Collections: =, CONTAINS, CONTAINS KEY

Vector search with SAI

For an example and explanation of vector search with CQL, see Vector search quickstart.

Single index match on a column

This example uses the following table and indexes:

CREATE TABLE IF NOT EXISTS cycling.comments_vs (
  record_id timeuuid,
  id uuid,
  commenter text,
  comment text,
  comment_vector VECTOR <FLOAT, 5>,
  created_at timestamp,
  PRIMARY KEY (id, created_at)
)
WITH CLUSTERING ORDER BY (created_at DESC);
CREATE CUSTOM INDEX commenter_idx ON cycling.comments_vs (commenter)
  USING 'StorageAttachedIndex' 
  WITH OPTIONS = {'case_sensitive': 'true', 'normalize': 'true', 'ascii': 'true'};
CREATE CUSTOM INDEX created_at_idx ON cycling.comments_vs (created_at) 
  USING 'StorageAttachedIndex';
CREATE CUSTOM INDEX comment_ann_idx ON cycling.comments_vs(comment_vector) 
  USING 'StorageAttachedIndex';

The column commenter is not the partition key in this table, so an index is required to query on it.

Query for a match on that column:

SELECT * FROM cycling.comments_vs WHERE commenter = 'Alex';

Results

 id                                   | created_at                      | comment                                | comment_vector                                                  | commenter | record_id
--------------------------------------+---------------------------------+----------------------------------------+-----------------------------------------------------------------+-----------+--------------------------------------
 e7ae5cf3-d358-4d99-b900-85902fda9bb0 | 2017-04-01 14:33:02.160000+0000 | LATE RIDERS SHOULD NOT DELAY THE START |               b'?fff?\\n=q=\\xf5\\xc2\\x8f=\\xcc\\xcc\\xcd?s33' |      Alex | f259bc00-38c3-11ef-bd85-f92c3c7170c3
 e7ae5cf3-d358-4d99-b900-85902fda9bb0 | 2017-03-21 21:11:09.999000+0000 |      Second rest stop was out of water | b'?}p\\xa4?\\x00\\x00\\x00?}p\\xa4=\\xcc\\xcc\\xcd>\\xae\\x14{' |      Alex | f2583561-38c3-11ef-bd85-f92c3c7170c3
 e7ae5cf3-d358-4d99-b900-85902fda9bb0 | 2017-02-14 20:43:20.000000+0000 | Raining too hard should have postponed |       b'>\\xe6ff=\\xb8Q\\xec<#\\xd7\\n>L\\xcc\\xcd=\\xe1G\\xae' |      Alex | f2557640-38c3-11ef-bd85-f92c3c7170c3

(3 rows)

Note that a failure will occur if you try this query before creating the index:

SELECT * FROM cycling.comments_vs WHERE commenter = 'Alex';

Results

 id                                   | created_at                      | comment                                | comment_vector                                                  | commenter | record_id
--------------------------------------+---------------------------------+----------------------------------------+-----------------------------------------------------------------+-----------+--------------------------------------
 e7ae5cf3-d358-4d99-b900-85902fda9bb0 | 2017-04-01 14:33:02.160000+0000 | LATE RIDERS SHOULD NOT DELAY THE START |               b'?fff?\\n=q=\\xf5\\xc2\\x8f=\\xcc\\xcc\\xcd?s33' |      Alex | f259bc00-38c3-11ef-bd85-f92c3c7170c3
 e7ae5cf3-d358-4d99-b900-85902fda9bb0 | 2017-03-21 21:11:09.999000+0000 |      Second rest stop was out of water | b'?}p\\xa4?\\x00\\x00\\x00?}p\\xa4=\\xcc\\xcc\\xcd>\\xae\\x14{' |      Alex | f2583561-38c3-11ef-bd85-f92c3c7170c3
 e7ae5cf3-d358-4d99-b900-85902fda9bb0 | 2017-02-14 20:43:20.000000+0000 | Raining too hard should have postponed |       b'>\\xe6ff=\\xb8Q\\xec<#\\xd7\\n>L\\xcc\\xcd=\\xe1G\\xae' |      Alex | f2557640-38c3-11ef-bd85-f92c3c7170c3

(3 rows)

Single index match on a column with options

SAI options include case_sensitive, ascii, and normalize.

This example uses the following table and indexes:

CREATE TABLE IF NOT EXISTS cycling.comments_vs (
  record_id timeuuid,
  id uuid,
  commenter text,
  comment text,
  comment_vector VECTOR <FLOAT, 5>,
  created_at timestamp,
  PRIMARY KEY (id, created_at)
)
WITH CLUSTERING ORDER BY (created_at DESC);
CREATE CUSTOM INDEX commenter_idx ON cycling.comments_vs (commenter)
  USING 'StorageAttachedIndex' 
  WITH OPTIONS = {'case_sensitive': 'true', 'normalize': 'true', 'ascii': 'true'};

The column commenter is not the partition key in this table, so an index is required to query on it. If we want to check commenter as a case-sensitive value, we can use the case_sensitive option set to true.

Note that no results are returned if you use an inappropriately case-sensitive value in the query:

SELECT comment,comment_vector,commenter FROM cycling.comments_vs 
  WHERE commenter : 'alex';

Results

 comment               | comment_vector                                        | commenter
-----------------------+-------------------------------------------------------+-----------
 Rest stops were great | b'?}p\xa4?\x00\x00\x00?}p\xa4=\xcc\xcc\xcd>\xae\x14{' |      alex

(1 rows)

When we switch the case of the cyclist’s name to match the case in the index, the query succeeds:

SELECT comment,comment_vector,commenter FROM cycling.comments_vs 
  WHERE commenter : 'Alex';

Results

 comment                                | comment_vector                                        | commenter
----------------------------------------+-------------------------------------------------------+-----------
 LATE RIDERS SHOULD NOT DELAY THE START |            b'?fff?\n=q=\xf5\xc2\x8f=\xcc\xcc\xcd?s33' |      Alex
      Second rest stop was out of water | b'?}p\xa4?\x00\x00\x00?}p\xa4=\xcc\xcc\xcd>\xae\x14{' |      Alex
 Raining too hard should have postponed |      b'>\xe6ff=\xb8Q\xec<#\xd7\n>L\xcc\xcd=\xe1G\xae' |      Alex
      LATE RIDERS SHOULD NOT BE ALLOWED |            b'?fff?\n=q=\xf5\xc2\x8f=\xcc\xcc\xcd?s33' | alexander

(4 rows)

Multiple index matching with operators

These examples use the following table and indexes. Several indexes are created for the table to demonstrate how to query for matches on more than one column.

CREATE TABLE IF NOT EXISTS cycling.comments_vs (
  record_id timeuuid,
  id uuid,
  commenter text,
  comment text,
  comment_vector VECTOR <FLOAT, 5>,
  created_at timestamp,
  PRIMARY KEY (id, created_at)
)
WITH CLUSTERING ORDER BY (created_at DESC);
CREATE CUSTOM INDEX commenter_idx ON cycling.comments_vs (commenter)
  USING 'StorageAttachedIndex' 
  WITH OPTIONS = {'case_sensitive': 'true', 'normalize': 'true', 'ascii': 'true'};
CREATE CUSTOM INDEX created_at_idx ON cycling.comments_vs (created_at) 
  USING 'StorageAttachedIndex';
CREATE CUSTOM INDEX comment_ann_idx ON cycling.comments_vs(comment_vector) 
  USING 'StorageAttachedIndex';

AND: Query for matches on more than one column where both columns must match:

SELECT * FROM cycling.comments_vs 
  WHERE created_at = '2017-03-21 21:11:09.999' 
  AND commenter : 'Alex';

Results

 id                                   | created_at                      | comment                           | comment_vector                                                  | commenter | record_id
--------------------------------------+---------------------------------+-----------------------------------+-----------------------------------------------------------------+-----------+--------------------------------------
 e7ae5cf3-d358-4d99-b900-85902fda9bb0 | 2017-03-21 21:11:09.999000+0000 | Second rest stop was out of water | b'?}p\\xa4?\\x00\\x00\\x00?}p\\xa4=\\xcc\\xcc\\xcd>\\xae\\x14{' |      Alex | f2583561-38c3-11ef-bd85-f92c3c7170c3

(1 rows)

IN: Query for matches from a list of values:

SELECT * FROM cycling.comments_vs 
  WHERE created_at IN ('2017-03-21 21:11:09.999','2017-03-22 01:16:59.001');

Results

 id                                   | created_at                      | comment                           | comment_vector                                                                  | commenter | record_id
--------------------------------------+---------------------------------+-----------------------------------+---------------------------------------------------------------------------------+-----------+--------------------------------------
 e7ae5cf3-d358-4d99-b900-85902fda9bb0 | 2017-03-21 21:11:09.999000+0000 | Second rest stop was out of water |                 b'?}p\\xa4?\\x00\\x00\\x00?}p\\xa4=\\xcc\\xcc\\xcd>\\xae\\x14{' |      Alex | f2583561-38c3-11ef-bd85-f92c3c7170c3
 c7fceba0-c141-4207-9494-a29f9809de6f | 2017-03-22 01:16:59.001000+0000 |     Great snacks at all reststops | b'=\\xcc\\xcc\\xcd>\\xcc\\xcc\\xcd=\\xcc\\xcc\\xcd?\\x05\\x1e\\xb8=\\xb8Q\\xec' |       Amy | f25d3e71-38c3-11ef-bd85-f92c3c7170c3

(2 rows)

OR: Query for a match on either one column or the other:

SELECT * FROM cycling.comments_vs 
  WHERE created_at='2017-03-21 21:11:09.999' 
  OR created_at='2017-03-22 01:16:59.001';

Results

 id                                   | created_at                      | comment                           | comment_vector                                                                  | commenter | record_id
--------------------------------------+---------------------------------+-----------------------------------+---------------------------------------------------------------------------------+-----------+--------------------------------------
 e7ae5cf3-d358-4d99-b900-85902fda9bb0 | 2017-03-21 21:11:09.999000+0000 | Second rest stop was out of water |                 b'?}p\\xa4?\\x00\\x00\\x00?}p\\xa4=\\xcc\\xcc\\xcd>\\xae\\x14{' |      Alex | f2583561-38c3-11ef-bd85-f92c3c7170c3
 c7fceba0-c141-4207-9494-a29f9809de6f | 2017-03-22 01:16:59.001000+0000 |     Great snacks at all reststops | b'=\\xcc\\xcc\\xcd>\\xcc\\xcc\\xcd=\\xcc\\xcc\\xcd?\\x05\\x1e\\xb8=\\xb8Q\\xec' |       Amy | f25d3e71-38c3-11ef-bd85-f92c3c7170c3

(2 rows)

Query with SAI

Query operators supported for SAI

Vector search with SAI

Single index match on a column

Single index match on a column with options

Case-sensitivity

Multiple index matching with operators

See also

Was this helpful?

Give Feedback