Examine SAI column index and query rules
Understand the columns on which you can define SAI indexes and run queries.
Let's walk through another example to understand the columns on which you can define SAI indexes and subsequently run queries that reference those columns.
While this guide's prior topic, SAI quick start, focused on defining multiple indexes based on non-primary key columns (a very useful feature), this topic expands the discussion to SAI indexes on each column in a composite partition key. Then this topic adds an example using collection maps.
About composite partition keys
SAI makes it possible to define an index using a single column in the table's composite partition key. A composite partition key means that the partition is made up of multiple columns. In this case with an SAI index, you would specify just one of the columns that comprises the composite partition key, as shown in this topic. You could also define a separate SAI index that specifies another single column from the composite partition key.
Without an SAI index defined, in your query, the WHERE
clause would need
to specify all the partition key columns or use ALLOW FILTERING
. This type
of query can result in unpredictable performance. SAI solves this problem – see the examples
below.
Create the keyspace and table, and insert simple data
demo2
keyspace:
CREATE KEYSPACE demo2 WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'};
id
and age
. We also define a name
column cluster key:
CREATE TABLE demo2.person_id_name_compositkey2 (id int, age int, name text, primary key ((id, age), name));
INSERT INTO demo2.person_id_name_compositkey2 (id, name, age) VALUES (1, 'Rocco', 21); INSERT INTO demo2.person_id_name_compositkey2 (id, name, age) VALUES (2, 'John', 50); INSERT INTO demo2.person_id_name_compositkey2 (id, name, age) VALUES (3, 'Bryn', 33); INSERT INTO demo2.person_id_name_compositkey2 (id, name, age) VALUES (4, 'Jason', 34);
Queries before SAI indexes are defined
SELECT * from demo2.person_id_name_compositkey2 where id > 0;
InvalidRequest: Error from server: code=2200 [Invalid query] message="Cannot execute this query as it might involve data filtering and thus may have unpredictable performance. If you want to execute this query despite the performance unpredictability, use ALLOW FILTERING"
Define SAI indexes
ALLOW FILTERING
in your queries, now define SAI indexes.
Examples:
CREATE CUSTOM INDEX ON demo2.person_id_name_compositkey2 (id) USING 'StorageAttachedIndex';
CREATE CUSTOM INDEX ON demo2.person_id_name_compositkey2 (age) USING 'StorageAttachedIndex';
Run the example query again
SELECT * from demo2.person_id_name_compositkey2 where id > 0;
id | age | name ----+-----+------- 3 | 33 | Bryn 1 | 21 | Rocco 2 | 50 | John 4 | 34 | Jason (4 rows)(Try another query that specifies the composite partition's
age
column in the
query.)What about SAI and a partition key with only one column?
CREATE CUSTOM INDEX ON demo2.person_id_name_primarykey (id) USING 'StorageAttachedIndex';
InvalidRequest: Error from server: code=2200 [Invalid query] message="Cannot create secondary index on the only partition key column id"
Using collections with SAI
map
, list
, and
set
. Using these features with SAI indexes was introduced in DSE 6.8.3.
Collections allow you to group and store data together in a column. In a relational database, a grouping such as a user's multiple email addresses is achieved
via many-to-one joined relationship between (for example) a user
table and
an email
table. DSE avoids joins between two tables by storing the user's
email addresses in a collection column in the user
table. Each collection
specifies the data type of the data held.
A collection is appropriate if the data for collection storage is limited. If the data has unbounded growth potential, like messages sent or sensor events registered every second, do not use collections. Instead, use a table with a compound primary key where data is stored in the clustering columns.
- SAI collection map examples with keys, values, and entries
- SAI collection examples with list and set types
CONTAINS
clauses are supported with, and specific to: - SAI collection maps with
keys
,values
, andentries
- SAI collections with
list
andset
types
SAI collection map examples with keys, values, and entries
The following examples demonstrate using collection maps of multiple types (keys
, values
, entries
) in SAI indexes.
For related information, see and .
USE
command.CREATE KEYSPACE demo3 WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'};
USE demo3;
audit
table, with a collection map named
text_map
.
CREATE TABLE audit ( id int PRIMARY KEY , text_map map<text, text>);
keys
,
values
, and entries
. DROP INDEX
index-name;
command before adding the next map type on the
same column.
CREATE CUSTOM INDEX ON audit (keys(text_map)) USING 'StorageAttachedIndex';
CREATE CUSTOM INDEX ON audit (values(text_map)) USING 'StorageAttachedIndex';
CREATE CUSTOM INDEX ON audit (entries(text_map)) USING 'StorageAttachedIndex';
INSERT INTO audit (id, text_map) values (1, {'Carlos':'Perotti', 'Marcel':'Silva'}); INSERT INTO audit (id, text_map) values (2, {'Giovani':'Pasi', 'Frances':'Giardello'}); INSERT INTO audit (id, text_map) values (3, {'Mark':'Pastore', 'Irene':'Cantona'});
SELECT * FROM audit;
id | text_map ----+--------------------------------------------- 1 | {'Carlos': 'Perotti', 'Marcel': 'Silva'} 2 | {'Frances': 'Giardello', 'Giovani': 'Pasi'} 3 | {'Irene': 'Cantona', 'Mark': 'Pastore'} (3 rows)
CONTAINS KEY
that uses the SAI with
collection map
type
keys
:SELECT * FROM audit WHERE text_map CONTAINS KEY 'Giovani';
id | text_map ----+--------------------------------------------- 2 | {'Frances': 'Giardello', 'Giovani': 'Pasi'} (1 rows)
map
type
entries
:SELECT * FROM audit WHERE text_map['Irene'] = 'Cantona' AND text_map['Mark'] = 'Pastore';
id | text_map ----+----------------------------------------- 3 | {'Irene': 'Cantona', 'Mark': 'Pastore'} (1 rows)
CONTAINS
that uses the SAI index with map type
values
:SELECT * FROM audit WHERE text_map CONTAINS 'Silva';
id | text_map ----+------------------------------------------ 1 | {'Carlos': 'Perotti', 'Marcel': 'Silva'} (1 rows)
SAI collection examples with list and set types
list
and
set
types in SAI indexes. For related information, see: USE
command.CREATE KEYSPACE IF NOT EXISTS demo3 WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'};
USE demo3;
Using the list type
calendar
table with a collection of type
list
.
CREATE TABLE calendar (key int PRIMARY KEY, years list<int>);
years
column.
CREATE CUSTOM INDEX ON calendar(years) USING 'StorageAttachedIndex';
int
list data for years
, just for demo
purposes. INSERT
command's square brackets
syntax for list
values.INSERT INTO calendar (key, years) VALUES (0, [1990,1996]); INSERT INTO calendar (key, years) VALUES (1, [2000,2010]); INSERT INTO calendar (key, years) VALUES (2, [2001,1990]);
CONTAINS
example from the list:
SELECT * FROM calendar WHERE years CONTAINS 1990;
key | years -----+-------------- 0 | [1990, 1996] 2 | [2001, 1990] (2 rows)
This example created the calendar
table with years list<int>
. Of
course, you could have created the table with years list<text>
, for
example, inserted 'string' values, and queried on the strings.
Using the set type
calendar2
table with a collection of type
set
.
CREATE TABLE calendar2 (key int PRIMARY KEY, years set<int>);
years
column – this time for
the calendar2
table.
CREATE CUSTOM INDEX ON calendar2(years) USING 'StorageAttachedIndex';
int
set data for years
, again just for
demo purposes. INSERT
command's curly braces
syntax for set
values.INSERT INTO calendar2 (key, years) VALUES (0, {1990,1996}); INSERT INTO calendar2 (key, years) VALUES (1, {2000,2010}); INSERT INTO calendar2 (key, years) VALUES (2, {2001,1990,2020});
CONTAINS
example from the list:
SELECT * FROM calendar2 WHERE years CONTAINS 1990;
key | years -----+-------------------- 0 | {1990, 1996} 2 | {1990, 2001, 2020} (2 rows)
What's next?
The next topic looks at configuration options for SAI indexes.