Using SAI indexes

Understanding the columns on which you can run queries with SAI indexes.

Let's walk through another example to understand the columns on which you can run queries, and when to add Storage-Attached Indexing (SAI) indexes.

Create the keyspace and table, and insert simple data

Consider the following DDL setup with a primary key composed of an id column partition key, and a name column cluster key:
CREATE KEYSPACE demo2 WITH replication = {'class': 'NetworkTopologyStrategy' , 
  'Cassandra': '1'}; 
CREATE TABLE demo2.person_id_name_primarykey (id int, name text, age int, 
  PRIMARY KEY (id, name));
INSERT INTO demo2.person_id_name_primarykey (id, name, age) VALUES (1, 'John', 21); 
INSERT INTO demo2.person_id_name_primarykey (id, name, age) VALUES (2, 'john', 50); 
INSERT INTO demo2.person_id_name_primarykey (id, name, age) VALUES (3, 'Boris', 43); 
INSERT INTO demo2.person_id_name_primarykey (id, name, age) VALUES (4, 'Caleb', 34); 
Important: Notice that at this point, there is no SAI index defined.

Queries using partition key only

Can you query the table based on the partition key only? Yes:
SELECT * from demo2.person_id_name_primarykey where id = 1 ;
 id | name | age
----+------+-----
  1 | John |  21

Queries using entire partition key

Can you query the table based on the entire primary key (meaning, the partition and clustering keys)? Yes:
SELECT * from demo2.person_id_name_primarykey where id = 1 and name = 'John';
 id | name | age
----+------+-----
  1 | John |  21

Queries using clustering key only?

Can you query using the clustering key alone? No – not possible without an index or ALLOW FILTERING.
SELECT * from demo2.person_id_name_primarykey where name = 'John';
InvalidRequest: Error from server: code=2200 [Invalid query] message="Cannot execute this 
query as it might involve data filtering and thus may have unpredictable performance. If 
you want to execute this query despite the performance unpredictability, use ALLOW FILTERING"

Add an SAI index, and query again

If we now add an SAI index on the clustering column, we can filter results using this column:
CREATE CUSTOM INDEX ON demo2.person_id_name_primarykey (name) USING 'StorageAttachedIndex';
SELECT * from demo2.person_id_name_primarykey where name = 'John';
 id | name | age
----+------+-----
  1 | John |  21

What about creating an SAI index on the table's partitition key?

As shown above, you can query on clustering key columns. However, there is no need to index partition key columns, or to create an SAI index for the partition key column (id, in this example). If attempted, SAI returns an error:
CREATE CUSTOM INDEX ON demo2.person_id_name_primarykey (id) USING 'StorageAttachedIndex';
InvalidRequest: Error from server: code=2200 [Invalid query] message="Cannot create 
secondary index on the only partition key column id"