Using a secondary index

Create a secondary index on a column.

Certain queries return an error. For example, you might want to see the cyclist rankings for a particular year. The rank_by_year_and_name table has a composite partition key with two columns, race_year and race_name; this query fails because only the race_year column is specified in the WHERE clause:

SELECT *
FROM cycling.rank_by_year_and_name
WHERE race_year = 2015;

InvalidRequest: Error from server: code=2200 [Invalid query] message=
"Cannot execute this query as it might involve data filtering and
thus may have unpredictable performance. If you want to execute this
query despite the performance unpredictability, use ALLOW FILTERING"

To run the previous query successfully, you can create a secondary index on the race_year column.

This CREATE INDEX statement adds the index and then the query works:

CREATE INDEX ryear ON
cycling.rank_by_year_and_name (race_year);

SELECT *
FROM cycling.rank_by_year_and_name
WHERE race_year = 2015;

 race_year | race_name                                  | rank | cyclist_name
-----------+--------------------------------------------+------+-----------------
      2015 |   Giro d'Italia - Stage 11 - Forli > Imola |    1 |   Ilnur ZAKARIN
      2015 |   Giro d'Italia - Stage 11 - Forli > Imola |    2 | Carlos BETANCUR
      2015 | Tour of Japan - Stage 4 - Minami > Shinshu |    1 | Benjamin PRADES
      2015 | Tour of Japan - Stage 4 - Minami > Shinshu |    2 |     Adam PHELAN
      2015 | Tour of Japan - Stage 4 - Minami > Shinshu |    3 |    Thomas LEBAS

(5 rows)

Points to consider before using secondary indexes

Secondary indexes can impact performance when multiple nodes are accessed because the index table is stored on each node in a cluster. Before creating a secondary index, see when not to create an index. Guidelines:

In general, only add a secondary index a column with low cardinality, which means that the column contains a few unique values.
If you have a high cardinality column, consider creating a separate table to store the data in the column on which you want to create a secondary index.