Using a secondary index
Create a secondary index on a column.
Certain queries return an error. For example, you might want to see the cyclist rankings
for a particular year. The
rank_by_year_and_name
table has a composite
partition key with two columns, race_year
and race_name
;
this query fails because only the race_year
column is specified in the
WHERE
clause:SELECT * FROM cycling.rank_by_year_and_name WHERE race_year = 2015;
InvalidRequest: Error from server: code=2200 [Invalid query] message=
"Cannot execute this query as it might involve data filtering and
thus may have unpredictable performance. If you want to execute this
query despite the performance unpredictability, use ALLOW FILTERING"
To run the previous query successfully, you can create a secondary index on the
race_year
column.
This
CREATE INDEX
statement adds the index and then the query
works:
CREATE INDEX ryear ON cycling.rank_by_year_and_name (race_year);
SELECT * FROM cycling.rank_by_year_and_name WHERE race_year = 2015;
race_year | race_name | rank | cyclist_name
-----------+--------------------------------------------+------+-----------------
2015 | Giro d'Italia - Stage 11 - Forli > Imola | 1 | Ilnur ZAKARIN
2015 | Giro d'Italia - Stage 11 - Forli > Imola | 2 | Carlos BETANCUR
2015 | Tour of Japan - Stage 4 - Minami > Shinshu | 1 | Benjamin PRADES
2015 | Tour of Japan - Stage 4 - Minami > Shinshu | 2 | Adam PHELAN
2015 | Tour of Japan - Stage 4 - Minami > Shinshu | 3 | Thomas LEBAS
(5 rows)
Points to consider before using secondary indexes
Secondary indexes can impact performance when multiple nodes are accessed because the index table is stored on each node in a cluster. Before creating a secondary index, see when not to create an index. Guidelines:-
In general, only add a secondary index a column with low cardinality, which means that the column contains a few unique values.
-
If you have a high cardinality column, consider creating a separate table to store the data in the column on which you want to create a secondary index.