Secondary index (2i) examples

2i indexing can be used to index and query both primary key columns and non-primary key columns of almost any type. Find the type of index you want to create and follow the examples to create and query the index.

Each example starts with a table creation, followed by index creation, so that you can see the table schema and the index schema. Then sample queries for using the index are presented in a tabbed codeblock, to make it easier to examine queries side-by-side.

A keyspace must be created before tables and indexes can be created. The following keyspace is used for all examples on this page:

CREATE KEYSPACE IF NOT EXISTS cycling
WITH REPLICATION = {
  'class' : 'SimpleStrategy',
  'replication_factor' : 1
};

Primary key columns

Make an index on primary key columns, either composite partition key columns or clustering columns.

Single primary key column

This index creation will fail, because the table has a single PRIMARY KEY column. Such indexes are not allowed.

Create a table called cycling.alt_stats with a primary key if it doesn’t already exist.

CREATE TABLE IF NOT EXISTS cycling.cyclist_alt_stats (
  id UUID PRIMARY KEY,
  lastname text,
  birthday date,
  nationality text,
  weight float,
  w_units text,
  height float,
  first_race date,
  last_race date
);

Create a 2i index on the id column.

CREATE INDEX id_idx ON cycling.cyclist_alt_stats (id);
Results
InvalidRequest: Error from server: code=2200 [Invalid query] message="Cannot create 
secondary index on the only partition key column id"

This index creation will fail, because the id column is the primary key, as well as the single partition key. Such indexes are not allowed.

Composite partition key columns in primary key

A composite partition key uses two or more columns to define the partition key in the PRIMARY KEY definition. To filter a query using a column that is not the full partition key, you must create a 2i index on the column that you wish to use for the query.

  1. Create a table called cycling.rank_by_year_and_name with a composite partition key and a clustering column if it doesn’t already exist.

    CREATE TABLE IF NOT EXISTS cycling.rank_by_year_and_name (
      race_year int,
      race_name text,
      cyclist_name text,
      rank int,
      PRIMARY KEY ((race_year, race_name), rank)
    );
  2. Create two 2i indexes on the race_year and race_name columns. These two columns are the composite partition key in the PRIMARY KEY definition.

    CREATE INDEX IF NOT EXISTS race_year_idx 
      ON cycling.rank_by_year_and_name (race_year);
    CREATE INDEX IF NOT EXISTS race_name_idx 
      ON cycling.rank_by_year_and_name (race_name);
  3. Verify the created indexes by checking the table schema with the DESCRIBE INDEX CQL command.

    DESCRIBE INDEX cycling.race_year_idx;
    DESCRIBE INDEX cycling.race_name_idx;
    Results
    CREATE INDEX race_year_idx ON cycling.rank_by_year_and_name (race_year);
    
    
    CREATE INDEX race_name_idx ON cycling.rank_by_year_and_name (race_name);

Now that you have created the table and indexes, try some SELECT queries.

  • The first example tab shows that a query with only one of the partition key columns fails if ALLOW FILTERING is not used.

  • The second example tab shows that a query with one of the partition key columns will succeed without ALLOW FILTERING.

  • The third example tab shows that a query with both of the partition key columns using AND will succeed without ALLOW FILTERING.

  • SELECT 1 - error

  • SELECT 1A - race_year

  • SELECT 2 - AND

Query the table using only the race_year`column. A similar query using only the `race_name column can also be performed.

SELECT * FROM cycling.rank_by_year_and_name
  WHERE race_year = 2014 
  AND race_name  = 'Tour of Japan - Stage 4 - Minami > Shinshu';
Results
InvalidRequest: Error from server: code=2200 [Invalid query] message="Cannot execute 
this query as it might involve data filtering and thus may have unpredictable 
performance. If you want to execute this query despite the performance 
unpredictability, use ALLOW FILTERING"

For 2i indexes, you must use the ALLOW FILTERING directive when querying a table using a column that is not the full partition key.

To fix the error in SELECT 1, query the table using only the race_year`column, but add `ALLOW FILTERING.

SELECT * FROM cycling.rank_by_year_and_name
  WHERE race_year = 2014 
  AND race_name  = 'Tour of Japan - Stage 4 - Minami > Shinshu';
Results
 race_year | race_name                                  | rank | cyclist_name
-----------+--------------------------------------------+------+----------------------
      2014 | Tour of Japan - Stage 4 - Minami > Shinshu |    1 |        Daniel MARTIN
      2014 | Tour of Japan - Stage 4 - Minami > Shinshu |    2 | Johan Esteban CHAVES
      2014 | Tour of Japan - Stage 4 - Minami > Shinshu |    3 |      Benjamin PRADES

(3 rows)

Adding ALLOW FILTERING allows the query to succeed. However, this query is dangerous, because it requires a full table scan. SAI indexing is a better option that avoids this problem.

You may want to query the table using both the race_year and race_name columns. This query filters the results using both columns, so ALLOW FILTERING is not required.

SELECT * FROM cycling.rank_by_year_and_name
  WHERE race_year = 2014 
  AND race_name  = 'Tour of Japan - Stage 4 - Minami > Shinshu';
Results
 race_year | race_name                                  | rank | cyclist_name
-----------+--------------------------------------------+------+----------------------
      2014 | Tour of Japan - Stage 4 - Minami > Shinshu |    1 |        Daniel MARTIN
      2014 | Tour of Japan - Stage 4 - Minami > Shinshu |    2 | Johan Esteban CHAVES
      2014 | Tour of Japan - Stage 4 - Minami > Shinshu |    3 |      Benjamin PRADES

(3 rows)

Clustering column in primary key

A clustering column sorts the rows within a partition. Clustering columns are part of the PRIMARY KEY definition. To filter a query using just a clustering column, you must create a 2i index on the column that you wish to use for the query.

If you wish to use a clustering column in a query, it is best to consider restricting a query to a single partition. Using a partition key column plus a clustering column will have better performance.

  1. Create a table called cycling.rank_by_year_and_name with a composite partition key and a clustering column if it doesn’t already exist.

    CREATE TABLE IF NOT EXISTS cycling.rank_by_year_and_name (
      race_year int,
      race_name text,
      cyclist_name text,
      rank int,
      PRIMARY KEY ((race_year, race_name), rank)
    );
  2. Create a 2i index on the rank clustering column.

    CREATE INDEX IF NOT EXISTS rank_idx 
      ON cycling.rank_by_year_and_name (rank);
  3. Verify the created indexes by checking the schema with the DESCRIBE INDEX CQL command.

    DESCRIBE INDEX cycling.rank_idx;
    Results
    CREATE INDEX rank_idx ON cycling.rank_by_year_and_name (rank);

Now that you have created the table and index, try some queries.

  • The first example tab shows a query with only the clustering column.

  • The second example tab shows that a query with the clustering column and the partition key column fails without ALLOW FILTERING.

  • The third example tab shows a query with the clustering column and the partition key column succeeding with ALLOW FILTERING.

  • SELECT 1 - rank

  • SELECT 2 - error

  • SELECT 2A - AND

Query the table using only the `rank`column.

SELECT * FROM cycling.rank_by_year_and_name WHERE rank = 3;
Results
 race_year | race_name                                  | rank | cyclist_name
-----------+--------------------------------------------+------+----------------------
      2014 |                        4th Tour of Beijing |    3 | Johan Esteban CHAVES
      2014 | Tour of Japan - Stage 4 - Minami > Shinshu |    3 |      Benjamin PRADES
      2015 | Tour of Japan - Stage 4 - Minami > Shinshu |    3 |         Thomas LEBAS

(3 rows)

Query the table using the race_year and `rank`columns.

SELECT * FROM cycling.rank_by_year_and_name
  WHERE race_year = 2014 AND rank = 3;
Results
InvalidRequest: Error from server: code=2200 [Invalid query] 
message="Cannot execute this query as it might involve data filtering and thus 
may have unpredictable performance. If you want to execute this query despite 
the performance unpredictability, use ALLOW FILTERING"

Adding ALLOW FILTERING allows the query to succeed. However, this query is dangerous, because it requires a full table scan. SAI indexing is a better option that avoids this problem.

Fix the query the table using the race_year and rank columns by adding ALLOW FILTERING.

SELECT * FROM cycling.rank_by_year_and_name
  WHERE race_year = 2014 AND rank = 3
ALLOW FILTERING;
Results
 race_year | race_name                                  | rank | cyclist_name
-----------+--------------------------------------------+------+----------------------
      2014 |                        4th Tour of Beijing |    3 | Johan Esteban CHAVES
      2014 | Tour of Japan - Stage 4 - Minami > Shinshu |    3 |      Benjamin PRADES

(2 rows)

Non-primary key columns

Indexes can be created on non-primary key columns of almost any type.

Text column

  1. Create a table called cycling.alt_stats with TEXT column called nationality if it doesn’t already exist.

    CREATE TABLE IF NOT EXISTS cycling.cyclist_alt_stats (
      id UUID PRIMARY KEY,
      lastname text,
      birthday date,
      nationality text,
      weight float,
      w_units text,
      height float,
      first_race date,
      last_race date
    );
  2. Create a 2i index on the nationality column, a non-primary key TEXT column.

    CREATE INDEX nationality_idx ON cycling.cyclist_alt_stats (nationality);
  3. Verify the created indexes by checking the index schema:

    CREATE INDEX nationality_idx ON cycling.cyclist_alt_stats (nationality);
    Results
    CREATE INDEX nationality_idx ON cycling.cyclist_alt_stats (nationality);

Now that you have created the table and index, query using a value for the column.

  1. Query the table using the nationality column.

    SELECT first_race, last_race, birthday FROM cycling.cyclist_alt_stats
      WHERE nationality = 'France';
    Results
     first_race | last_race  | birthday
    ------------+------------+------------
     2006-03-15 | 2017-04-16 | 1990-05-27
    
    (1 rows)

Date column

  1. Create a table called cycling.alt_stats with DATE column called birthday if it doesn’t already exist.

    CREATE TABLE IF NOT EXISTS cycling.cyclist_alt_stats (
      id UUID PRIMARY KEY,
      lastname text,
      birthday date,
      nationality text,
      weight float,
      w_units text,
      height float,
      first_race date,
      last_race date
    );
  2. Create a 2i index on the birthday column, a non-primary key TEXT column.

    CREATE INDEX birthday_idx ON cycling.cyclist_alt_stats (birthday);
  3. Verify the created indexes by checking the index schema:

    CREATE INDEX birthday_idx ON cycling.cyclist_alt_stats (birthday);
    Results
    CREATE INDEX birthday_idx ON cycling.cyclist_alt_stats (birthday);

Now that you have created the table and index, query using a value for the column.

Query the table using the birthday column.

+

SELECT lastname, birthday, nationality FROM cycling.cyclist_alt_stats
  WHERE birthday = '1981-03-29';

+ .Results

Details
 lastname  | birthday   | nationality
-----------+------------+-------------
 VEIKKANEN | 1981-03-29 |     Finland

(1 rows)

Now that you have created indexes on the columns birthday and nationality, there are a number of other queries that you can perform.

  • The first example tab, SELECT 1 - AND, shows that a query with both the birthday and nationality columns will fail if ALLOW FILTERING is not used.

  • The second example tab, SELECT 2 - OR, shows that a query with either the birthday or nationality columns will not succeed because it is not supported.

  • The third example tab, SELECT 3 - IN, shows how to query with a list of nationality values.

  • The fourth example tab, SELECT 4 - RANGE, shows how to query with an inequality range of birthday values.

  • SELECT 1 - AND

  • SELECT 2 - OR

  • SELECT 3 - IN

  • SELECT 4 - RANGE

Query the table using both a nationality and a birthday.

Query:

SELECT lastname, birthday, nationality FROM cycling.cyclist_alt_stats
  WHERE birthday = '1991-08-25' AND nationality = 'Ethiopia'
ALLOW FILTERING;
Results
 lastname | birthday   | nationality
----------+------------+-------------
    GRMAY | 1991-08-25 |    Ethiopia

(1 rows)

The indexes have been created on appropriate low-cardinality columns, but querying with ALLOW FILTERING still fails. Why?

The answer lies with the partition key, which has not been defined. When you attempt a potentially expensive query, such as searching a range of rows, the database requires the ALLOW FILTERING directive. The error is not due to multiple indexes, but the lack of a partition key definition in the query.

Adding ALLOW FILTERING allows the query to succeed. However, this query is dangerous, because it requires a full table scan. SAI indexing is a better option that avoids this problem.

Query the table using a nationality or a birthday.

Query:

  SELECT lastname, birthday, nationality FROM cycling.cyclist_alt_stats
    WHERE birthday = '1991-08-25' OR nationality = 'Ethiopia'
  ALLOW FILTERING;

OR is not an allowed operator for 2i indexes.

Query the table using nationality with IN.

SELECT * FROM cycling.cyclist_alt_stats 
  WHERE nationality IN ('Finland', 'France')
ALLOW FILTERING;
Results
 id                                   | birthday   | first_race | height | last_race  | lastname  | nationality | w_units | weight
--------------------------------------+------------+------------+--------+------------+-----------+-------------+---------+--------
 c09e9451-50da-483d-8108-e6bea2e827b3 | 1981-03-29 | 1996-05-21 |   1.78 | 2012-10-02 | VEIKKANEN |     Finland |      kg |     66
 4ceb495c-55ab-4f71-83b9-81117252bb13 | 1990-05-27 | 2006-03-15 |   null | 2017-04-16 |     DUVAL |      France |    null |   null

(2 rows)

Query the table using a birthday range.

Query:

SELECT lastname, birthday, nationality FROM cycling.cyclist_alt_stats
  WHERE birthday >'1981-03-29'
ALLOW FILTERING;
Results
 lastname  | birthday   | nationality
-----------+------------+-------------
     BRUTT | 1982-01-29 |      Russia
  ISAYCHEV | 1986-04-21 |      Russia
 TSATEVICH | 1989-07-05 |      Russia
  PAULINHO | 1990-05-27 |    Portugal
     GRMAY | 1991-08-25 |    Ethiopia
    BELKOV | 1985-01-09 |      Russia
     DUVAL | 1990-05-27 |      France

(7 rows)

Inequalities are allowed for 2i indexes, but only with ALLOW FILTERING. Again, the difficulties of using 2is is illustrated. SAI is almost always a better option.

Integer column

An integer column can be indexed similarly to a text column.

  1. Create a table called cycling.rank_by_year_and_name with INT column called rank if it doesn’t already exist.

    CREATE TABLE IF NOT EXISTS cycling.rank_by_year_and_name (
      race_year int,
      race_name text,
      cyclist_name text,
      rank int,
      PRIMARY KEY ((race_year, race_name), rank)
    );
  2. Create a 2i index on the rank column, a non-primary key INT column.

    CREATE INDEX IF NOT EXISTS rank_idx 
      ON cycling.rank_by_year_and_name (rank);
  3. Verify the created indexes by checking the index schema.

    DESCRIBE INDEX cycling.rank_idx;
    Results
    CREATE INDEX rank_idx ON cycling.rank_by_year_and_name (rank);

Now that you have created the table and index, query using a value for the column.

  • SELECT 1 - equality

  • SELECT 2 - inequality range

Query the table using the rank column in an equality.

SELECT * FROM cycling.rank_by_year_and_name WHERE rank = 3;
Results
 race_year | race_name                                  | rank | cyclist_name
-----------+--------------------------------------------+------+----------------------
      2014 |                        4th Tour of Beijing |    3 | Johan Esteban CHAVES
      2014 | Tour of Japan - Stage 4 - Minami > Shinshu |    3 |      Benjamin PRADES
      2015 | Tour of Japan - Stage 4 - Minami > Shinshu |    3 |         Thomas LEBAS

(3 rows)

Query the table using the rank column in an inequality range.

SELECT * FROM cycling.rank_by_year_and_name WHERE rank < 3;
Results

Note that this will fail without ALLOW FILTERING with a 2i index.

Float column

Float columns can be indexed and searched.

  1. Create a table called cycling.cyclist_alt_stats with FLOAT column called weight if it doesn’t already exist.

    CREATE TABLE IF NOT EXISTS cycling.cyclist_alt_stats (
      id UUID PRIMARY KEY,
      lastname text,
      birthday date,
      nationality text,
      weight float,
      w_units text,
      height float,
      first_race date,
      last_race date
    );
  2. Create a 2i index on the weight column, a non-primary key FLOAT column.

    CREATE INDEX weight_idx ON cycling.cyclist_alt_stats (weight);
  3. Verify the created indexes by checking the table schema.

    Results
    CREATE INDEX weight_idx ON cycling.cyclist_alt_stats (weight);

Now that you have created the table and index, query using a value for the column.

  • The first example tab shows that a query with the weight column can be performed using an equality.

  • The second example tab shows that a query with the weight column can be performed using an inequality range.

  • SELECT 1 - equality

  • SELECT 2 - inequality range

Query the table using the weight column in an equality.

Query:

SELECT lastname, nationality, weight FROM cycling.cyclist_alt_stats 
 WHERE weight = 70 ALLOW FILTERING;
Results
 lastname | nationality | weight
----------+-------------+--------

(0 rows)

Query the table using the weight column in an inequality range.

Query:

SELECT lastname, nationality, weight FROM cycling.cyclist_alt_stats 
 WHERE weight > 70 ALLOW FILTERING;
Results
 lastname | nationality | weight
----------+-------------+--------
 ISAYCHEV |      Russia |     80
   BELKOV |      Russia |     71

(2 rows)

Timestamp column

Timestamp columns can be indexed and searched.

  1. Create a table called cycling.comments with TIMESTAMP column called created_at if it doesn’t already exist.

    CREATE TABLE IF NOT EXISTS cycling.comments (
      record_id timeuuid,
      id uuid,
      commenter text,
      comment text,
      created_at timestamp,
      PRIMARY KEY (id, created_at)
    )
    WITH CLUSTERING ORDER BY (created_at DESC);
  2. Create a 2i index on the comments column, a non-primary key TIMESTAMP column.

    CREATE INDEX IF NOT EXISTS created_at_idx ON cycling.comments (created_at);
  3. Verify the created indexes by checking the index schema.

    DESCRIBE INDEX cycling.created_at_idx;
    Results
    CREATE INDEX created_at_idx ON cycling.comments (created_at);

Now that you have created the table and index, query using a value for the column.

  • The first example tab shows that a query with the created_at column can be performed using an equality.

  • The second example tab shows that a query with the created_at column can be performed using an inequality range.

  • SELECT 1 - equality

  • SELECT 2 - inequality range

Query the table using the created_at column in an equality.

SELECT commenter, comment, created_at FROM cycling.comments
  WHERE created_at = '2024-06-07 03:13:40.268';
Results
 commenter | comment              | created_at
-----------+----------------------+---------------------------------
       Amy | Ride in any weather! | 2024-06-07 03:13:40.268000+0000

(1 rows)

Query the table using the created_at column in an inequality range.

SELECT commenter, comment, created_at FROM cycling.comments
  WHERE created_at < '2024-06-07 03:13:40.268'
Results
Intenational failure to show that 2i does not support range queries

For 2i, note that the ALLOW FILTERING directive is required for inequality queries or the query will fail. Try it out yourself!

Tuple column

Tuples columns can be indexed and searched, but only the full tuple can be indexed.

  1. Create a table called cycling.nation_rank with TUPLE column called info if it doesn’t already exist.

    CREATE TABLE IF NOT EXISTS cycling.nation_rank (
      nation text PRIMARY KEY,
      info tuple<int, text, int>
    );
  2. Create a 2i index on the info column, a non-primary key TUPLE column.

    CREATE INDEX info_idx ON cycling.nation_rank (info);
  3. Verify the created indexes by checking the table schema.

    DESCRIBE INDEX cycling.info_idx;
    Results
    CREATE TABLE cycling.nation_rank (
        nation text PRIMARY KEY,
        info frozen<tuple<int, text, int>>
    ) WITH additional_write_policy = '99PERCENTILE'
        AND bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
        AND comment = ''
        AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
        AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
        AND crc_check_chance = 1.0
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND min_index_interval = 128
        AND nodesync = {'enabled': 'true', 'incremental': 'true'}
        AND read_repair = 'BLOCKING'
        AND speculative_retry = '99PERCENTILE';
    
    
    CREATE INDEX info_idx ON cycling.nation_rank (info);

Now that you have created the table and index, query using a value for the column.

Query the table using the info column in an equality. Tuples can only be filtered using the full tuple value.

SELECT * FROM cycling.nation_rank
  WHERE info = (3, 'Phillippe GILBERT', 6222);
Results
 nation  | info
---------+--------------------------------
 Belgium | (3, 'Phillippe GILBERT', 6222)

(1 rows)

Non-primary key collection column

Collections can be indexed and queried to find a collection containing a particular value. Sets and lists are indexed a bit differently from maps, given the key-value nature of maps.

Sets and lists can index all values found by indexing the collection column. Maps can index a map key, map value, or map entry using the methods shown below. Multiple indexes can be created on the same map column in a table so that map keys, values, or entries can be queried. In addition, frozen collections can be indexed using FULL to index the full content of a frozen collection.

All the cautions about using secondary indexes apply to indexing collections.

Set

Create an index on a set to find all the cyclists that have been on a particular team.

  1. Create a table called cycling.cyclist-career-teams if it doesn’t already exist.

    CREATE TABLE IF NOT EXISTS cycling.cyclist_career_teams (
      id UUID PRIMARY KEY,
      lastname text,
      teams set<text>
    );
  2. Create a 2i index on the teams column.

    CREATE INDEX IF NOT EXISTS teams_idx ON cycling.cyclist_career_teams (teams);
  3. Verify the created indexes by checking the index schema.

    DESCRIBE INDEX cycling.teams_idx;
    Results
    CREATE INDEX teams_idx ON cycling.cyclist_career_teams (values(teams));

Now that you have created the table and index, query using a value for the column.

Query the table using the teams column with a CONTAINS phrase to get the rows where a set contains a particular value.

SELECT lastname, teams FROM cycling.cyclist_career_teams 
  WHERE teams CONTAINS 'Rabobank-Liv Giant';
Results
 lastname | teams
----------+------------------------------------------------------------------------------------------------------
      VOS | {'Nederland bloeit', 'Rabobank Women Team', 'Rabobank-Liv Giant', 'Rabobank-Liv Woman Cycling Team'}
    BRAND |   {'AA Drink - Leontien.nl', 'Leontien.nl', 'Rabobank-Liv Giant', 'Rabobank-Liv Woman Cycling Team'}

(2 rows)

List

Create an index on a set to find all the cyclists that have a particular sponsor.

  1. Create a table called cycling.race_sponsors if it doesn’t already exist.

    CREATE TABLE IF NOT EXISTS cycling.race_sponsors (
      race_year int,
      race_name text,
      sponsorship list<text>,
      PRIMARY KEY (race_year, race_name)
    );
  2. Create a 2i index on the sponsorship list column.

    CREATE INDEX sponsorship_idx ON cycling.race_sponsors (sponsorship);
  3. Verify the created index by checking the index schema.

    DESCRIBE INDEX cycling.sponsorship_idx;
    Results
    CREATE INDEX sponsorship_idx ON cycling.race_sponsors (values(sponsorship));

Now that you have created the table and index, query using a value for the column.

Query the table using the sponsorship column with a CONTAINS phrase to get the rows where a set contains a particular value.

SELECT * FROM cycling.race_sponsors WHERE sponsorship CONTAINS 'Carrefour';
Results
 race_year | race_name      | sponsorship
-----------+----------------+-------------------------------------------------
      2018 | Tour de France | ['LCL', 'Carrefour', 'Skoda', 'Vittel', 'Krys']

(1 rows)

For map collections, create an index on the map key, map value, or map entry.

Map - keys

Create an index on a map key to find all cyclist/team combinations for a particular year.

  1. Create a table called cycling.cyclist_teams if it doesn’t already exist.

    CREATE TABLE IF NOT EXISTS cycling.cyclist_teams (
      id uuid PRIMARY KEY,
      firstname text,
      lastname text,
      teams map<int, text>
    );
  2. Create a 2i index on the teams map column.

    CREATE INDEX IF NOT EXISTS team_keys_idx
      ON cycling.cyclist_teams ( KEYS (teams) );
  3. Verify the created 2i index by checking the index schema.

    DESCRIBE index cycling.team_keys_idx;
    Results
    CREATE INDEX team_keys_idx ON cycling.cyclist_teams (keys(teams));

Now that you have created the table and index, query using a key for the map column.

Query the table using the teams column with a CONTAINS KEY phrase to get the rows where a map contains a particular key.

SELECT firstname,lastname,teams FROM cycling.cyclist_teams 
  WHERE teams CONTAINS KEY 2015;
Results
 firstname | lastname | teams
-----------+----------+------------------------------------------------------------------------------------
  Marianne |      VOS | {2014: 'Rabobank-Liv Woman Cycling Team', 2015: 'Rabobank-Liv Woman Cycling Team'}

(1 rows)

Note that not all map columns are designed to make search via the map key useful. For example, look at this next query.

  1. Create a table called cycling.birthday if it doesn’t already exist.

    CREATE TABLE IF NOT EXISTS cycling.birthday_list (
      cyclist_name text PRIMARY KEY, 
      blist map<text, text>
    );
  2. Create a SAI index on the blist map column.

    CREATE INDEX IF NOT EXISTS blist_keys_idx 
      ON cycling.birthday_list ( KEYS(blist) );

Verify the created index by checking the index schema.

+

DESCRIBE INDEX cycling.blist_keys_idx;

+ .Results

Details
CREATE INDEX blist_keys_idx ON cycling.birthday_list (keys(blist));

Now that you have created the table and index, query using a key for the map column.

Query the table using the blist column with a CONTAINS KEY phrase to get the rows where a map contains a particular key.

SELECT * FROM cycling.birthday_list WHERE blist CONTAINS KEY 'age';
Results
 cyclist_name     | blist
------------------+--------------------------------------------------------------
   Claudio HEINEN |     {'age': '23', 'bday': '27/07/1992', 'nation': 'GERMANY'}
 Claudio VANDELLI |       {'age': '54', 'bday': '27/07/1961', 'nation': 'ITALY'}
    Luc HAGENAARS | {'age': '28', 'bday': '27/07/1987', 'nation': 'NETHERLANDS'}
      Toine POELS | {'age': '52', 'bday': '27/07/1963', 'nation': 'NETHERLANDS'}
      Allan DAVIS |   {'age': '35', 'bday': '27/07/1980', 'nation': 'AUSTRALIA'}
 Laurence BOURQUE |      {'age': '23', 'bday': '27/07/1992', 'nation': 'CANADA'}

(6 rows)

In this case, the map key is not a useful index because the map key is not a unique identifier, but the same value for all rows.

Map - values

Create an index on the map values and find cyclists who have a particular value found in the specified map.

  1. Create a table called cycling.cyclist_teams if it doesn’t already exist.

    CREATE TABLE IF NOT EXISTS cycling.cyclist_teams (
      id uuid PRIMARY KEY,
      firstname text,
      lastname text,
      teams map<int, text>
    );
  2. Create a 2i index on the teams map column.

    CREATE INDEX IF NOT EXISTS team_values_idx
      ON cycling.cyclist_teams ( VALUES (teams) );
  3. Verify the created index by checking the index schema.

    DESCRIBE index cycling.team_values_idx;
    Results
    CREATE INDEX team_values_idx ON cycling.cyclist_teams (values(teams));

Now that you have created the table and index, query using a value for the map column.

SELECT firstname,lastname,teams FROM cycling.cyclist_teams
WHERE teams CONTAINS 'Team Garmin - Cervelo';
Results
 firstname | lastname | teams
-----------+----------+-------

(0 rows)

Because the values of a map column are commonly unique, querying on a map value is more generally useful.

  1. Create a table called cycling.birthday if it doesn’t already exist.

    CREATE TABLE IF NOT EXISTS cycling.birthday_list (
      cyclist_name text PRIMARY KEY, 
      blist map<text, text>
    );
  2. Create a 2i index on the blist map column.

    CREATE INDEX IF NOT EXISTS blist_values_idx 
      ON cycling.birthday_list ( VALUES(blist) );
  3. Verify the created index by checking the table schema.

    DESCRIBE INDEX cycling.blist_values_idx;
    Results
    CREATE INDEX blist_values_idx ON cycling.birthday_list (values(blist));

Now that you have created the table and index, query using a value for the map column.

SELECT * FROM cycling.birthday_list WHERE blist CONTAINS 'NETHERLANDS';
Results
 cyclist_name  | blist
---------------+--------------------------------------------------------------
 Luc HAGENAARS | {'age': '28', 'bday': '27/07/1987', 'nation': 'NETHERLANDS'}
   Toine POELS | {'age': '52', 'bday': '27/07/1963', 'nation': 'NETHERLANDS'}

(2 rows)

Map - entries

Create an index on the map entries and find cyclists who are the same age. An index using ENTRIES is only valid for maps.

  1. Create a table called cycling.birthday if it doesn’t already exist.

    CREATE TABLE IF NOT EXISTS cycling.birthday_list (
      cyclist_name text PRIMARY KEY, 
      blist map<text, text>
    );
  2. Create a 2i index on the blist map column.

    CREATE INDEX IF NOT EXISTS team_entries_idx
      ON cycling.cyclist_teams ( ENTRIES (teams) );
  3. Verify the created index by checking the index schema.

    DESCRIBE index cycling.team_entries_idx;
    Results
    CREATE INDEX team_entries_idx ON cycling.cyclist_teams (entries(teams));

Now that you have created the table and index, query using an entry for the map column.

  • SELECT 1

  • SELECT 2

  • DATE

  • AND

Query the table using the age column with a CONTAINS ENTRY phrase to get the rows where a map contains a particular entry.

SELECT * FROM cycling.birthday_list WHERE blist[ 'age' ] = '23';
Results
 cyclist_name     | blist
------------------+----------------------------------------------------------
   Claudio HEINEN | {'age': '23', 'bday': '27/07/1992', 'nation': 'GERMANY'}
 Laurence BOURQUE |  {'age': '23', 'bday': '27/07/1992', 'nation': 'CANADA'}

(2 rows)

Using the same index, find cyclists from the same country.

SELECT * FROM cycling.birthday_list WHERE blist[ 'nation' ] = 'NETHERLANDS';
Results
 cyclist_name  | blist
---------------+--------------------------------------------------------------
 Luc HAGENAARS | {'age': '28', 'bday': '27/07/1987', 'nation': 'NETHERLANDS'}
   Toine POELS | {'age': '52', 'bday': '27/07/1963', 'nation': 'NETHERLANDS'}

(2 rows)

Any number of different queries can be made using the ENTRIES index, such as selecting a matching entry with a DATE data type.

SELECT * FROM cycling.birthday_list WHERE blist [ 'bday' ] = '27/07/1992'
ALLOW FILTERING;
Results
 cyclist_name     | blist
------------------+----------------------------------------------------------
   Claudio HEINEN | {'age': '23', 'bday': '27/07/1992', 'nation': 'GERMANY'}
 Laurence BOURQUE |  {'age': '23', 'bday': '27/07/1992', 'nation': 'CANADA'}

(2 rows)

Two different matches can be made using AND and the ENTRIES index.

SELECT * FROM cycling.birthday_list WHERE blist[ 'nation' ] = 'CANADA' 
  AND blist[ 'age' ] = '23' ALLOW FILTERING;
Results
// tag::table-drop[]
// end::table-drop[]

// tag::table-create[]
// end::table-create[]

// tag::table-describe-base[]

CREATE TABLE cycling.birthday_list (
    cyclist_name text PRIMARY KEY,
    blist map<text, text>
) WITH additional_write_policy = '99PERCENTILE'
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND nodesync = {'enabled': 'true', 'incremental': 'true'}
    AND read_repair = 'BLOCKING'
    AND speculative_retry = '99PERCENTILE';

// end::table-describe-base[]

// tag::index-drop-blist-keys-idx[]
// end::index-drop-blist-keys-idx[]

// tag::index-drop-blist-values-idx[]
// end::index-drop-blist-values-idx[]

// tag::index-drop-blist-entries-idx[]
// end::index-drop-blist-entries-idx[]

// tag::index-create-blist-keys-idx-v1[]
// end::index-create-blist-keys-idx-v1[]

// tag::index-describe-blist-keys-idx-v1[]

CREATE INDEX blist_keys_idx ON cycling.birthday_list (keys(blist));

// end::index-describe-blist-keys-idx-v1[]

// tag::index-create-blist-values-idx-v1[]
// end::index-create-blist-values-idx-v1[]

// tag::index-describe-blist-values-idx-v1[]

CREATE INDEX blist_values_idx ON cycling.birthday_list (values(blist));

// end::index-describe-blist-values-idx-v1[]

// tag::index-create-blist-entries-idx-v1[]
// end::index-create-blist-entries-idx-v1[]

// tag::index-describe-blist-entries-idx-v1[]

CREATE INDEX blist_entries_idx ON cycling.birthday_list (entries(blist));

// end::index-describe-blist-entries-idx-v1[]

// tag::data-insert-one[]
// end::data-insert-one[]

// tag::data-insert-all[]
// end::data-insert-all[]

// tag::select-age-equals-23-2i[]

 cyclist_name     | blist
------------------+----------------------------------------------------------
   Claudio HEINEN | {'age': '23', 'bday': '27/07/1992', 'nation': 'GERMANY'}
 Laurence BOURQUE |  {'age': '23', 'bday': '27/07/1992', 'nation': 'CANADA'}

(2 rows)
// end::select-age-equals-23-2i[]

// tag::select-birthday-equals-27071992-2i[]

 cyclist_name     | blist
------------------+----------------------------------------------------------
   Claudio HEINEN | {'age': '23', 'bday': '27/07/1992', 'nation': 'GERMANY'}
 Laurence BOURQUE |  {'age': '23', 'bday': '27/07/1992', 'nation': 'CANADA'}

(2 rows)
// end::select-birthday-equals-27071992-2i[]

// tag::select-nation-equals-NETHERLANDS-2i[]

 cyclist_name  | blist
---------------+--------------------------------------------------------------
 Luc HAGENAARS | {'age': '28', 'bday': '27/07/1987', 'nation': 'NETHERLANDS'}
   Toine POELS | {'age': '52', 'bday': '27/07/1963', 'nation': 'NETHERLANDS'}

(2 rows)
// end::select-nation-equals-NETHERLANDS-2i[]

// tag::select-nation-equals-CANADA-and-age-equals-23-2i[]

 cyclist_name     | blist
------------------+---------------------------------------------------------
 Laurence BOURQUE | {'age': '23', 'bday': '27/07/1992', 'nation': 'CANADA'}

(1 rows)
// end::select-nation-equals-CANADA-and-age-equals-23-2i[]

// tag::select-key-is-age-2i[]

 cyclist_name     | blist
------------------+--------------------------------------------------------------
   Claudio HEINEN |     {'age': '23', 'bday': '27/07/1992', 'nation': 'GERMANY'}
 Claudio VANDELLI |       {'age': '54', 'bday': '27/07/1961', 'nation': 'ITALY'}
    Luc HAGENAARS | {'age': '28', 'bday': '27/07/1987', 'nation': 'NETHERLANDS'}
      Toine POELS | {'age': '52', 'bday': '27/07/1963', 'nation': 'NETHERLANDS'}
      Allan DAVIS |   {'age': '35', 'bday': '27/07/1980', 'nation': 'AUSTRALIA'}
 Laurence BOURQUE |      {'age': '23', 'bday': '27/07/1992', 'nation': 'CANADA'}

(6 rows)
// end::select-key-is-age-2i[]

// tag::select-nation-contains-NETHERLANDS-2i[]

 cyclist_name  | blist
---------------+--------------------------------------------------------------
 Luc HAGENAARS | {'age': '28', 'bday': '27/07/1987', 'nation': 'NETHERLANDS'}
   Toine POELS | {'age': '52', 'bday': '27/07/1963', 'nation': 'NETHERLANDS'}

(2 rows)
// end::select-nation-contains-NETHERLANDS-2i[]

// tag::index-create-blist-keys-idx-v3[]
// end::index-create-blist-keys-idx-v3[]

// tag::index-describe-blist-keys-idx-v3[]

CREATE CUSTOM INDEX blist_keys_idx ON cycling.birthday_list (keys(blist)) USING 'StorageAttachedIndex';

// end::index-describe-blist-keys-idx-v3[]

// tag::index-create-blist-values-idx-v3[]
// end::index-create-blist-values-idx-v3[]

// tag::index-describe-blist-values-idx-v3[]

CREATE CUSTOM INDEX blist_values_idx ON cycling.birthday_list (values(blist)) USING 'StorageAttachedIndex';

// end::index-describe-blist-values-idx-v3[]

// tag::index-create-blist-entries-idx-v3[]
// end::index-create-blist-entries-idx-v3[]

// tag::index-describe-blist-entries-idx-v3[]

CREATE CUSTOM INDEX blist_entries_idx ON cycling.birthday_list (entries(blist)) USING 'StorageAttachedIndex';

// end::index-describe-blist-entries-idx-v3[]

// tag::data-insert-one[]
// end::data-insert-one[]

// tag::data-insert-all[]
// end::data-insert-all[]
// tag::select-all[]

 cyclist_name     | blist
------------------+--------------------------------------------------------------
   Claudio HEINEN |     {'age': '23', 'bday': '27/07/1992', 'nation': 'GERMANY'}
 Claudio VANDELLI |       {'age': '54', 'bday': '27/07/1961', 'nation': 'ITALY'}
    Luc HAGENAARS | {'age': '28', 'bday': '27/07/1987', 'nation': 'NETHERLANDS'}
      Toine POELS | {'age': '52', 'bday': '27/07/1963', 'nation': 'NETHERLANDS'}
      Allan DAVIS |   {'age': '35', 'bday': '27/07/1980', 'nation': 'AUSTRALIA'}
 Laurence BOURQUE |      {'age': '23', 'bday': '27/07/1992', 'nation': 'CANADA'}

(6 rows)
// end::select-all[]

// tag::select-age-equals-23[]

 cyclist_name     | blist
------------------+----------------------------------------------------------
   Claudio HEINEN | {'age': '23', 'bday': '27/07/1992', 'nation': 'GERMANY'}
 Laurence BOURQUE |  {'age': '23', 'bday': '27/07/1992', 'nation': 'CANADA'}

(2 rows)
// end::select-age-equals-23[]

// tag::select-birthday-equals-27071992[]

 cyclist_name     | blist
------------------+----------------------------------------------------------
   Claudio HEINEN | {'age': '23', 'bday': '27/07/1992', 'nation': 'GERMANY'}
 Laurence BOURQUE |  {'age': '23', 'bday': '27/07/1992', 'nation': 'CANADA'}

(2 rows)
// end::select-birthday-equals-27071992[]

// tag::select-nation-equals-NETHERLANDS[]

 cyclist_name  | blist
---------------+--------------------------------------------------------------
 Luc HAGENAARS | {'age': '28', 'bday': '27/07/1987', 'nation': 'NETHERLANDS'}
   Toine POELS | {'age': '52', 'bday': '27/07/1963', 'nation': 'NETHERLANDS'}

(2 rows)
// end::select-nation-equals-NETHERLANDS[]

// tag::select-nation-equals-CANADA-and-age-equals-23[]

 cyclist_name     | blist
------------------+---------------------------------------------------------
 Laurence BOURQUE | {'age': '23', 'bday': '27/07/1992', 'nation': 'CANADA'}

(1 rows)
// end::select-nation-equals-CANADA-and-age-equals-23[]

// tag::select-key-is-age[]
 cyclist_name     | blist
------------------+--------------------------------------------------------------
   Claudio HEINEN |     {'age': '23', 'bday': '27/07/1992', 'nation': 'GERMANY'}
 Claudio VANDELLI |       {'age': '54', 'bday': '27/07/1961', 'nation': 'ITALY'}
    Luc HAGENAARS | {'age': '28', 'bday': '27/07/1987', 'nation': 'NETHERLANDS'}
      Toine POELS | {'age': '52', 'bday': '27/07/1963', 'nation': 'NETHERLANDS'}
      Allan DAVIS |   {'age': '35', 'bday': '27/07/1980', 'nation': 'AUSTRALIA'}
 Laurence BOURQUE |      {'age': '23', 'bday': '27/07/1992', 'nation': 'CANADA'}

(6 rows)
// end::select-key-is-age[]

// tag::select-nation-contains-NETHERLANDS[]
 cyclist_name  | blist
---------------+--------------------------------------------------------------
 Luc HAGENAARS | {'age': '28', 'bday': '27/07/1987', 'nation': 'NETHERLANDS'}
   Toine POELS | {'age': '52', 'bday': '27/07/1963', 'nation': 'NETHERLANDS'}

(2 rows)
// end::select-nation-contains-NETHERLANDS[]

// tag::select-json[]
 [json]
----------------------------------------------------------------------------------------------------------
    {"cyclist_name": "Claudio HEINEN", "blist": {"age": "23", "bday": "27/07/1992", "nation": "GERMANY"}}
    {"cyclist_name": "Claudio VANDELLI", "blist": {"age": "54", "bday": "27/07/1961", "nation": "ITALY"}}
 {"cyclist_name": "Luc HAGENAARS", "blist": {"age": "28", "bday": "27/07/1987", "nation": "NETHERLANDS"}}
   {"cyclist_name": "Toine POELS", "blist": {"age": "52", "bday": "27/07/1963", "nation": "NETHERLANDS"}}
     {"cyclist_name": "Allan DAVIS", "blist": {"age": "35", "bday": "27/07/1980", "nation": "AUSTRALIA"}}
   {"cyclist_name": "Laurence BOURQUE", "blist": {"age": "23", "bday": "27/07/1992", "nation": "CANADA"}}

(6 rows)
// end::select-json[]

The queries using a map entry where the key is a unique identifier do look slightly different. Study the example below and compare it to the previous example.

  1. Create a table called cycling.cyclist-teams if it doesn’t already exist.

    CREATE TABLE IF NOT EXISTS cycling.cyclist_teams (
      id uuid PRIMARY KEY,
      firstname text,
      lastname text,
      teams map<int, text>
    );
  2. Create a 2i index on the teams map column.

    CREATE INDEX IF NOT EXISTS team_entries_idx
      ON cycling.cyclist_teams ( ENTRIES (teams) );
  3. Verify the created index by checking the index schema.

    DESCRIBE index cycling.team_entries_idx;
    Results
    CREATE INDEX team_entries_idx ON cycling.cyclist_teams (entries(teams));

Now query the table.

  • SELECT 1

  • AND

Query the table using both a key and a value to get the rows where a map contains a particular entry.

SELECT firstname,lastname,teams FROM cycling.cyclist_teams
  WHERE teams[2014] = 'Boels:Dolmans Cycling Team' ALLOW FILTERING;
Results
 firstname | lastname | teams
-----------+----------+-------

(0 rows)

Two different matches can be made using AND and the ENTRIES index.

SELECT firstname,lastname,teams FROM cycling.cyclist_teams
  WHERE teams[2014] = 'Boels:Dolmans Cycling Team'
    AND teams[2015] = 'Boels:Dolmans Cycling Team'
ALLOW FILTERING;
Results
 firstname | lastname | teams
-----------+----------+-------

(0 rows)

Frozen collection

Create an index on the full content of a FROZEN list. The table in this example stores the number of Pro wins, Grand Tour races, and Classic races that a cyclist has competed in. The SELECT statement finds any cyclist who has 39 Pro race wins, 7 Grand Tour starts, and 14 Classic starts.

The FROZEN keyword is required to index the full content of a frozen collection and applies to any collection type: set, list, or map.

  1. Create a table called race_starts if it does not yet exist.

    CREATE TABLE IF NOT EXISTS cycling.race_starts (
      cyclist_name text PRIMARY KEY,
      rnumbers FROZEN<LIST<int>>
    );
  2. Create a 2i index on the rnumbers frozen list column.

CREATE INDEX IF NOT EXISTS rnumbers_idx ON cycling.race_starts ( FULL(rnumbers) );
  1. Verify the created index by checking the index schema.

    DESCRIBE INDEX cycling.rnumbers_idx;
    Results
    CREATE INDEX rnumbers_idx ON cycling.race_starts (full(rnumbers));

Now that you have created the table and index, query using a value for the column.

Query the table using the rnumbers column with a CONTAINS phrase to get the rows where a frozen list contains a particular value:

SELECT * FROM cycling.race_starts WHERE rnumbers = [39, 7, 14];
Results
 cyclist_name   | rnumbers
----------------+-------------
 John DEGENKOLB | [39, 7, 14]

(1 rows)

Was this helpful?

Give Feedback

How can we improve the documentation?

© 2024 DataStax | Privacy policy | Terms of use

Apache, Apache Cassandra, Cassandra, Apache Tomcat, Tomcat, Apache Lucene, Apache Solr, Apache Hadoop, Hadoop, Apache Pulsar, Pulsar, Apache Spark, Spark, Apache TinkerPop, TinkerPop, Apache Kafka and Kafka are either registered trademarks or trademarks of the Apache Software Foundation or its subsidiaries in Canada, the United States and/or other countries. Kubernetes is the registered trademark of the Linux Foundation.

General Inquiries: +1 (650) 389-6000, info@datastax.com