SAI indexing with collections
SAI supports collections of type map
, list
, and set
.
Collections allow you to group and store data together in a column.
Using these features with SAI indexes was introduced in DSE 6.8.3. Also, starting in DSE 6.8.5, you have the option of creating multiple SAI collection indexes with different map types on the same column.
In a relational database, a grouping such as a user’s multiple email addresses is achieved via many-to-one joined relationship between (for example) a user
table and an email
table.
DSE avoids joins between two tables by storing the user’s email addresses in a collection column in the user
table.
Each collection specifies the data type of the data held.
A collection is appropriate if the data for collection storage is limited. If the data has unbounded growth potential, like messages sent or sensor events registered every second, do not use collections. Instead, use a table with a compound primary key where data is stored in the clustering columns.
In CQL queries of database tables with SAI indexes, the CONTAINS
clauses are supported with, and specific to:
-
SAI collection maps with
keys
,values
, andentries
-
SAI collections with
list
andset
types
Using the set type
This example uses the following table and index:
CREATE CUSTOM INDEX IF NOT EXISTS teams_idx ON cycling.cyclist_career_teams (teams) USING 'StorageAttachedIndex';
An index is created on the set column teams
in the cyclist_career_teams
table.
Query with CONTAINS
from the set teams
column:
SELECT lastname, teams FROM cycling.cyclist_career_teams
WHERE teams CONTAINS 'Rabobank-Liv Giant';
Results
lastname | teams
----------+----------------------------------------------------------------------------------------------------
BRAND | {'AA Drink - Leontien.nl', 'Leontien.nl', 'Rabobank-Liv Giant', 'Rabobank-Liv Woman Cycling Team'}
(1 rows)
lastname | teams
----------+----------------------------------------------------------------------------------------------------
BRAND | {'AA Drink - Leontien.nl', 'Leontien.nl', 'Rabobank-Liv Giant', 'Rabobank-Liv Woman Cycling Team'}
(1 rows)
Using the list type
This example uses the following table and index:
CREATE TABLE IF NOT EXISTS cycling.upcoming_calendar (
year int,
month int,
events list<text>,
PRIMARY KEY (year, month)
);
CREATE CUSTOM INDEX events_idx ON cycling.upcoming_calendar (events)
USING 'StorageAttachedIndex';
An index is created on the list column events
in the upcoming_calendar
table.
Query with CONTAINS
from the list events
column:
SELECT * FROM cycling.upcoming_calendar
WHERE events CONTAINS 'Criterium du Dauphine';
Results
year | month | events
------+-------+----------------------------------------------------------------------------
2015 | 6 | ['Tour de France', 'Criterium du Dauphine', 'Vuelta Ciclista a Venezuela']
(1 rows)
year | month | events
------+-------+----------------------------------------------------------------------------
2015 | 6 | ['Tour de France', 'Criterium du Dauphine', 'Vuelta Ciclista a Venezuela']
(1 rows)
year | month | events
------+-------+----------------------------------------------------------------------------
2015 | 6 | ['Tour de France', 'Criterium du Dauphine', 'Vuelta Ciclista a Venezuela']
(1 rows)
A slightly more complex query selects rows that either contain a particular event or have a particular month date:
SELECT * FROM cycling.upcoming_calendar
WHERE events CONTAINS 'Criterium du Dauphine'
OR month = 7;
Results
Using the map type
This example uses the following table and indexes:
CREATE TABLE IF NOT EXISTS cycling.cyclist_teams (
id uuid PRIMARY KEY,
firstname text,
lastname text,
teams map<int, text>
);
CREATE CUSTOM INDEX IF NOT EXISTS team_keys_idx
ON cycling.cyclist_teams ( KEYS (teams) ) USING 'StorageAttachedIndex';
CREATE CUSTOM INDEX IF NOT EXISTS team_values_idx
ON cycling.cyclist_teams ( VALUES (teams) ) USING 'StorageAttachedIndex';
CREATE CUSTOM INDEX IF NOT EXISTS team_entries_idx
ON cycling.cyclist_teams ( ENTRIES (teams) ) USING 'StorageAttachedIndex';
Indexes created on the map column teams
in the cyclist_career_teams
table target the keys, values, and full entries of the column data.
Creating multiple SAI collection indexes with different map types on the same column requires DSE 6.8.4 or later.
If you’re using DSE 6.8.3 or earlier, submit a DROP INDEX <index-name>;
command before adding the next map type on the same column.
Query with KEYS
from the map teams
column:
SELECT firstname,lastname,teams FROM cycling.cyclist_teams
WHERE teams CONTAINS KEY 2015;
Results
firstname | lastname | teams
-----------+------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Elizabeth | ARMITSTEAD | {2011: 'Team Garmin - Cervelo', 2012: 'AA Drink - Leontien.nl', 2013: 'Boels:Dolmans Cycling Team', 2014: 'Boels:Dolmans Cycling Team', 2015: 'Boels:Dolmans Cycling Team'}
Marianne | VOS | {2015: 'Rabobank-Liv Woman Cycling Team'}
(2 rows)
firstname | lastname | teams
-----------+------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Elizabeth | ARMITSTEAD | {2011: 'Team Garmin - Cervelo', 2012: 'AA Drink - Leontien.nl', 2013: 'Boels:Dolmans Cycling Team', 2014: 'Boels:Dolmans Cycling Team', 2015: 'Boels:Dolmans Cycling Team'}
Marianne | VOS | {2015: 'Rabobank-Liv Woman Cycling Team'}
(2 rows)
Query a value from the map teams
column, noting that only the keyword CONTAINS
is included:
SELECT firstname,lastname,teams FROM cycling.cyclist_teams
WHERE teams CONTAINS 'Team Garmin - Cervelo';
Results
firstname | lastname | teams
-----------+------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Jamie | BENNETT | {2011: 'Team Garmin - Cervelo', 2013: 'Boels:Dolmans Cycling Team', 2014: 'Boels:Dolmans Cycling Team'}
Elizabeth | ARMITSTEAD | {2011: 'Team Garmin - Cervelo', 2012: 'AA Drink - Leontien.nl', 2013: 'Boels:Dolmans Cycling Team', 2014: 'Boels:Dolmans Cycling Team', 2015: 'Boels:Dolmans Cycling Team'}
(2 rows)
firstname | lastname | teams
-----------+------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Jamie | BENNETT | {2011: 'Team Garmin - Cervelo', 2013: 'Boels:Dolmans Cycling Team', 2014: 'Boels:Dolmans Cycling Team'}
Elizabeth | ARMITSTEAD | {2011: 'Team Garmin - Cervelo', 2012: 'AA Drink - Leontien.nl', 2013: 'Boels:Dolmans Cycling Team', 2014: 'Boels:Dolmans Cycling Team', 2015: 'Boels:Dolmans Cycling Team'}
(2 rows)
Query entries from the map teams
column, noting the difference in the WHERE
clause:
SELECT firstname,lastname,teams FROM cycling.cyclist_teams
WHERE teams CONTAINS 'Team Garmin - Cervelo';
Results
firstname | lastname | teams
-----------+------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Jamie | BENNETT | {2011: 'Team Garmin - Cervelo', 2013: 'Boels:Dolmans Cycling Team', 2014: 'Boels:Dolmans Cycling Team'}
Elizabeth | ARMITSTEAD | {2011: 'Team Garmin - Cervelo', 2012: 'AA Drink - Leontien.nl', 2013: 'Boels:Dolmans Cycling Team', 2014: 'Boels:Dolmans Cycling Team', 2015: 'Boels:Dolmans Cycling Team'}
(2 rows)
firstname | lastname | teams
-----------+------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Jamie | BENNETT | {2011: 'Team Garmin - Cervelo', 2013: 'Boels:Dolmans Cycling Team', 2014: 'Boels:Dolmans Cycling Team'}
Elizabeth | ARMITSTEAD | {2011: 'Team Garmin - Cervelo', 2012: 'AA Drink - Leontien.nl', 2013: 'Boels:Dolmans Cycling Team', 2014: 'Boels:Dolmans Cycling Team', 2015: 'Boels:Dolmans Cycling Team'}
(2 rows)
Query for two entries from the map teams
column:
SELECT firstname,lastname,teams FROM cycling.cyclist_teams
WHERE teams[2014] = 'Boels:Dolmans Cycling Team'
AND teams[2015] = 'Boels:Dolmans Cycling Team';
Results
firstname | lastname | teams
-----------+------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Elizabeth | ARMITSTEAD | {2011: 'Team Garmin - Cervelo', 2012: 'AA Drink - Leontien.nl', 2013: 'Boels:Dolmans Cycling Team', 2014: 'Boels:Dolmans Cycling Team', 2015: 'Boels:Dolmans Cycling Team'}
(1 rows)
firstname | lastname | teams
-----------+------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Elizabeth | ARMITSTEAD | {2011: 'Team Garmin - Cervelo', 2012: 'AA Drink - Leontien.nl', 2013: 'Boels:Dolmans Cycling Team', 2014: 'Boels:Dolmans Cycling Team', 2015: 'Boels:Dolmans Cycling Team'}
(1 rows)
For more information, see: