Partition keys
Identify partition keys to limit the query and retrieve data from specific nodes.
Identifying the partition in a WHERE clause, allows the database to quickly retrieve the data from the appropriate node. Avoid running queries across multiple partitions whenever possible. The database requires that all partition keys are restricted or none. All the partition key columns are required to compute the hash and locate the node containing the partition.
If no restrictions are specified on the partition keys but some are specified on the clustering keys ALLOW FILTERING is required to execute the query.
- Equals (=)
- IN
- Ranges (>, >=, <, <=) on tokenized partition keys
Prerequisites
This section uses the rank_by_year_and_name example.
Procedure
-
Exact values using equals (=) operator
To filter on a regular or clustering columns, restrict all the partition key columns.
SELECT rank, cyclist_name as name FROM cycling.rank_by_year_and_name WHERE race_name = 'Tour of Japan - Stage 4 - Minami > Shinshu' AND race_year = 2015 AND rank <= 2;
-
Values in a list for the last partition key column using the IN
operator
Use the IN operator on the last partition key column. For example, to return multiple years of the same race and stage:
SELECT rank, cyclist_name as name FROM cycling.rank_by_year_and_name WHERE race_name = 'Tour of Japan - Stage 4 - Minami > Shinshu' AND race_year IN (2014, 2015);
Warning: Using IN query across partitions may cause performance issues, see Not using the “in” query for multiple partitions for a detailed explanation.Results are returned in the natural order of the column type.
-
Partition range using the TOKEN function with >, >=, <=, or <
operators
The database distributes the partition across the nodes using the selected partitioner. Only the ByteOrderedPartitioner keeps an ordered distribution of data. Select partitions in a range by tokenizing the partition keys and using greater than and less than operators.
To filter the results using a token range:SELECT TOKEN(race_year, race_name), race_name AS name, race_year AS year FROM cycling.rank_by_year_and_name WHERE TOKEN(race_year, race_name) >= -3074457345618258603 AND TOKEN(race_year, race_name) <= 3074457345618258602;