Identifying the partition in a WHERE clause allows the database to quickly retrieve the data from the appropriate node. Avoid running queries across multiple partitions whenever possible. The database requires that all partition keys are restricted or none. All the partition key columns are required to compute the hash and locate the node containing the partition.
If no restrictions are specified on the partition keys but some are specified on the clustering keys, ALLOW FILTERING is required to execute the query.
|DataStax recommends limiting queries to a single partition to avoid performance issues that occur with multi-partition operations. Performance issues can occur when using the IN operator, omitting a WHERE clause with logical statements that identifies the partition, or other operators that require the ALLOW FILTERING option. For more details, see ALLOW FILTERING explained.|
Use the following operators for partition key logical statements:
Ranges (>, >=, <, <=) on tokenized partition keys
These examples uses the rank_by_year_and_name example.
To filter on a regular column or clustering columns, restrict all the partition key columns.
SELECT rank, cyclist_name AS name FROM cycling.rank_by_year_and_name WHERE race_name = 'Tour of Japan - Stage 4 - Minami > Shinshu' AND race_year = 2015 AND rank <= 2;
Use the IN operator on the last partition key column. For example, to return multiple years of the same race and stage:
SELECT rank, cyclist_name as name FROM cycling.rank_by_year_and_name WHERE race_name = 'Tour of Japan - Stage 4 - Minami > Shinshu' AND race_year IN (2014, 2015);
+ WARNING: Using IN query across partitions might cause performance issues; see Not using the “in” query for multiple partitions for a detailed explanation.
+ Results are returned in the natural order of the column type.
The database distributes the partition across the nodes using the selected partitioner. Only the ByteOrderedPartitioner keeps an ordered distribution of data. Select partitions in a range by tokenizing the partition keys and using greater than and less than operators.
+ To filter the results using a token range:
SELECT TOKEN(race_year, race_name), race_name AS name, race_year AS year FROM cycling.rank_by_year_and_name WHERE TOKEN(race_year, race_name) >= -3074457345618258603 AND TOKEN(race_year, race_name) <= 3074457345618258602;