Partition keys

Identifying the partition in a WHERE clause allows the database to quickly retrieve the data from the appropriate node. Avoid running queries across multiple partitions whenever possible. The database requires that all partition keys are restricted or none. All the partition key columns are required to compute the hash and locate the node containing the partition.

If no restrictions are specified on the partition keys but some are specified on the clustering keys, ALLOW FILTERING is required to execute the query.

DataStax recommends limiting queries to a single partition to avoid performance issues that occur with multi-partition operations. Performance issues can occur when using the IN operator, omitting a WHERE clause with logical statements that identifies the partition, or other operators that require the ALLOW FILTERING option. For more details, see ALLOW FILTERING explained.

Use the following operators for partition key logical statements:

  • Equals (=)

  • IN

  • Ranges (>, >=, <, <=) on tokenized partition keys

Prerequisites

These examples uses the rank_by_year_and_name example.

Exact values using equals (=) operator

To filter on a regular column or clustering columns, restrict all the partition key columns.

SELECT rank, cyclist_name AS name
FROM cycling.rank_by_year_and_name
WHERE race_name = 'Tour of Japan - Stage 4 - Minami > Shinshu'
  AND race_year = 2015
  AND rank <= 2;

Values in a list for the last partition key column using the IN operator

Use the IN operator on the last partition key column. For example, to return multiple years of the same race and stage:

SELECT rank, cyclist_name as name
FROM cycling.rank_by_year_and_name
WHERE race_name = 'Tour of Japan - Stage 4 - Minami > Shinshu'
  AND race_year IN (2014, 2015);

Using IN query across partitions might cause performance issues; see Not using the “in” query for multiple partitions for a detailed explanation.

Results are returned in the natural order of the column type.

Partition range using the TOKEN function with >, >=, <=, or < operators

The database distributes the partition across the nodes using the selected partitioner. Only the ByteOrderedPartitioner keeps an ordered distribution of data. Select partitions in a range by tokenizing the partition keys and using greater than and less than operators.

To filter the results using a token range:

SELECT
  TOKEN(race_year, race_name),
  race_name AS name,
  race_year AS year
FROM cycling.rank_by_year_and_name
WHERE TOKEN(race_year, race_name) >= -3074457345618258603
AND TOKEN(race_year, race_name) <= 3074457345618258602;

Was this helpful?

Give Feedback

How can we improve the documentation?

© 2024 DataStax | Privacy policy | Terms of use

Apache, Apache Cassandra, Cassandra, Apache Tomcat, Tomcat, Apache Lucene, Apache Solr, Apache Hadoop, Hadoop, Apache Pulsar, Pulsar, Apache Spark, Spark, Apache TinkerPop, TinkerPop, Apache Kafka and Kafka are either registered trademarks or trademarks of the Apache Software Foundation or its subsidiaries in Canada, the United States and/or other countries. Kubernetes is the registered trademark of the Linux Foundation.

General Inquiries: +1 (650) 389-6000, info@datastax.com