Defining a partition key with clustering columns
A compound primary key consists of a partition key, which determines which node stores the data, and one or more clustering columns, which determine the order of the data within the partition.
A NULL value cannot be inserted into a PRIMARY KEY column. This restriction applies to both partition keys and clustering columns.
Remember that data is distributed throughout a cluster. An application can experience high
latency while retrieving data from a large partition if the entire partition must be read to
gather a small amount of data. On a physical node, when rows for a partition key are stored in
order based on the clustering columns, retrieval of rows is very efficient. Grouping data in
tables using a clustering column or columns is analogous to JOINs in a relational database, but
clustering columns generally perform better because only one table is accessed. The example table
uses category as the partition key and points as the clustering column. Notice that for each
category, the points are ordered in descending order.
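
A table with this layout could be defined roughly as follows. This is a minimal sketch, not the definition behind the example: only the category and points columns and the descending order come from the text above, while the table name cyclist_category and the id and lastname columns are assumptions added for illustration. It also assumes a keyspace has already been selected with USE.

```cql
-- Hypothetical table name and non-key columns; only category and points
-- are taken from the description above.
CREATE TABLE IF NOT EXISTS cyclist_category (
  category text,      -- partition key: decides which node stores the rows
  points int,         -- clustering column: orders rows inside the partition
  id uuid,            -- assumed extra column
  lastname text,      -- assumed extra column
  PRIMARY KEY (category, points)
) WITH CLUSTERING ORDER BY (points DESC);
```

The first column named in PRIMARY KEY is the partition key, and the columns after it are clustering columns. The CLUSTERING ORDER BY clause is what stores the points in descending order within each category.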
The database stores an entire row of data on a node by partition key and can order the data for retrieval with clustering columns. Clustering columns make retrieving data from a partition more versatile, because a query can select a range of rows within the partition. For the example shown, a query could retrieve all point values greater than 200 for the One-day-races category. If your application has more complex query requirements, use a compound primary key.
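
The range query described above might look like the following sketch. It assumes a hypothetical table named cyclist_category whose primary key is (category, points), with category as the partition key and points as the clustering column; the table name and the lastname column are illustrative and do not come from the text.

```cql
-- Restrict the partition key with equality, then apply a range
-- on the clustering column. Table and lastname column are assumed.
SELECT category, points, lastname
FROM cyclist_category
WHERE category = 'One-day-races'
  AND points > 200;
```

Because the partition key is fixed by an equality condition, the query reads a single partition, and the range on points selects a contiguous slice of rows in their stored clustering order.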