Using a compound primary key
Use a compound primary key to create columns that you can query to return sorted results.
Use a compound primary key to create multiple columns that you can use to query and
return sorted results. If our pro cycling example was designed in a relational
database, you would create a cyclists table with a foreign key to the races. In DataStax
Distribution of Apache Cassandra™, you denormalize the data
because joins are not performant in a distributed system. Later, other schema are
shown that improve database performance.
Important: A
NULL
value cannot be
inserted into a PRIMARY KEY
column. This restriction applies to
both partition keys and clustering columns.This example creates a cyclist_category table storing a cyclist's last name, ID, and points for each type of race category. The table uses category for the partition key and points for a single clustering column. This table can be queried to retrieve a list of cyclists and their points in a category, sorted by points.
A compound primary key table can be created in two different ways, as shown.
Procedure
-
To create a table having a compound primary key, use two or more columns as the
primary key. This example uses an additional clause
WITH CLUSTERING ORDER BY
to order the points in descending order. Ascending order is more efficient to store, but descending queries are faster due to the nature of the storage engine.cqlsh> USE cycling; CREATE TABLE cyclist_category ( category text, points int, id UUID, lastname text, PRIMARY KEY (category, points) ) WITH CLUSTERING ORDER BY (points DESC);
Note: The combination of the category and points uniquely identifies a row in the cyclist_category table. More than one row with the same category can exist as long as the rows contain different pointsvalues. -
The keyspace name can be used to identify the keyspace in the
CREATE TABLE
statement instead of theUSE
statement.CREATE TABLE cycling.cyclist_category ( category text, points int, id UUID, lastname text, PRIMARY KEY (category, points) ) WITH CLUSTERING ORDER BY (points DESC);
Note: In both of these examples, points is defined as a clustering column.