Compound keys and clustering
A compound primary key includes the partition key, which determines which node stores the data, and one or more additional columns that can be used to sort data within the partition.
A compound primary key consists of the partition key and one or more additional columns that determine clustering. The partition key determines which node stores the data, so it is responsible for distributing data across the nodes. The additional columns determine clustering within each partition. Clustering is a storage engine process that sorts data within the partition.
The data for each partition is clustered by the remaining column or columns of the primary key definition. On a physical node, when rows for a partition key are stored in order based on the clustering columns, retrieval of rows is very efficient. For example, because the id in the playlists table is the partition key, all the songs for a playlist are clustered in the order of the remaining song_order column. Cassandra displays the other columns in alphabetical order.
PRIMARY KEY (id, song_order));
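The line above is only the closing clause of the table definition. As a rough sketch, a complete definition might look like the following. The id and song_order columns and their roles come from the text. The uuid type is inferred from the query in this section, the int type is an assumption, and the title, artist, and album columns are hypothetical, added only to make the example complete.

```cql
-- Sketch only: column types and the non-key columns are assumptions.
CREATE TABLE playlists (
  id uuid,          -- partition key: decides which node stores the playlist
  song_order int,   -- clustering column: sorts songs within the playlist
  title text,       -- hypothetical
  artist text,      -- hypothetical
  album text,       -- hypothetical
  PRIMARY KEY (id, song_order));
```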
Insertion, update, and deletion operations on rows sharing the same partition key for a table are performed atomically and in isolation.
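As an illustration, one way to group several writes to the same partition is a batch. The sketch below assumes the playlists table has a text column named title, which is hypothetical, and the values are invented for the example. Both statements use the same id, so they target a single partition.

```cql
-- Both statements share the same partition key (id), so they
-- target one partition. The title column and its values are hypothetical.
BEGIN BATCH
  INSERT INTO playlists (id, song_order, title)
    VALUES (62c36092-82a1-3a00-93d1-46196ee77204, 1, 'First song');
  UPDATE playlists SET title = 'Second song'
    WHERE id = 62c36092-82a1-3a00-93d1-46196ee77204 AND song_order = 2;
APPLY BATCH;
```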
You can query a single sequential set of data on disk to get the songs for a playlist.
SELECT * FROM playlists WHERE id = 62c36092-82a1-3a00-93d1-46196ee77204 ORDER BY song_order DESC LIMIT 50;
The output lists up to 50 songs from the playlist, sorted by song_order in descending order.
Cassandra stores an entire row of data on a node by partition key. If you have too much data in a partition and want to spread the data over multiple nodes, use a composite partition key.
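A composite partition key is declared by wrapping the partition key columns in an extra set of parentheses inside the PRIMARY KEY clause. The sketch below is hypothetical: the table name playlists_by_bucket, the bucket column, and the title column are assumptions, used only to show how the rows of one large playlist could be split across several partitions.

```cql
-- Hypothetical table: (id, bucket) together form the partition key,
-- so rows for one playlist id can be stored in different partitions.
CREATE TABLE playlists_by_bucket (
  id uuid,
  bucket int,       -- hypothetical column used to split a large playlist
  song_order int,   -- clustering column: sorts songs within each partition
  title text,       -- hypothetical
  PRIMARY KEY ((id, bucket), song_order));
```

With a layout like this, a query typically supplies both id and bucket in the WHERE clause to address one partition.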