About reads
How Cassandra combines results from the active memtable and potentially multiple SSTables to satisfy a read.
To satisfy a read, Cassandra must combine results from the active memtable and potentially multiple SSTables.
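First, Cassandra checks the Bloom filter. Each SSTable has an associated Bloom filter that reports whether the SSTable might contain data for the requested partition. A Bloom filter can return false positives but never false negatives, so Cassandra can skip any SSTable the filter rules out before doing any disk I/O.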
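The behavior of such a filter can be illustrated with a minimal sketch. This is a toy Bloom filter, not Cassandra's implementation: the class `BloomFilter`, the helper `sstables_to_read`, the bit-array size, and the SHA-256 based hashing are all assumptions chosen for readability.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, key):
        # Derive num_hashes bit positions by salting a SHA-256 digest.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(key))


def sstables_to_read(partition_key, filters):
    """Return the names of SSTables whose filter does not rule out the key.

    `filters` is a hypothetical mapping of SSTable name -> BloomFilter.
    """
    return [name for name, bf in filters.items() if bf.might_contain(partition_key)]


if __name__ == "__main__":
    filters = {"sstable-1": BloomFilter(), "sstable-2": BloomFilter()}
    filters["sstable-1"].add("user:42")
    # sstable-2 is empty, so its filter rules the key out without any disk I/O.
    print(sstables_to_read("user:42", filters))
```

Only the SSTables returned by `sstables_to_read` would need further lookups. A filter that answers "possibly present" can still be wrong, which is why the later steps are required.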
If the Bloom filter does not rule out the SSTable, Cassandra checks the partition key cache and takes one of these courses of action:
- If an index entry is found in the cache:
  - Cassandra goes to the compression offset map to find the compressed block that holds the data.
  - Cassandra fetches the compressed data from disk and returns the result set.
- If an index entry is not found in the cache:
  - Cassandra searches the partition summary to determine the approximate location on disk of the index entry.
  - Next, to fetch the index entry, Cassandra hits the disk for the first time, performing a single seek followed by a sequential read of columns (a range read) in the SSTable if the columns are contiguous.
  - Cassandra goes to the compression offset map to find the compressed block that holds the data.
  - Cassandra fetches the compressed data from disk and returns the result set.
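The two branches above can be sketched for a single SSTable as follows. This is a simplified model under stated assumptions, not Cassandra's code: `ToySSTable`, its attributes, the 16-byte chunk size, and the use of a Python `set` in place of a real Bloom filter are all hypothetical, and `index_reads` merely counts simulated disk reads of the partition index.

```python
import bisect
import zlib

CHUNK = 16  # toy uncompressed chunk size in bytes (assumption for the sketch)


class ToySSTable:
    def __init__(self, rows, sample_every=2):
        keys = sorted(rows)
        self.bloom = set(keys)  # stand-in for a Bloom filter (no false positives)
        # Partition index: sorted (key, offset, length) into the uncompressed data.
        self.index, data = [], b""
        for k in keys:
            self.index.append((k, len(data), len(rows[k])))
            data += rows[k]
        # Partition summary: every Nth key with its position in the index.
        self.summary = [(self.index[i][0], i) for i in range(0, len(keys), sample_every)]
        # Compressed data file and compression offset map: (start, size) per chunk.
        self.compressed, self.offset_map = b"", []
        for i in range(0, len(data), CHUNK):
            block = zlib.compress(data[i:i + CHUNK])
            self.offset_map.append((len(self.compressed), len(block)))
            self.compressed += block
        self.key_cache = {}   # partition key cache: key -> (offset, length)
        self.index_reads = 0  # simulated disk reads of the partition index

    def _index_entry(self, key):
        if key in self.key_cache:  # cache hit: no index read needed
            return self.key_cache[key]
        # Cache miss: the summary gives the nearest sampled key at or before `key`.
        sampled = [k for k, _ in self.summary]
        slot = max(bisect.bisect_right(sampled, key) - 1, 0)
        start = self.summary[slot][1]
        self.index_reads += 1  # one seek, then a sequential scan of the index
        for k, offset, length in self.index[start:]:
            if k == key:
                self.key_cache[key] = (offset, length)
                return offset, length
            if k > key:
                break
        return None

    def read(self, key):
        if key not in self.bloom:
            return None  # ruled out before any simulated disk access
        entry = self._index_entry(key)
        if entry is None:
            return None
        offset, length = entry
        # Use the compression offset map to find the chunks covering the data.
        first, last = offset // CHUNK, (offset + length - 1) // CHUNK
        raw = b""
        for start, size in self.offset_map[first:last + 1]:
            raw += zlib.decompress(self.compressed[start:start + size])
        begin = offset - first * CHUNK
        return raw[begin:begin + length]
```

In this model, the first `read` of a key costs one index read and populates the key cache; a repeated `read` of the same key skips the partition summary and index entirely and goes straight to the compression offset map. Merging results across the memtable and several SSTables is left out of the sketch.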