How is data deleted?

How Cassandra deletes data and why deleted data can reappear.

Cassandra deletes data differently from a relational database. A relational database might spend time scanning through data looking for expired data and throwing it away, or an administrator might have to partition expired data by month. Data in a Cassandra column can have an optional expiration period called TTL (time to live). Use CQL to set the TTL in seconds for data. After the requested amount of time has elapsed, Cassandra marks the TTL data with a tombstone. A tombstone exists for the period set by the table property gc_grace_seconds (864000 seconds, or 10 days, by default). Once that grace period has passed, the tombstoned data is automatically removed during normal compaction.
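The lifecycle described above can be sketched as a small model. This is an illustration only, not Cassandra's implementation: the Cell class and the state function are hypothetical names, and the model assumes that expired data stays a tombstone for exactly gc_grace_seconds before compaction is allowed to remove it.

```python
from dataclasses import dataclass

# Default value of the gc_grace_seconds table property (10 days).
GC_GRACE_SECONDS = 864_000


@dataclass
class Cell:
    """Hypothetical stand-in for a column value written with a TTL."""
    value: str
    written_at: int  # write time, in seconds
    ttl: int         # time to live, in seconds

    def expires_at(self) -> int:
        return self.written_at + self.ttl


def state(cell: Cell, now: int, gc_grace: int = GC_GRACE_SECONDS) -> str:
    """Return the simplified lifecycle stage of a cell at time `now`."""
    if now < cell.expires_at():
        return "live"        # TTL has not elapsed yet
    if now < cell.expires_at() + gc_grace:
        return "tombstone"   # expired, but kept for the grace period
    return "purgeable"       # a later compaction may drop it from disk


cell = Cell(value="session-token", written_at=0, ttl=3600)
for t in (0, 3600, 3600 + GC_GRACE_SECONDS):
    print(t, state(cell, t))
```

Note that "purgeable" only means the data is eligible for removal. In this model, as in the text above, nothing leaves disk until a compaction actually runs.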

Consider the following facts about deleted data:
  • Cassandra does not immediately remove data marked for deletion from disk. The deletion occurs during compaction.
  • If you use the SizeTieredCompactionStrategy or DateTieredCompactionStrategy, you can drop data immediately by manually starting the compaction process. Before doing so, understand the disadvantages of the process. A forced compaction creates one potentially very large SSTable from all the data. Because of its size, that SSTable is unlikely to be compacted again for a long time, and the data in it can grow very stale during this long period of non-compaction.
  • Deleted data can reappear if you do not run repair routinely.

    Marking data with a tombstone signals Cassandra to retry sending a delete request to a replica that was down at the time of the delete. If the replica comes back up within the grace period, it eventually receives the delete request. However, if a node is down longer than the grace period, the node can miss the delete because the tombstone disappears after gc_grace_seconds. The node then still holds the old data, no other replica holds a tombstone that marks it as deleted, and the data can be propagated back to the rest of the cluster. Cassandra attempts to replay missed updates when the node comes back up again. After a failure, it is a best practice to run node repair to repair inconsistencies across all of the replicas when bringing a node back into the cluster. If the node doesn't come back within gc_grace_seconds, remove the node, delete the node's data, and bootstrap it again.
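The following sketch illustrates why a replica that stays down longer than the grace period can bring deleted data back. It is a toy model under stated assumptions, not Cassandra's code: replicas are plain dictionaries, repair is reduced to "the newest timestamp wins", and the helper names (write, compact, repair, scenario) are hypothetical.

```python
GC_GRACE_SECONDS = 864_000  # default gc_grace_seconds (10 days)
TOMBSTONE = None            # marker value standing in for a delete


def write(replicas, key, value, ts):
    """Apply a write (or a delete, when value is TOMBSTONE) to the given replicas."""
    for replica in replicas:
        replica[key] = (ts, value)


def compact(replica, now, gc_grace):
    """Drop tombstones that are older than the grace period."""
    for key, (ts, value) in list(replica.items()):
        if value is TOMBSTONE and now - ts >= gc_grace:
            del replica[key]


def repair(replicas):
    """Simplified repair: every replica ends up with the newest version of each key."""
    for key in set().union(*replicas):
        newest = max((r[key] for r in replicas if key in r), key=lambda c: c[0])
        for replica in replicas:
            replica[key] = newest


def scenario(downtime):
    """Replica c misses a delete, then rejoins after `downtime` seconds."""
    a, b, c = {}, {}, {}
    write([a, b, c], "k", "v", ts=0)
    write([a, b], "k", TOMBSTONE, ts=100)  # c is down and misses the delete
    now = 100 + downtime
    for replica in (a, b):
        compact(replica, now, GC_GRACE_SECONDS)
    repair([a, b, c])                      # c is back in the cluster
    return a["k"]


print(scenario(3600))                  # tombstone survives: (100, None)
print(scenario(GC_GRACE_SECONDS + 1))  # deleted data reappears: (0, 'v')
```

In the second call, the tombstones on a and b have already been purged when c rejoins, so the only surviving version of the key is the old value on c. This is the case where the guidance above applies: remove the node, delete its data, and bootstrap it again rather than repairing it back into the cluster.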