What are tombstones?

A tombstone is a marker in a table row that indicates deleted data, such as a cell, column, row, or entire table. During compaction, DataStax Enterprise (DSE) deletes the marked data.

An excessive number of tombstones in a table can negatively affect application performance and read operations. The presence of many tombstones can indicate issues with the data model or the application’s query patterns.

Tombstone lifecycle

DSE creates a tombstone when data is deleted from a keyspace that has a replication factor (RF) greater than 1. The following list of examples is not exhaustive, but illustrates some operations that generate tombstones:

Using a CQL DELETE statement.
Expiring data with time-to-live (TTL).
Using internal operations, such as Using materialized views.
INSERT or UPDATE operations with a null value.
UPDATE operations on a collection column.

When a tombstone is created, it can be marked on different parts of a partition. Based on the location of the marker, tombstones can be categorized into one of the following groups. Each category typically corresponds to one unique type of data deletion operation.

Partition tombstones
Row tombstones
Range tombstones
ComplexColumn tombstones
Cell tombstones
TTL tombstones

Tombstones go through the write path, and are written to SSTables on one or more nodes. A key differentiator of a tombstone is a built-in expiration known as the grace period, set by gc_grace_seconds. At the end of its expiration period, the tombstone will be deleted as part of the normal compaction process, if it no longer marks any deleted data in another SSTable.

Tombstone examples

In the following examples, the cycling keyspace is used to illustrate different tombstone categories. Two tables are used: rank_by_year_and_cycling_name and cyclist_career_teams.

Because the following examples use both cqlsh and CQL commands, using two different terminals is recommended. Alternatively, use a terminal for cqlsh, and then issue CQL commands using DataStax Studio.

If you want to follow along with the examples, you can create the sample tables in a test database or keyspace.

Create the cycling keyspace and both example tables:

CREATE KEYSPACE cycling WITH replication =
{'class': 'SimpleStrategy', 'replication_factor': '1'} AND durable_writes = true;

CREATE TABLE cycling.rank_by_year_and_name (
    race_year int,
    race_name text,
    rank int,
    cyclist_name text,
    PRIMARY KEY ((race_year, race_name), rank)
) WITH CLUSTERING ORDER BY (rank ASC);

Insert data into the rank_by_year_and_cycling_name table:

INSERT INTO cycling.rank_by_year_and_name (race_year, race_name, cyclist_name, rank) VALUES (2015, 'Tour of Japan - Stage 4 - Minami > Shinshu', 'Benjamin PRADES', 1);
INSERT INTO cycling.rank_by_year_and_name (race_year, race_name, cyclist_name, rank) VALUES (2015, 'Tour of Japan - Stage 4 - Minami > Shinshu', 'Adam PHELAN', 2);
INSERT INTO cycling.rank_by_year_and_name (race_year, race_name, cyclist_name, rank) VALUES (2015, 'Tour of Japan - Stage 4 - Minami > Shinshu', 'Thomas LEBAS', 3);
INSERT INTO cycling.rank_by_year_and_name (race_year, race_name, cyclist_name, rank) VALUES (2015, 'Giro d''Italia - Stage 11 - Forli > Imola', 'Ilnur ZAKARIN', 1);
INSERT INTO cycling.rank_by_year_and_name (race_year, race_name, cyclist_name, rank) VALUES (2015, 'Giro d''Italia - Stage 11 - Forli > Imola', 'Carlos BETANCUR', 2);
INSERT INTO cycling.rank_by_year_and_name (race_year, race_name, cyclist_name, rank) VALUES (2014, '4th Tour of Beijing', 'Phillippe GILBERT', 1);
INSERT INTO cycling.rank_by_year_and_name (race_year, race_name, cyclist_name, rank) VALUES (2014, '4th Tour of Beijing', 'Daniel MARTIN', 2);
INSERT INTO cycling.rank_by_year_and_name (race_year, race_name, cyclist_name, rank) VALUES (2014, '4th Tour of Beijing', 'Johan Esteban CHAVES', 3);

CREATE TABLE cycling.cyclist_career_teams (
    id UUID PRIMARY KEY,
    lastname text,
    teams set<text>
);

Later in the examples, you will insert data into the cyclist_career_teams table to test Cell tombstones and TTL tombstones.

While you work through the examples, you must flush data from the memtable to SSTables after each table modification. This is required for the sstabledump command to show the tombstone markers in the output.
1. Run the nodetool flush command on the cycling keyspace to flush data from the memtable to the SSTables on disk:
  nodetool flush cycling;
2. After flushing the cycling keyspace, run the sstabledump command on a specific SSTable. For example:
  cd /var/lib/cassandra/data/cycling/rank_by_year_and_name-bc05fba12baf11e8b4a8ad2b042f3e18 sstabledump mc-2-big-Data.db

Partition tombstones

Partition tombstones are generated when an entire partition is deleted explicitly. In the CQL DELETE statement, the WHERE clause is an equality condition against the partition key.

DELETE from cycling.rank_by_year_and_name WHERE
 race_year = 2014 AND race_name = '4th Tour of Beijing';

Looking at the sstabledump output for this partition, the deletion_info tombstone marker is at the partition level, and is not associated with any rows or cells within the partition.

{
    "partition" : {
      "key" : [ "2014", "4th Tour of Beijing" ],
      "position" : 0,
      "deletion_info" : { "marked_deleted" : "2018-05-16T19:40:06.454282Z", "local_delete_time" : "2018-05-16T19:40:06Z" }
    },
    "rows" : [ ]
  }

Row tombstones

Row tombstones are generated when a particular row within a partition is deleted explicitly. The schema has a composite primary key that includes both the partition key and the clustering key. In the CQL DELETE statement, the WHERE clause is an equality condition against both the partition key and the clustering key columns.

DELETE from cycling.rank_by_year_and_name WHERE
 race_year = 2015 AND race_name = 'Giro d''Italia - Stage 11 - Forli > Imola' AND rank = 2;

Looking at the sstabledump output for this partition, the deletion_info tombstone marker is at the row level, and is identified by a clustering key under the partition. Neither the partition nor the row cells contain the tombstone marker.

{
    "partition" : {
      "key" : [ "2015", "Giro d'Italia - Stage 11 - Forli > Imola" ],
      "position" : 0
    },
    "rows" : [
      {
        "type" : "row",
        "position" : 74,
        "clustering" : [ 2 ],
        "deletion_info" : { "marked_deleted" : "2018-05-18T15:29:06.227148Z", "local_delete_time" : "2018-05-18T15:29:06Z" },
        "cells" : [ ]
      }
    ]
  }

Range tombstones

Range tombstones occur when several rows within a partition that can be expressed through a range search are deleted explicitly. The schema has a composite primary key that includes both a partition key and a clustering key. In the CQL DELETE statement, the WHERE clause is an equality condition against the partition key, plus an inequality condition against the clustering key.

If following this tutorial from the beginning, drop the rank_by_year_and_name table and then re-create it to populate the table with the necessary data.

DELETE from cycling.rank_by_year_and_name WHERE
 race_year = 2015 AND race_name = 'Tour of Japan - Stage 4 - Minami > Shinshu' AND rank > 1;

Looking at the sstabledump output for this partition, the deletion_info tombstone marker is at the row level. A special boundary marker, range_tombstone_bound, marks the range scope (identified by the clustering key values) of the deleted rows.

{
    "partition" : {
      "key" : [ "2015", "Tour of Japan - Stage 4 - Minami > Shinshu" ],
      "position" : 252
    },
    "rows" : [
      {
        "type" : "range_tombstone_bound",
        "start" : {
          "type" : "inclusive",
          "deletion_info" : { "marked_deleted" : "2018-05-18T16:09:21.474713Z", "local_delete_time" : "2018-05-18T16:09:21Z" }
        }
      },
      {
        "type" : "range_tombstone_bound",
        "end" : {
          "type" : "exclusive",
          "clustering" : [ 1 ],
          "deletion_info" : { "marked_deleted" : "2018-05-18T16:09:21.474713Z", "local_delete_time" : "2018-05-18T16:09:21Z" }
        }
      }
    ]
  }

ComplexColumn tombstones

ComplexColumn tombstones are generated when inserting or updating a collection type column, such as set, list, and map.

Previously we created the cyclist_career_teams table. Run the following cqlsh command to insert data into that table.

INSERT INTO cycling.cyclist_career_teams (
     id,
     lastname,
     teams)
     VALUES (cb07baad-eac8-4f65-b28a-bddc06a0de23, 'ARMITSTEAD', {
     'Boels-Dolmans Cycling Team','AA Drink - Leontien.nl','Team Garmin - Cervelo' } );

Looking at the sstabledump output for this partition, no explicit manual deletion occurs on the partition, but a deletion_info marker is listed at the cell level for the collection type column teams.

{
    "partition" : {
      "key" : [ "cb07baad-eac8-4f65-b28a-bddc06a0de23" ],
      "position" : 0
    },
    "rows" : [
      {
        "type" : "row",
        "position" : 130,
        "liveness_info" : { "tstamp" : "2018-05-18T16:26:23.779724Z" },
        "cells" : [
          { "name" : "lastname", "value" : "ARMITSTEAD" },
          { "name" : "teams", "deletion_info" : { "marked_deleted" : "2018-05-18T16:26:23.779723Z", "local_delete_time" : "2018-05-18T16:26:23Z" } },
          { "name" : "teams", "path" : [ "AA Drink - Leontien.nl" ], "value" : "" },
          { "name" : "teams", "path" : [ "Boels-Dolmans Cycling Team" ], "value" : "" },
          { "name" : "teams", "path" : [ "Team Garmin - Cervelo" ], "value" : "" }
        ]
      }
    ]
  }

Cell tombstones

Cell tombstones are generated when explicitly deleting a value from a cell, such as a column for a specific row of a partition, or when inserting or updating a cell with null values, as shown in the following example.

INSERT INTO cycling.rank_by_year_and_name (
     race_year,
     race_name,
     cyclist_name,
     rank)
     VALUES (2018, 'Giro d''Italia - Stage 11 - Osimo > Imola', null, 1);

Looking at the sstabledump output for this partition, deletion_info tombstone marker is associated with a particular cell.

{
    "partition" : {
      "key" : [ "2018", "Giro d'Italia - Stage 11 - Osimo > Imola" ],
      "position" : 0
    },
    "rows" : [
      {
        "type" : "row",
        "position" : 80,
        "clustering" : [ 1 ],
        "liveness_info" : { "tstamp" : "2018-05-18T17:13:42.602827Z" },
        "cells" : [
          { "name" : "cyclist_name", "deletion_info" : { "local_delete_time" : "2018-05-18T17:13:42Z" }
          }
        ]
      }
    ]
  }

TTL tombstones

TTL tombstones are generated when the TTL (time-to-live) period expires. The TTL expiration marker can occur at either the row or cell level. However, DSE marks TTL data differently from tombstone data that was explicitly deleted. Even if a partition has only a single row (with no clustering key), the TTL mark is still made at the row level.

The following statement sets TTL for an entire row.

INSERT INTO cycling.cyclist_career_teams (
     id,
     lastname,
     teams)
     VALUES (e7cd5752-bc0d-4157-a80f-7523add8dbcd, 'VAN DER BREGGEN', {
     'Rabobank-Liv Woman Cycling Team','Sengers Ladies Cycling Team','Team Flexpoint' }) USING TTL 1;

The following statement sets TTL for a single cell.

UPDATE cycling.rank_by_year_and_name USING TTL 1
  SET cyclist_name = 'Cloudy Archipelago' WHERE race_year = 2018 AND
  race_name = 'Giro d''Italia - Stage 11 - Osimo > Imola' AND rank = 1;

Looking at the sstabledump output for these partitions, the first CQL statement marks the row (partition key: e7cd5752-bc0d-4157-a80f-7523add8dbcd) with an "expired" : true TTL expiration marker in the liveness_info section.

{
    "partition" : {
      "key" : [ "e7cd5752-bc0d-4157-a80f-7523add8dbcd" ],
      "position" : 0
    },
    "rows" : [
      {
        "type" : "row",
        "position" : 134,
        "liveness_info" : { "tstamp" : "2018-05-18T17:38:13.135226Z", "ttl" : 1, "expires_at" : "2018-05-18T17:38:14Z", "expired" : true },
        "cells" : [
          { "name" : "lastname", "value" : "VAN DER BREGGEN" },
          { "name" : "teams", "deletion_info" : { "marked_deleted" : "2018-05-18T17:38:13.135225Z", "local_delete_time" : "2018-05-18T17:38:13Z" } },
          { "name" : "teams", "path" : [ "Rabobank-Liv Woman Cycling Team" ], "value" : "" },
          { "name" : "teams", "path" : [ "Sengers Ladies Cycling Team" ], "value" : "" },
          { "name" : "teams", "path" : [ "Team Flexpoint" ], "value" : "" }
        ]
      }
    ]
  }

The second CQL statement marks the cell (partition key: 2018, clustering key: 1, column name: cyclist_name) with an "expired" : true TTL expiration marker for the specific cell.

{
    "partition" : {
      "key" : [ "2018", "Giro d'Italia - Stage 11 - Osimo > Imola" ],
      "position" : 0
    },
    "rows" : [
      {
        "type" : "row",
        "position" : 95,
        "clustering" : [ 1 ],
        "cells" : [
          { "name" : "cyclist_name", "value" : "Cloudy Archipelago", "tstamp" : "2018-05-18T18:22:52.532855Z", "ttl" : 1, "expires_at" : "2018-05-18T18:22:53Z", "expired" : true }
        ]
      }
    ]
  }

What are tombstones?

Tombstone lifecycle

Tombstone examples

Partition tombstones

Row tombstones

Range tombstones

ComplexColumn tombstones

Cell tombstones

TTL tombstones

See also

Was this helpful?

Give Feedback