Tracking data expiration

The output of the sstable2json command reveals the life cycle of Cassandra data.

In this procedure, you use sstable2json to view a row that is not scheduled to expire, the same row after a time-to-live (TTL) marks its data for expiration, and the row after the TTL elapses and its data is removed.

Procedure

  1. Create the playlists table in the music keyspace as shown in Data modeling.
  2. Insert the row of data about ZZ Top in playlists:
    INSERT INTO music.playlists (id, song_order, song_id, title, artist, album)
      VALUES (62c36092-82a1-3a00-93d1-46196ee77204,
      1,
      a3e64f8f-bd44-4f28-b8d9-6938726e34d4,
      'La Grange',
      'ZZ Top',
      'Tres Hombres');
  3. Flush the data to disk.
    sudo ./nodetool flush music playlists
    You need to have access permission to the data directories to flush data to disk.
  4. Look at the JSON representation of the SSTable data, for example:
    sudo ./sstable2json \
      /var/lib/cassandra/data/music/playlists/music-playlists-ib-1-Data.db

    Output is:

    [
    {"key": "62c3609282a13a0093d146196ee77204","columns": [[
    "1:","",1370179611971000], [
    "1:album","Tres Hombres",1370179611971000], [
    "1:artist","ZZ Top",1370179611971000], [
    "1:song_id","a3e64f8f-bd44-4f28-b8d9-6938726e34d4",1370179611971000], [
    "1:title","La Grange",1370179611971000]]}
    ]
  5. Specify the time-to-live (TTL) for the ZZ Top row, for example 300 seconds.
    INSERT INTO music.playlists
      (id, song_order, song_id, title, artist, album)
      VALUES (62c36092-82a1-3a00-93d1-46196ee77204,
      1,
      a3e64f8f-bd44-4f28-b8d9-6938726e34d4,
      'La Grange',
      'ZZ Top',
      'Tres Hombres')
      USING TTL 300;
    Setting a TTL on the row marks its data for expiration; once the TTL elapses, Cassandra replaces the data with tombstones. List all columns when you re-insert the data if you want Cassandra to remove the entire row.
  6. Flush the data to disk again.
    Do this while the data is marked for expiration, before the 300-second TTL elapses and the data is removed.
  7. Run the sstable2json command again.
    sudo ./sstable2json \
      /var/lib/cassandra/data/music/playlists/music-playlists-ib-2-Data.db

    The expiration markers ("e", followed by the TTL value, 300, and the expiration time in seconds since the epoch) are visible in the JSON representation of the data.

    [
    {"key": "62c3609282a13a0093d146196ee77204","columns": [[
    "1:","",1370179816450000,"e",300,1370180116], [
    "1:album","Tres Hombres",1370179816450000,"e",300,1370180116], [
    "1:artist","ZZ Top",1370179816450000,"e",300,1370180116], [
    "1:song_id","a3e64f8f-bd44-4f28-b8d9-6938726e34d4",1370179816450000,"e",300,1370180116], [
    "1:title","La Grange",1370179816450000,"e",300,1370180116]]}
    ]
  8. After the TTL elapses, flush the data to disk again.
  9. Run the sstable2json command again.
    sudo ./sstable2json \
      /var/lib/cassandra/data/music/playlists/music-playlists-ib-2-Data.db

    The JSON representation of the column data shows that the data values for the ZZ Top row have been removed from the SSTable. Each column is now a tombstone, marked with "d". The value 51ab4a14 is hexadecimal for 1370180116, the expiration time shown in the previous output:

    [
    {"key":"62c3609282a13a0093d146196ee77204","columns": [[
    "1:","51ab4a14",1370179816450000,"d"], [
    "1:album","51ab4a14",1370179816450000,"d"], [
    "1:artist","51ab4a14",1370179816450000,"d"], [
    "1:song_id","51ab4a14",1370179816450000,"d"], [
    "1:title","51ab4a14",1370179816450000,"d"]]}
    ]
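
The three outputs in this procedure differ only in the trailing fields of each column entry, so the life cycle can be read programmatically. The following Python sketch is one possible way to classify each column as live, expiring, or deleted. It assumes the array layout shown in the outputs above: name, value, write timestamp in microseconds, then an optional marker. The function names classify_column and summarize are hypothetical, and reading the "d" value as a hexadecimal deletion time in seconds is an inference from the sample output (0x51ab4a14 equals 1370180116), not something sstable2json documents here.

```python
import json

# Abbreviated copies of the step 7 and step 9 outputs (one column each).
SAMPLE_EXPIRING = '''[
{"key": "62c3609282a13a0093d146196ee77204","columns": [[
"1:title","La Grange",1370179816450000,"e",300,1370180116]]}
]'''

SAMPLE_DELETED = '''[
{"key":"62c3609282a13a0093d146196ee77204","columns": [[
"1:title","51ab4a14",1370179816450000,"d"]]}
]'''


def classify_column(column):
    """Describe one column entry from sstable2json output.

    Assumed layout: [name, value, timestamp_us] for live data,
    plus ["e", ttl, expires_at] for expiring data or ["d"] for a tombstone.
    """
    name, value, timestamp_us = column[0], column[1], column[2]
    marker = column[3] if len(column) > 3 else None
    info = {"name": name, "written_at_us": timestamp_us}
    if marker is None:
        info["state"] = "live"
        info["value"] = value
    elif marker == "e":
        info["state"] = "expiring"
        info["value"] = value
        info["ttl_seconds"] = column[4]
        info["expires_at"] = column[5]  # seconds since the epoch
    elif marker == "d":
        info["state"] = "deleted"
        # Assumption: the value field holds the deletion time as hex seconds.
        info["deleted_at"] = int(value, 16)
    else:
        info["state"] = "unknown"
    return info


def summarize(sstable_json):
    """Map each row key to a list of classified columns."""
    rows = json.loads(sstable_json)
    return {row["key"]: [classify_column(c) for c in row["columns"]]
            for row in rows}


if __name__ == "__main__":
    for sample in (SAMPLE_EXPIRING, SAMPLE_DELETED):
        for key, columns in summarize(sample).items():
            for col in columns:
                print(key, col["name"], col["state"])
```

In the expiring sample, the expiration time equals the write timestamp converted to seconds plus the TTL (1370179816 + 300 = 1370180116), which is a quick check that a flushed SSTable carries the TTL you intended.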