Recovering data expired by the TTL year 2038 problem

Recover data that was removed because its TTL expiration timestamp was later than the maximum supported expiration date.

Before DataStax Enterprise (DSE) 5.1.7 in the 5.1.x series and 5.0.12 in the 5.0.x series, nothing prevented an INSERT whose TTL expiration timestamp fell after the maximum date that the storage engine can represent (2038-01-19T03:14:06+00:00). In those earlier versions, inserting data with a later expiration timestamp overflowed the date calculation, which caused the data to expire immediately. Records expired by overflow are not queryable and are permanently removed after a compaction. The issue affects only INSERTs with a long TTL value close to the maximum of 630720000 seconds (20 years). The earliest possible date for an expiration timestamp overflow is 2018-01-19T03:14:06+00:00. As 2038-01-19T03:14:06+00:00 approaches, the maximum supported TTL gradually shrinks.
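The arithmetic behind the overflow can be sketched in a few lines of shell. This is a minimal illustration, not DSE code: it assumes the expiration timestamp is simply the insert time plus the TTL, and that the maximum date 2038-01-19T03:14:06+00:00 corresponds to epoch second 2147483646. The function names max_safe_ttl and ttl_overflows are hypothetical helpers introduced only for this example.

```shell
#!/bin/sh
# Assumption: 2038-01-19T03:14:06+00:00 expressed as seconds since the Unix epoch.
MAX_EXPIRATION=2147483646
# Maximum TTL value: 630720000 seconds (20 years).
MAX_TTL=630720000

# max_safe_ttl NOW
# Prints the largest TTL (in seconds) that does not push the expiration
# timestamp past MAX_EXPIRATION when data is inserted at epoch second NOW.
max_safe_ttl() {
    remaining=$(( MAX_EXPIRATION - $1 ))
    if [ "$remaining" -gt "$MAX_TTL" ]; then remaining=$MAX_TTL; fi
    if [ "$remaining" -lt 0 ]; then remaining=0; fi
    echo "$remaining"
}

# ttl_overflows NOW TTL
# Succeeds (exit status 0) when NOW + TTL falls after MAX_EXPIRATION.
ttl_overflows() {
    [ $(( $1 + $2 )) -gt "$MAX_EXPIRATION" ]
}
```

For example, max_safe_ttl "$(date +%s)" prints the longest TTL that would still fit at the current time under these assumptions, and ttl_overflows "$(date +%s)" 630720000 reports whether a 20-year TTL inserted now would overflow.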

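To recover data with overflowed timestamps from SSTables that were backed up or that did not go through compaction, use the --reinsert-overflowed-ttl option. This option rewrites the affected rows with the maximum supported expiration date and a slightly newer write timestamp, because tombstones might have been generated with the original timestamp.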

Tip: To find out if an SSTable has an entry with overflowed expiration, inspect it with the sstablemetadata tool and look for a negative min local deletion time field. Back up SSTables in this condition immediately, as they are subject to data loss during compaction.
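The check described in the tip can be scripted roughly as follows. This sketch assumes that sstablemetadata prints a line containing the text "min local deletion time" followed by a colon and an integer; the exact label can vary between versions, so adjust the pattern to match your output. The helper names has_overflowed_ttl and scan_table_dir are hypothetical.

```shell
#!/bin/sh
# has_overflowed_ttl
# Reads sstablemetadata output on stdin and succeeds (exit status 0) when the
# "min local deletion time" line carries a negative value.
# Assumption: the line has the form "<label> min local deletion time: <integer>".
has_overflowed_ttl() {
    grep -i 'min local deletion time' | grep -Eq ':[[:space:]]*-[0-9]+'
}

# scan_table_dir DIR
# Prints every SSTable data file in DIR whose metadata shows a negative
# min local deletion time, so that it can be backed up before compaction.
scan_table_dir() {
    for f in "$1"/*-Data.db; do
        [ -e "$f" ] || continue
        if sstablemetadata "$f" | has_overflowed_ttl; then
            echo "$f"
        fi
    done
}
```

Calling scan_table_dir with the data directory of one table lists the files to back up; the directory path depends on your installation.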

cassandra.yaml

The location of the cassandra.yaml file depends on the type of installation:
Package installations: /etc/dse/cassandra/cassandra.yaml
Tarball installations: installation_location/resources/cassandra/conf/cassandra.yaml

Use only one of the following options.

Offline scrub

For offline scrub, use the sstablescrub command with the --reinsert-overflowed-ttl parameter to recover data from a backed up SSTable.

DSE version             Link
DSE 6.7                 sstablescrub --reinsert-overflowed-ttl
DSE 6.0                 sstablescrub --reinsert-overflowed-ttl
DSE 5.1.7 and later     sstablescrub --reinsert-overflowed-ttl
DSE 5.0.12 and later    sstablescrub --reinsert-overflowed-ttl
DSE 4.8.16              sstablescrub --reinsert-overflowed-ttl

Online scrub

For online scrub, use the nodetool scrub command with the --reinsert-overflowed-ttl parameter to recover data from a table that has not gone through compaction.

DSE version             Link
DSE 6.7                 nodetool scrub --reinsert-overflowed-ttl
DSE 6.0                 nodetool scrub --reinsert-overflowed-ttl
DSE 5.1.7 and later     nodetool scrub --reinsert-overflowed-ttl
DSE 5.0.12 and later    nodetool scrub --reinsert-overflowed-ttl
DSE 4.8.16              nodetool scrub --reinsert-overflowed-ttl

Procedure

  • Use the offline scrub option to recover data from a backed up SSTable:
    1. Clone the data directory tree to another location. Keep only the folders and the contents of the system tables.
    2. Clone the configuration directory to another location. Set the data_file_directories property to the cloned data directory in the cloned cassandra.yaml.
    3. Copy the affected SSTables to the cloned data location of the affected table.
    4. Set the environment variable CASSANDRA_CONF=cloned_configuration_directory.
    5. Run the following command to scrub the table:
      sstablescrub --reinsert-overflowed-ttl keyspace_name table_name
    6. After the scrub is completed, copy the resulting SSTables to the original data directory.
    7. Use nodetool refresh on a live node to load the recovered SSTables.
  • Use the online scrub option to recover data from a table that has not gone through compaction:
    1. Disable compaction on the node.
      nodetool disableautocompaction
      Warning: This step is crucial. The data might be removed permanently during compaction.
    2. Copy the SSTables containing entries with overflowed expiration time to the data directory.
    3. Load the SSTables.
      nodetool refresh
    4. Run the scrub command with the --reinsert-overflowed-ttl option on the affected table.
      nodetool scrub --reinsert-overflowed-ttl keyspace_name table_name
    5. For indexed tables, use the dsetool reload_core command on a search node to load and reindex the reinserted values:
      dsetool reload_core reindex=true keyspace_name.table_name
    6. After verifying that the scrub recovered the missing entries, re-enable compaction (for example, with nodetool enableautocompaction).
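The command sequences in the procedure can be wrapped in small shell functions, as in the sketch below. It is only an outline under stated assumptions: copying SSTables, reindexing DSE Search cores, and re-enabling compaction are left as manual steps, the keyspace and table arguments to nodetool refresh are added here, and the names run, offline_scrub, prepare_node, and reinsert_overflowed are hypothetical. Setting DRY_RUN=1 prints each command instead of executing it, which is useful for reviewing the sequence first.

```shell
#!/bin/sh
# run CMD [ARGS...]
# Prints the command when DRY_RUN=1; otherwise executes it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# offline_scrub CLONED_CONF_DIR KEYSPACE TABLE
# Offline steps 4 and 5: point CASSANDRA_CONF at the cloned configuration
# directory and scrub the copied SSTables. The body runs in a subshell so the
# exported variable does not leak into the calling shell.
offline_scrub() (
    CASSANDRA_CONF=$1
    export CASSANDRA_CONF
    run sstablescrub --reinsert-overflowed-ttl "$2" "$3"
)

# prepare_node
# Online step 1: disable automatic compaction before any SSTable is copied in.
prepare_node() {
    run nodetool disableautocompaction
}

# reinsert_overflowed KEYSPACE TABLE
# Online steps 3 and 4: load the copied SSTables, then scrub with the
# reinsert option. Stops at the first failing command.
reinsert_overflowed() {
    run nodetool refresh "$1" "$2" || return 1
    run nodetool scrub --reinsert-overflowed-ttl "$1" "$2" || return 1
    echo "Verify the recovered rows before re-enabling compaction." >&2
}
```

A possible online session calls prepare_node, copies the affected SSTables into the data directory of the table by hand, and then calls reinsert_overflowed keyspace_name table_name.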