Change Data Capture (CDC) logging

Change Data Capture (CDC) logging captures changes to data.

CDC logging is enabled per table, and a node-wide limit set in cassandra.yaml caps the disk space that the CDC logs can consume. CDC logs use the same binary format as the commit log.

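When a memtable is flushed to disk, CommitLogSegments that contain data for CDC-enabled tables are moved to the configured cdc_raw directory. Once the files in that directory reach the disk space limit (cdc_total_space_in_mb), writes to CDC-enabled tables are rejected until space is freed, typically by a consumer that processes the log segments and then removes them.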
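
The space limit described above can be monitored from outside the database. The following is a minimal sketch, not the check the node itself performs: it sums the sizes of the files in the CDC directory and compares the total to a limit. The function names are hypothetical, and treating the limit as mebibytes (1024 * 1024 bytes) is an assumption.

```python
import os


def cdc_space_used_bytes(cdc_raw_dir):
    """Sum the sizes of the regular files directly inside the CDC directory."""
    total = 0
    with os.scandir(cdc_raw_dir) as entries:
        for entry in entries:
            if entry.is_file(follow_symlinks=False):
                total += entry.stat(follow_symlinks=False).st_size
    return total


def cdc_limit_reached(cdc_raw_dir, limit_mb):
    """Return True when the directory holds at least limit_mb mebibytes.

    Assumption: 1 MB is taken as 1024 * 1024 bytes. This only approximates
    the accounting that the database does internally.
    """
    return cdc_space_used_bytes(cdc_raw_dir) >= limit_mb * 1024 * 1024
```

A monitoring job could call cdc_limit_reached("/var/lib/cassandra/cdc_raw", 4096) on a schedule and raise an alert before writes start to be rejected.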

CDC directory location

The location of the CDC directory depends on the type of installation:
  Package installations: /var/lib/cassandra/cdc_raw
  Tarball installations: /var/lib/cassandra/cdc_raw

cassandra.yaml

The location of the cassandra.yaml file depends on the type of installation:
  Package installations: /etc/dse/cassandra/cassandra.yaml
  Tarball installations: installation_location/resources/cassandra/conf/cassandra.yaml

Prerequisites

Before enabling CDC logging, define a plan for moving and consuming the CDC log information. DataStax recommends a physical device for the CDC log that is separate from the data directories.
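
One possible shape for such a plan is a scheduled job that moves log segments out of the CDC directory so that the space limit is not reached. The sketch below is illustrative only: the function name and the archive directory are hypothetical, and a real consumer would also need to parse each segment and make sure it is complete before moving or deleting it.

```python
import os
import shutil


def archive_cdc_segments(cdc_raw_dir, archive_dir):
    """Move every regular file from cdc_raw_dir into archive_dir.

    Returns the sorted list of file names that were moved.
    Assumption: archive_dir is a hypothetical staging location where a
    downstream consumer reads the segments.
    """
    os.makedirs(archive_dir, exist_ok=True)
    moved = []
    for name in sorted(os.listdir(cdc_raw_dir)):
        src = os.path.join(cdc_raw_dir, name)
        if os.path.isfile(src):  # skip subdirectories
            shutil.move(src, os.path.join(archive_dir, name))
            moved.append(name)
    return moved
```

Moving the files rather than copying them frees space in the CDC directory immediately, which is what allows rejected writes to resume.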

Procedure

  1. Enable CDC logging and configure CDC directories and space in cassandra.yaml.
    For example, to enable CDC logging with default values:
    cdc_enabled: true
    cdc_total_space_in_mb: 4096
    cdc_free_space_check_interval_ms: 250
    cdc_raw_directory: /var/lib/cassandra/cdc_raw
  2. Optional: To enable CDC logging for a database table, create or alter the table with the cdc table property set to true.
    For example, to enable CDC logging on the cycling table:
    ALTER TABLE cycling WITH cdc=true;
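
After editing cassandra.yaml in step 1, it can be useful to confirm which CDC settings the file actually contains. The sketch below is a simple line scanner, not a YAML parser: it collects uncommented lines of the form cdc_*: value and returns the values as strings. The function name is hypothetical.

```python
def read_cdc_settings(yaml_text):
    """Collect 'cdc_*: value' pairs from the text of a cassandra.yaml file.

    Commented-out settings are ignored. Values are returned as strings;
    this is a plain line scan and does not interpret YAML structure.
    """
    settings = {}
    for line in yaml_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and padding
        if not line.startswith("cdc_") or ":" not in line:
            continue
        key, value = line.split(":", 1)
        settings[key.strip()] = value.strip()
    return settings


sample = """
# cdc_enabled: false
cdc_enabled: true
cdc_total_space_in_mb: 4096
cdc_free_space_check_interval_ms: 250
cdc_raw_directory: /var/lib/cassandra/cdc_raw
"""
settings = read_cdc_settings(sample)
```

With the sample text above, settings["cdc_enabled"] is the string "true" and settings["cdc_raw_directory"] is "/var/lib/cassandra/cdc_raw".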