Managing invalid messages

DSE Advanced Replication provides strategies for managing invalid messages when replication fails.

During message replication, DSE Advanced Replication attempts to manipulate the message to ensure successful replication. In some cases, replication might occur with only a subset of the data.

In other cases, replication fails when there are too many differences between the schema on the source cluster and the schema on the destination cluster. For example, schema incompatibilities occur when a column in the destination has a different type than the same column in the source, or a table in the source doesn’t contain all the columns that form the primary key of the same table in the destination.

If a message cannot be replicated, a second transmission is attempted. If replication still fails after the second try, the message is discarded and removed from the replication log. The replication log on the source cluster stores data in preparation for transmission to the destination cluster.
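The two-attempt rule can be pictured with a small shell sketch. This is an illustration only, not how DSE implements transmission: the function name replicate_with_retry is hypothetical, and the command passed to it stands in for a single transmission attempt.

```shell
# Hypothetical sketch of "try twice, then discard".
# "$@" is any command that performs one transmission attempt (an assumption).
replicate_with_retry() {
    attempt=1
    while [ "$attempt" -le 2 ]; do
        if "$@"; then
            echo "replicated on attempt $attempt"
            return 0
        fi
        attempt=$((attempt + 1))
    done
    # Both attempts failed: at this point the message would be discarded
    # and removed from the replication log.
    echo "discarded after 2 attempts"
    return 1
}
```

For example, replicate_with_retry false exercises the discard path, because the stand-in command false fails on both attempts.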

When a message is discarded, the CQL query string and the related error message are logged on the destination cluster. To define where to store the CQL strings and the error messages that are relevant to the failed message replication, use one of the following logging strategies:
  • SYSTEM_LOG: Log the CQL query and the error message in the system log on the destination.
  • CHANNEL_LOG: Store the CQL query and the error message in files in /var/lib/cassandra/advrep/invalid_queries on the destination. This is the default value.
  • NONE: Perform no logging.
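As a rough sketch, a wrapper can check a strategy name against the three values listed above before building the configuration command. The helper name advrep_log_command is hypothetical, and it only prints the command rather than running it. Passing NONE to --invalid_message_log is assumed to work the same way as the other two values.

```shell
# Hypothetical helper: validate a logging strategy, then print the
# matching "dse advrep conf update" command without executing it.
advrep_log_command() {
    case "$1" in
        SYSTEM_LOG|CHANNEL_LOG|NONE)
            echo "dse advrep conf update --invalid_message_log $1"
            ;;
        *)
            # Anything else is not one of the documented strategies.
            echo "unknown strategy: $1" >&2
            return 1
            ;;
    esac
}
```

Printing the command first makes it easy to review before running it on a node.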
For the channel logging strategy, a file is created in the channel log directory on the source node, following the pattern /var/lib/cassandra/advrep/invalid_queries/<keyspace>/<table>/<destination>/invalid_queries.log where keyspace, table and destination are:
  • keyspace: keyspace name of the invalid query
  • table: table name of the invalid query
  • destination: destination cluster of the channel
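To make the path pattern concrete, the following sketch assembles the log file path from its three components. The function name invalid_queries_log_path and the example values (demo, sensor_readings, mydest) are hypothetical.

```shell
# Build the channel log path for one keyspace/table/destination triple.
# $1 = keyspace, $2 = table, $3 = destination (all supplied by the caller)
invalid_queries_log_path() {
    printf '%s/%s/%s/%s/invalid_queries.log\n' \
        "/var/lib/cassandra/advrep/invalid_queries" "$1" "$2" "$3"
}

# Example with made-up names:
invalid_queries_log_path demo sensor_readings mydest
```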
The log file stores the following data that is relevant to the failed message replication:
  • time_bucket: an hourly time bucket to prevent the database partition from getting too wide
  • id: a time-based ID (timeuuid)
  • cql_string: the CQL query string, which explicitly specifies the original timestamp by including the USING TIMESTAMP option
  • error_msg: the error message

Invalid messages are inserted by time in the log table.

Procedure

Manage invalid messages using channel logging:
  1. To store the CQL query string and error message using a channel log, instead of the system log, specify the invalid_message_log configuration key as CHANNEL_LOG. Because CHANNEL_LOG is the default value, this step is needed only if the strategy was previously changed:
    dse advrep conf update --invalid_message_log CHANNEL_LOG
Manage invalid messages using system logging:
  1. To store the CQL query string and error message using a system log, instead of the default channel log location, specify the invalid_message_log configuration key as SYSTEM_LOG:
    dse advrep conf update --invalid_message_log SYSTEM_LOG
  2. To identify the problem, examine the error messages, the CQL query strings, and the schemas of the data on the source and the destination.
  3. Take appropriate actions to resolve the incompatibility issues.
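When channel logging is in use, a first step in examining failures is to find which keyspace, table, and destination combinations have a log file at all. The sketch below is one possible way to do that with standard tools. The function name list_invalid_query_logs is hypothetical, and the default directory is the channel log location described earlier.

```shell
# List every invalid_queries.log under the channel log directory.
# $1 = base directory (optional; defaults to the documented location)
list_invalid_query_logs() {
    base="${1:-/var/lib/cassandra/advrep/invalid_queries}"
    # Nothing to list if the directory does not exist on this node.
    [ -d "$base" ] || return 0
    find "$base" -type f -name invalid_queries.log | sort
}
```

Each path in the output identifies the keyspace, table, and destination of a channel that has logged at least one invalid message.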