Managing invalid messages
During message replication, DSE Advanced Replication attempts to manipulate the message to ensure successful replication. In some cases, replication might occur with only a subset of the data.
In other cases, replication fails when there are too many differences between the schema on the source cluster and the schema on the destination cluster. For example, schema incompatibilities occur when a column in the destination has a different type than the same column in the source, or a table in the source does not contain all the columns that form the primary key of the same table in the destination.
If a message cannot be replicated, a second transmission is tried. If replication still fails after the second try, the message is discarded and removed from the replication log. The replication log on the source cluster stores data in preparation for transmission to the destination cluster.
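The retry-then-discard behavior described above can be sketched as follows. This is an illustrative model only, not DSE's implementation: the `send` callable, the list-based `replication_log`, and the `invalid_log` collector are all hypothetical stand-ins introduced for the example.

```python
from typing import Callable, List, Tuple

# One initial transmission plus one retry, as described in the text.
MAX_ATTEMPTS = 2


def replicate(message: str,
              send: Callable[[str], None],
              replication_log: List[str],
              invalid_log: List[Tuple[str, str]]) -> bool:
    """Try to send `message` up to MAX_ATTEMPTS times.

    Returns True if a transmission succeeded. If every attempt fails, the
    message is removed from `replication_log`, recorded in `invalid_log`
    together with the last error, and False is returned.
    All parameters are hypothetical stand-ins, not DSE interfaces.
    """
    last_error = ""
    for _ in range(MAX_ATTEMPTS):
        try:
            send(message)
            return True  # bookkeeping after a success is out of scope here
        except Exception as exc:
            last_error = str(exc)

    # Both attempts failed: discard the message and keep the evidence.
    replication_log.remove(message)
    invalid_log.append((message, last_error))
    return False
```

Recording the query and its error at the moment of discard is what makes the later diagnosis possible, because the message itself no longer exists in the replication log.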
When a message is discarded, the CQL query string and the related error message are logged on the destination cluster. To define where to store the CQL strings and the error messages that are relevant to the failed message replication, use one of the following logging strategies:
- SYSTEM_LOG: Log the CQL query and the error message in the system log on the destination.
- CHANNEL_LOG: Store the CQL query and the error message in files in /var/lib/cassandra/advrep/invalid_queries on the destination. This is the default value.
- NONE: Perform no logging.
For the channel logging strategy, a file is created in the channel log directory on the source node, following the pattern /var/lib/cassandra/advrep/invalid_queries/<keyspace>/<table>/<destination>/invalid_queries.log, where keyspace, table, and destination are:
- keyspace: keyspace name of the invalid query
- table: table name of the invalid query
- destination: destination cluster of the channel
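To locate the log file for a particular channel, the path pattern above can be assembled programmatically. The following is a minimal sketch; the function name and the sample values `demo`, `sensors`, and `cluster_b` are hypothetical and chosen only for illustration.

```python
from pathlib import PurePosixPath

# Root directory taken from the path pattern in the text.
INVALID_QUERIES_ROOT = PurePosixPath("/var/lib/cassandra/advrep/invalid_queries")


def invalid_queries_log(keyspace: str, table: str, destination: str) -> PurePosixPath:
    """Build <root>/<keyspace>/<table>/<destination>/invalid_queries.log."""
    for part in (keyspace, table, destination):
        # Reject empty names and embedded separators so the result
        # always stays below the root directory.
        if not part or "/" in part:
            raise ValueError(f"invalid path component: {part!r}")
    return INVALID_QUERIES_ROOT / keyspace / table / destination / "invalid_queries.log"


print(invalid_queries_log("demo", "sensors", "cluster_b"))
# /var/lib/cassandra/advrep/invalid_queries/demo/sensors/cluster_b/invalid_queries.log
```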
The log file stores the following data that is relevant to the failed message replication:
- time_bucket: an hourly time bucket that prevents the database partition from getting too wide
- id: a time-based ID (timeuuid)
- cql_string: the CQL query string, which explicitly specifies the original timestamp by including the USING TIMESTAMP option
- error_msg: the error message
Invalid messages are inserted by time in the log table.
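Because the id field is a timeuuid (a version 1 UUID), the time at which an entry was created can be recovered from the id itself. The sketch below shows that conversion and one way to derive an hourly bucket from it. The bucketing function is an assumption for illustration; the text above does not specify how time_bucket values are actually encoded.

```python
import uuid
from datetime import datetime, timedelta, timezone

# Number of 100-nanosecond intervals between the UUID epoch (1582-10-15)
# and the Unix epoch (1970-01-01).
UUID_EPOCH_OFFSET = 0x01B21DD213814000


def timeuuid_to_datetime(value: uuid.UUID) -> datetime:
    """Return the UTC creation time embedded in a version 1 UUID."""
    if value.version != 1:
        raise ValueError("not a time-based (version 1) UUID")
    # UUID.time counts 100 ns ticks; integer-divide by 10 for microseconds.
    microseconds = (value.time - UUID_EPOCH_OFFSET) // 10
    return datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(microseconds=microseconds)


def hourly_bucket(moment: datetime) -> datetime:
    """Truncate a timestamp to the start of its hour (assumed bucketing)."""
    return moment.replace(minute=0, second=0, microsecond=0)


entry_id = uuid.uuid1()  # stand-in for an id read from the log
created = timeuuid_to_datetime(entry_id)
print(created.isoformat(), hourly_bucket(created).isoformat())
```

Grouping entries by hour keeps each group bounded in size, which is the purpose the text gives for time_bucket.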
Procedure
- Manage invalid messages using channel logging
  - To store the CQL query string and error message using a channel log, which is the default strategy, specify the invalid_message_log configuration key as CHANNEL_LOG:
      dse advrep conf update --invalid_message_log CHANNEL_LOG
- Manage invalid messages using system logging
  - To store the CQL query string and error message using a system log, instead of the default channel log location, specify the invalid_message_log configuration key as SYSTEM_LOG:
      dse advrep conf update --invalid_message_log SYSTEM_LOG
  - To identify the problem, examine the error messages, the CQL query strings, and the schemas of the data on the source and the destination.
  - Take appropriate actions to resolve the incompatibility issues.
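When the logging strategy is set from a script, validating the value before invoking the command avoids passing a misspelled strategy. The following is a minimal sketch: the helper name is hypothetical, and it assumes the command accepts NONE spelled the same way as the strategy listed earlier.

```python
import shlex
from typing import List

# The three logging strategies named in this topic.
VALID_STRATEGIES = ("SYSTEM_LOG", "CHANNEL_LOG", "NONE")


def invalid_message_log_command(strategy: str) -> List[str]:
    """Return the argument list that sets invalid_message_log."""
    if strategy not in VALID_STRATEGIES:
        raise ValueError(
            f"unknown strategy {strategy!r}; expected one of {VALID_STRATEGIES}"
        )
    return ["dse", "advrep", "conf", "update", "--invalid_message_log", strategy]


print(shlex.join(invalid_message_log_command("SYSTEM_LOG")))
# dse advrep conf update --invalid_message_log SYSTEM_LOG
```

The returned list can be passed to `subprocess.run(..., check=True)` on a node where the `dse` command is available.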