Traffic between the clusters
Traffic between the source cluster and the destination cluster is managed with permits, priority, and configurable failover behavior for multi-datacenter operation.
Permits
Traffic between the source cluster and the destination cluster is managed with permits. When a permit cannot be acquired, the message is postponed and remains in the replication log until a permit becomes available. Permits are global, not per destination.
To manage permits and set the maximum number of messages that can be replicated to all destinations simultaneously, use dse advrep conf update:
dse advrep conf update --permits 1000
The default is 1024.
A channel with a higher priority takes precedence in acquiring permits. Permits are required to transmit data from a source to a destination.
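The permit behavior described above can be pictured with a small Python sketch. It is an illustration only, not DSE code: the PermitPool class and transmit function are hypothetical names, and the sketch assumes a single global pool from which each message takes one permit while in flight.

```python
import threading
from collections import deque


class PermitPool:
    """Hypothetical global permit pool shared by all destinations."""

    def __init__(self, permits):
        self._sem = threading.BoundedSemaphore(permits)

    def try_acquire(self):
        # Non-blocking: returns False when no permit is free.
        return self._sem.acquire(blocking=False)

    def release(self):
        # Called when a message has finished replicating.
        self._sem.release()


def transmit(messages, pool):
    """Split messages into those sent now and those postponed."""
    sent, postponed = [], deque()
    for msg in messages:
        if pool.try_acquire():
            sent.append(msg)        # permit is held while the message is in flight
        else:
            postponed.append(msg)   # stays in the replication log for a later attempt
    return sent, postponed


pool = PermitPool(permits=2)
sent, postponed = transmit(["m1", "m2", "m3"], pool)
print(sent, list(postponed))
```

With two permits and three messages, the third message is postponed. Releasing one permit and calling transmit again on the postponed messages lets it through.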
Priority and FIFO/LIFO enablement
The commit log is flushed from memory to disk, writing the data to the appropriate table.
A Change Data Capture (CDC) collection agent additionally filters the data written and creates replication log files on disk.
Each channel source table has a separate data directory created on disk into which data is appended each time the commit log is flushed, storing all the messages that are to be replicated to a destination.
Several replication log files may exist per source table at any given time.
Each file stores a contiguous time slice, configurable with the dse advrep conf update command and the --collection-time-slice-width option (default: 60 seconds).
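As a rough illustration of what a contiguous time slice means, the following Python sketch groups message timestamps into fixed-width slices. The function names are hypothetical. Aligning slice boundaries to multiples of the width is an assumption made for the example, not a documented detail of the collection agent.

```python
def slice_start(timestamp_s, width_s=60):
    """Return the start of the time slice that contains timestamp_s."""
    # Assumption: slices are aligned to multiples of the slice width.
    return timestamp_s - (timestamp_s % width_s)


def group_by_slice(timestamps, width_s=60):
    """Map each slice start to the timestamps that fall inside that slice."""
    slices = {}
    for ts in timestamps:
        slices.setdefault(slice_start(ts, width_s), []).append(ts)
    return slices


print(group_by_slice([0, 59, 60, 125]))
# → {0: [0, 59], 60: [60], 120: [125]}
```

A narrower slice width produces more, smaller replication log files per source table. A wider one produces fewer, larger files.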
A CDC transmission agent then sends the messages stored in the replication log files to the destination, where the data is processed and written to the appropriate database table.
The order in which source table data is transmitted can be altered with the --priority option when creating a channel, and the order in which a source table's replication log files are read can be tuned with the --fifo-enabled and --lifo-enabled options.
The replication log files are processed according to the time and priority of the replication channel. Replication channel priorities are set per table and determine how the transmission agent orders the transmission of replication log files from the source to the destination. The replication log files can be passed to the destination in either last in, first out (LIFO) or first in, first out (FIFO) order; FIFO is the default. To read the newest messages first, use LIFO; to read the oldest messages first, use FIFO. Once an individual replication log file is transmitted, the messages it contains are read FIFO. Both options, priority and read order, can be set during channel creation:
dse advrep --host 192.168.3.10 channel create --source-keyspace foo --source-table bar --source-id source1 --source-id-column source_id --destination mydest --destination-keyspace foo --destination-table bar --collection-enabled true --priority 1 --lifo-enabled
This example sets the channel for table foo.bar to the top priority of one, so that the table's replication log files are transmitted before other tables' replication log files. It also sets the replication log files to be read from newest to oldest.
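The combined effect of priority and read order can be sketched in Python. This is a simplified model under stated assumptions, not the transmission agent's actual scheduler: LogFile and transmission_order are hypothetical names, priority 1 is treated as the highest (as in the example above), and files are ordered first by channel priority and then by time slice within each table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogFile:
    table: str
    priority: int      # 1 is the top priority, matching the example above
    slice_start: int   # start of the file's time slice, in seconds
    lifo: bool = False # False means FIFO, the default read order


def transmission_order(files):
    """Order files by priority, then by time slice (newest first for LIFO)."""
    def key(f):
        time_key = -f.slice_start if f.lifo else f.slice_start
        return (f.priority, f.table, time_key)
    return sorted(files, key=key)


files = [
    LogFile("foo.baz", priority=2, slice_start=0),
    LogFile("foo.baz", priority=2, slice_start=60),
    LogFile("foo.bar", priority=1, slice_start=0, lifo=True),
    LogFile("foo.bar", priority=1, slice_start=60, lifo=True),
]
for f in transmission_order(files):
    print(f.table, f.slice_start)
```

In this model the two foo.bar files go first, newest slice before oldest, followed by the foo.baz files in oldest-first order.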
Configure automatic failover for hub clusters with multiple datacenters
DSE Advanced Replication uses the DSE Java driver load balancing policy to communicate with the hub cluster.
You can explicitly define the local datacenter for the datacenter-aware round robin policy (DCAwareRoundRobinPolicy) that is used by the DSE Java driver.
You can enable or disable failover from a local datacenter to a remote datacenter. When multiple datacenter failover is configured and a local datacenter fails, data replication from the edge to the hub continues using the remote datacenter. Tune the configuration with these parameters:
driver-local-dc - For destination clusters with multiple datacenters, you can explicitly define the name of the datacenter that you consider local. Typically, this is the datacenter that is closest to the source cluster. This value is used only for clusters with multiple datacenters.
driver-used-hosts-per-remote-dc - To use automatic failover for destination clusters with multiple datacenters, you must define the number of hosts per remote datacenter that the datacenter-aware round robin policy (DCAwareRoundRobinPolicy) considers available.
driver-allow-remote-dcs-for-local-cl - Set to true to enable automatic failover for destination clusters with multiple datacenters. The value of the driver-consistency-level parameter must be LOCAL_ONE or LOCAL_QUORUM.
To enable automatic failover with a consistency level of LOCAL_QUORUM, use dse advrep destination update:
dse advrep destination update --name mydest --driver-allow-remote-dcs-for-local-cl true --driver-consistency-level LOCAL_QUORUM
Destination mydest updated
Updated driver_allow_remote_dcs_for_local_cl from null to true
Updated driver_consistency_level from ONE to LOCAL_QUORUM
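How the three failover parameters interact can be approximated with a short Python sketch. It is a simplified model of a datacenter-aware host selection, not the driver's implementation: the candidate_hosts function, the datacenter names, and the host addresses are hypothetical, and round-robin rotation and host health checks are left out.

```python
LOCAL_LEVELS = {"LOCAL_ONE", "LOCAL_QUORUM"}


def candidate_hosts(hosts_by_dc, local_dc, used_hosts_per_remote_dc,
                    allow_remote_dcs_for_local_cl, consistency_level):
    """Return hosts to try, local datacenter first, then remote datacenters."""
    plan = list(hosts_by_dc.get(local_dc, []))
    # Assumption: with a local consistency level, remote hosts are only
    # considered when failover to remote datacenters is explicitly allowed.
    if consistency_level in LOCAL_LEVELS and not allow_remote_dcs_for_local_cl:
        return plan
    for dc, hosts in hosts_by_dc.items():
        if dc != local_dc:
            # Only a limited number of hosts per remote datacenter is used.
            plan.extend(hosts[:used_hosts_per_remote_dc])
    return plan


hosts = {
    "dc1": ["10.0.0.1", "10.0.0.2"],
    "dc2": ["10.0.1.1", "10.0.1.2", "10.0.1.3"],
}
print(candidate_hosts(hosts, "dc1", 2, True, "LOCAL_QUORUM"))
print(candidate_hosts(hosts, "dc1", 2, False, "LOCAL_QUORUM"))
```

With failover allowed, the plan contains both dc1 hosts followed by two dc2 hosts. With failover disallowed, only the dc1 hosts remain, so replication has nowhere to go if dc1 fails.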