Add a datacenter to a cluster using a designated datacenter as a data source
This procedure shows you how to add a new datacenter to an existing cluster using a designated datacenter as a data source. In this example, you add datacenter DC4 to a cluster with existing datacenters DC1, DC2, and DC3.
Prerequisites
- An existing cluster with properly configured datacenters
- The same version of HCD available for installation
- Network connectivity between all datacenters
Datacenter naming requirements
This procedure requires an existing datacenter. When naming your new datacenter, keep in mind that datacenter names are case-sensitive and must exactly match the names reported by the snitch configuration.
Prepare existing datacenters
Make sure all keyspaces use the NetworkTopologyStrategy replication strategy:
- Change replication strategy for application keyspaces:
  ALTER KEYSPACE keyspace_name WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'DC1' : 3 };
- Update these system keyspaces (a cqlsh sketch follows this list):
  - system_auth: Stores authentication and authorization data.
  - system_distributed: Stores repair history.
  - system_traces: Stores trace information when CQL tracing is enabled.
  Do not modify the replication strategy for other system keyspaces.
- Verify the replication strategy with DESCRIBE SCHEMA:
  DESCRIBE SCHEMA;
  Result:
  CREATE KEYSPACE hcd_perf WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': '3'} AND durable_writes = true;
  ...
  CREATE KEYSPACE hcd_leases WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': '3'} AND durable_writes = true;
  ...
  CREATE KEYSPACE HCDFS WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': '3'} AND durable_writes = true;
  ...
  CREATE KEYSPACE hcd_security WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': '3'} AND durable_writes = true;
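For example, a minimal cqlsh sketch for updating the system keyspaces listed above, assuming cqlsh can reach a node in the cluster and a replication factor of 3 in DC1; adjust the factors and datacenter names to match your topology.
  # Update replication for the three system keyspaces named above
  cqlsh -e "ALTER KEYSPACE system_auth WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'DC1' : 3 };"
  cqlsh -e "ALTER KEYSPACE system_distributed WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'DC1' : 3 };"
  cqlsh -e "ALTER KEYSPACE system_traces WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'DC1' : 3 };"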
Install and configure new nodes
- Install HCD on each node in the new datacenter.
  - Use the same version of HCD on all nodes in the cluster.
  - Don't start the service or restart the node yet.
- Configure cassandra.yaml on each new node. Essential configuration properties:
  Property | Value | Description
  -seeds | <internal_IP_address> of each seed node | Include at least one seed node from each datacenter; 3 per datacenter is recommended
  auto_bootstrap | true (if present) | This setting might be removed in newer versions
  listen_address | <empty> or specific address | If not set, HCD uses the local address
  endpoint_snitch | <snitch> | See the snitch configuration options below
  Snitch configuration files:
  Snitch | Configuration file
  GossipingPropertyFileSnitch | cassandra-rackdc.properties
  PropertyFileSnitch | cassandra-topology.properties
  If you're using a cassandra.yaml file from a previous version, check the Upgrade Guide for removed settings.
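As a quick check after editing, you can grep the properties from the table above on each new node. This is only a sketch; the location of cassandra.yaml depends on your installation type, so replace <install_dir> with the actual configuration path for your environment.
  # Spot-check the key settings on a new node (adjust the path for your install)
  grep -nE 'seeds:|auto_bootstrap:|listen_address:|endpoint_snitch:|num_tokens:|initial_token:' <install_dir>/conf/cassandra.yaml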
- Choose and configure node architecture (all nodes in the datacenter must use the same type):
  - Virtual node (vnode) architecture:
    - Set num_tokens to 8 (recommended).
    - Set allocate_tokens_for_local_replication_factor to your target replication factor.
    - Comment out the initial_token property.
    For more information, see Virtual node (vnode) configuration.
  - Single-token architecture:
    - Generate an initial token for each node and set this value for the initial_token property (a sketch for generating evenly spaced tokens follows this step).
    - Comment out both num_tokens and allocate_tokens_for_local_replication_factor.
    For more information, see Add or replace single-token nodes.
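If you use the single-token architecture, space the tokens evenly across the token range. The following is a minimal sketch, assuming the cluster uses the default Murmur3Partitioner and that python3 is available; set N to the number of nodes in the new datacenter, and offset the generated values slightly if any of them collide with tokens already assigned in other datacenters.
  # Print N evenly spaced Murmur3 tokens, one per node in the new datacenter
  N=4
  python3 -c "n=$N; print('\n'.join(str(i*(2**64//n) - 2**63) for i in range(n)))"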
- Configure snitch properties in the appropriate file:
  # Transactional Node IP=Datacenter:Rack
  110.82.155.0=DC_Transactional:RAC1
  110.82.155.1=DC_Transactional:RAC1
  110.54.125.1=DC_Transactional:RAC2
  110.54.125.2=DC_Analytics:RAC1
  110.54.155.2=DC_Analytics:RAC2
  110.82.155.3=DC_Analytics:RAC1
  110.54.125.3=DC_Search:RAC1
  110.82.155.4=DC_Search:RAC2
  # default for unknown nodes
  default=dc_unknown:rac_unknown
  GossipingPropertyFileSnitch always loads cassandra-topology.properties when the file is present. Only keep this file if you're using PropertyFileSnitch. Otherwise, delete it from all nodes in all datacenters.
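The example above uses the cassandra-topology.properties format read by PropertyFileSnitch. If you use GossipingPropertyFileSnitch instead, each node declares only its own datacenter and rack in cassandra-rackdc.properties. A minimal sketch for a node in the new datacenter follows; the path is a placeholder, and DC4 and RAC1 are this example's names, so adjust them for your environment.
  # Write this node's own location; GossipingPropertyFileSnitch reads cassandra-rackdc.properties
  printf 'dc=DC4\nrack=RAC1\n' > <install_dir>/conf/cassandra-rackdc.properties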
- Restart the node for the changes to take effect.
Update existing datacenters
- On nodes in the existing datacenters:
  - Update the -seeds property in cassandra.yaml to include the seed nodes in the new datacenter.
  - Add the new datacenter definition to the snitch configuration files.
    If you need to change snitches, see Switching snitches.
Start and verify the new datacenter
- Start nodes sequentially, beginning with the seed nodes:
  - Wait at least ring_delay_ms between starting each node.
  - Verify each node is up (UN status) before starting the next; a sketch for scripting this check follows this step.
  - Starting nodes too quickly can cause permanent cluster imbalance.
  - Package installations: Start HCD using Mission Control.
  - Tarball installations: Start HCD as a stand-alone process.
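One possible way to script the per-node check between starts, assuming nodetool is available on the machine you run it from; the IP address is a placeholder for the node you just started, and the 30-second interval is only an example.
  # Wait until the node you just started reports UN (Up/Normal) in nodetool status
  NODE_IP="10.200.175.114"
  until nodetool status | grep '^UN' | grep -q "$NODE_IP"; do
    echo "Waiting for $NODE_IP to reach UN status..."
    sleep 30
  done
  echo "$NODE_IP is up and normal"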
- Verify the new datacenter is operational:
  nodetool status
  Result:
  Datacenter: DC1
  ===============
  Status=Up/Down
  |/ State=Normal/Leaving/Joining/Moving
  --  Address         Load        Owns  Host ID            Token                 Rack
  UN  10.200.175.11   474.23 KiB  ?     7297d21e-a04e-...  -9223372036854775808  RAC1
  Datacenter: DC2
  ===============
  Status=Up/Down
  |/ State=Normal/Leaving/Joining/Moving
  --  Address         Load        Owns  Host ID            Token                 Rack
  UN  10.200.175.113  518.36 KiB  ?     2ff7d46c-f084-...  -9223372036854775798  RAC1
  Datacenter: DC3
  ===============
  Status=Up/Down
  |/ State=Normal/Leaving/Joining/Moving
  --  Address         Load        Owns  Host ID            Token                 Rack
  UN  10.200.175.111  961.56 KiB  ?     ac43e602-ef09-...  -9223372036854775788  RAC1
  Datacenter: DC4
  ===============
  Status=Up/Down
  |/ State=Normal/Leaving/Joining/Moving
  --  Address         Load        Owns  Host ID            Token                 Rack
  UN  10.200.175.114  361.56 KiB  ?     ac43e602-ef09-...  -9223372036854775688  RAC1
Update replication and rebuild data
- Update keyspace replication to include the new datacenter:
  ALTER KEYSPACE keyspace_name WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'ExistingDC1' : 3, 'NewDC2' : 2 };
  Replace ExistingDC1 and NewDC2 with your actual datacenter names; a sketch using this example's topology follows this step.
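For instance, a sketch that keeps three replicas in each existing datacenter and adds three replicas in DC4; the keyspace name and replication factors are illustrative, so adjust them for your application keyspaces and for the system keyspaces listed earlier. It assumes cqlsh can reach a node in the cluster.
  cqlsh -e "ALTER KEYSPACE keyspace_name WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'DC1' : 3, 'DC2' : 3, 'DC3' : 3, 'DC4' : 3 };"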
- Rebuild data in the new datacenter. Basic rebuild command:
  nodetool rebuild -dc SOURCE_DATACENTER_NAME
  Replace SOURCE_DATACENTER_NAME with the name of the datacenter you want to rebuild data from.
  Rebuild command options:
  Option | Command
  Standard rebuild | nodetool rebuild -dc DC1
  Rack-specific rebuild | nodetool rebuild -dc DC1:RAC1
  Background rebuild with logging | nohup nodetool rebuild -dc DC1 > rebuild.log 2>&1 &
- Choose a rebuild strategy based on your priorities. Rebuilds can be safely run in parallel, but this has potential performance tradeoffs: the nodes in the source datacenter are streaming data, which can impact application performance. Run tests within your environment and adjust parallelism and stream throttling to achieve the optimal balance of speed and performance.
  - Minimize source load:
    - Run on one node at a time (sequential rebuilds).
    - Reduces the load on the source datacenter.
    - Takes longer to complete the rebuild process.
  - Maximize rebuild speed:
    - Run on multiple nodes simultaneously (parallel rebuilds).
    - Requires sufficient cluster capacity to handle extra I/O and network traffic.
    - Completes the rebuild faster.
  - Balance performance:
    - Adjust stream throttling with nodetool setinterdcstreamthroughput.
    - Distributes allocated bandwidth across operations.
    - Balances speed and source datacenter performance.
- For rack-specific rebuilds, run the appropriate command on each rack of the new datacenter. For example, with DC4 as the new datacenter and DC1 as the source:
  - On RAC1 nodes in DC4, run: nodetool rebuild -dc DC1:RAC1
  - On RAC2 nodes in DC4, run: nodetool rebuild -dc DC1:RAC2
  - On RAC3 nodes in DC4, run: nodetool rebuild -dc DC1:RAC3
- Monitor rebuild progress:
  nodetool netstats
  The nodetool rebuild command issues a JMX call to the HCD node and waits for the rebuild to finish before returning to the command line. Once the JMX call is invoked, the rebuild process continues to run on the server even if the nodetool command stops.
  Monitor rebuild progress using nodetool netstats and by examining the data size of each node. Additional notes:
  - The data load shown in nodetool status updates only after streaming completes.
  - The system logs streaming errors to system.log.
  - If a temporary failure occurs, running nodetool rebuild again skips already streamed ranges.
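One way to keep an eye on progress from a node in the new datacenter, assuming standard shell tools are available; the log path is a placeholder that depends on your installation.
  # Refresh streaming activity every 30 seconds
  watch -n 30 nodetool netstats
  # Check the node's log for streaming errors (adjust the path for your install)
  grep -iE 'stream' <log_dir>/system.log | grep -iE 'error|exception|fail'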
- Adjust stream throttling as needed; a sketch of the relevant nodetool commands follows this step.
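This sketch uses the inter-datacenter stream throughput commands referenced in the rebuild strategy step; the 200 Mb/s value is only an example. Record the original value first so you can restore it after the rebuild.
  # Record the current inter-datacenter stream throughput so you can restore it later
  nodetool getinterdcstreamthroughput
  # Raise or lower the throttle; the value is in megabits per second
  nodetool setinterdcstreamthroughput 200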
- Verify rebuild completion: search for finished rebuild in the system.log of each node in the new datacenter.
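For example, using the same placeholder log directory as above:
  grep -i 'finished rebuild' <log_dir>/system.log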
- If you modified the inter-datacenter streaming throughput, reset it to the original setting.
- Start the Mission Control Repair Service if necessary.