Steps for adding a datacenter to an existing cluster.
The location of the cassandra.yaml file depends on the type of installation:

DataStax Enterprise 5.0 Installer-Services and package installations: /etc/dse/cassandra/cassandra.yaml
DataStax Enterprise 5.0 Installer-No Services and tarball installations: install_location/resources/cassandra/conf/cassandra.yaml
Cassandra package installations: /etc/cassandra/cassandra.yaml
Cassandra tarball installations: install_location/resources/cassandra/conf/cassandra.yaml
Procedure
- To prevent clients from prematurely connecting to the new datacenter, and to ensure that the consistency level for reads and writes does not query the new datacenter:
  - Make sure that the clients are configured to use the DCAwareRoundRobinPolicy.
  - Make sure that the clients point to an existing datacenter, so they don't try to access the new datacenter, which may not have any data.
  - If using the QUORUM consistency level, change to LOCAL_QUORUM.
  - If using the ONE consistency level, change to LOCAL_ONE.
See the programming instructions for your driver.
Warning: If client applications, including DSE
Search and DSE Analytics, are not properly configured, they may connect to
the new datacenter before the datacenter is ready. This results in
connection exceptions, timeouts, and/or inconsistent data.
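As a quick sanity check outside your driver, you can exercise a datacenter-local consistency level from cqlsh. This is only a sketch: the host, keyspace, and table names are placeholders, and the command is echoed rather than executed so the sketch runs anywhere.

```shell
# Sketch: verify reads succeed at LOCAL_QUORUM against an existing-DC node.
# "existing-dc-node", "sample_ks", and "sample_table" are placeholders.
stmt="CONSISTENCY LOCAL_QUORUM; SELECT * FROM sample_ks.sample_table LIMIT 1;"
echo "cqlsh existing-dc-node -e \"$stmt\""
# cqlsh existing-dc-node -e "$stmt"   # uncomment to run against a live node
```

Drivers expose the same consistency levels through their own configuration; see the programming instructions for your driver.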
- Configure the keyspace and create the new datacenter:
  - Use ALTER KEYSPACE to set the NetworkTopologyStrategy for the following keyspaces:
    - All user-created keyspaces
    - system keyspaces: system_distributed and system_traces
    - DataStax Enterprise keyspaces
    - OpsCenter (if installed)
    This step is required for multiple-datacenter clusters because nodetool rebuild (run in a later step) requires a replica of these keyspaces in the specified source datacenter.
  - Ensure that the replication class defined for all datacenters is NetworkTopologyStrategy. You can use cqlsh to create or alter a keyspace:
    ALTER KEYSPACE "sample_ks" WITH REPLICATION =
      { 'class' : 'NetworkTopologyStrategy', 'ExistingDC' : 3 };
    Note: Datacenter names are case sensitive. Verify the case of the datacenter name with a utility such as dsetool status.
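Because datacenter names are case sensitive, it helps to check the exact names the cluster reports before using them in ALTER KEYSPACE. A sketch (the command is only echoed here so it runs anywhere; uncomment to run on a live node):

```shell
# Sketch: list the exact, case-sensitive datacenter names known to the cluster.
cmd="nodetool status"                 # with DSE you can also use: dsetool status
echo "$cmd   # look for the 'Datacenter:' header lines in the output"
# eval "$cmd" | grep '^Datacenter:'   # uncomment on a live node
```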
- In the new datacenter, install Cassandra on each new node. Do not start the service or restart the node.
- Configure each new node to match the configuration of the other nodes in the cluster:
  - Set cassandra.yaml properties, such as -seeds and endpoint_snitch, to match the settings in the cassandra.yaml files on the other nodes in the cluster. Properties to set:
    - cluster_name: name of the cluster
    - num_tokens: recommended value: 256
    - -seeds: internal IP address of each seed node. Seed nodes don't perform bootstrap (the process by which a new node joins an existing cluster).
    - listen_address: if the node is a seed node, this address must match an IP address in the seeds list; otherwise, gossip communication fails because the node doesn't know that it is a seed. If not set, Cassandra asks the system for the local address, the one associated with its hostname. In some cases Cassandra doesn't produce the correct address, and you must specify the listen_address explicitly.
    - rpc_address: listen address for client connections
    - endpoint_snitch: name of the snitch (see endpoint_snitch). If you are changing snitches, see Switching snitches.
    - auto_bootstrap: false (add this setting only when initializing a clean node with no data)
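For reference, the settings above might come together in a cassandra.yaml fragment like the following sketch. All values are placeholders; match them to the existing nodes in your cluster.

```
# cassandra.yaml fragment (values are placeholders; match the existing cluster)
cluster_name: 'MyCluster'
num_tokens: 256
seed_provider:
  - class_name: org.apache.cassandra.locator.SimpleSeedProvider
    parameters:
      - seeds: "10.1.0.1,10.2.0.1"   # seed nodes in existing and new DCs
listen_address: 10.2.0.1
rpc_address: 10.2.0.1
endpoint_snitch: GossipingPropertyFileSnitch
auto_bootstrap: false                # only when initializing a clean node
```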
- Use the appropriate settings to configure vnode token allocation (for example, num_tokens).
- On each new node, add the new datacenter definition to the properties file for the type of snitch used in the cluster:
  Note: Do not use the SimpleSnitch. The SimpleSnitch (default) does not recognize datacenter or rack information, and is suitable only for single-datacenter deployments or single-zone deployments in public clouds.
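For example, with the GossipingPropertyFileSnitch the datacenter definition lives in cassandra-rackdc.properties on each node. The names below are placeholders and, like all datacenter names, are case sensitive:

```
# cassandra-rackdc.properties on a node in the new datacenter
dc=NewDC
rack=RAC1
```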
- In the existing datacenters:
  - On some nodes, update the seeds property in the cassandra.yaml file to include the seed nodes in the new datacenter, and restart those nodes. (Changes to the cassandra.yaml file require a restart to take effect.)
  - Add the new datacenter definition to the properties file for the type of snitch used in the cluster, as in the previous step. If changing snitches, see Switching snitches.
- Start Cassandra on one node on each rack.
- Rotate starting Cassandra through the racks until all the nodes are up.
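The rolling start can be sketched as a loop that rotates through the racks one node at a time. The host addresses and the service command are placeholders; the commands are only echoed here. Before starting the next node, confirm the previous one shows Up/Normal (UN) in nodetool status.

```shell
# Sketch: start Cassandra one node at a time, rotating through racks.
# Hosts are placeholder addresses, listed so the loop alternates racks.
for host in 10.2.1.1 10.2.2.1 10.2.1.2 10.2.2.2; do
  cmd="ssh $host sudo service dse start"   # or: service cassandra start
  echo "$cmd"
  # eval "$cmd"   # uncomment on a live cluster; confirm UN status, then continue
done
```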
- After all nodes are running in the cluster and the client applications are datacenter aware (as configured in the first step), use cqlsh to alter the keyspaces:
  ALTER KEYSPACE "sample_ks" WITH REPLICATION =
    { 'class' : 'NetworkTopologyStrategy', 'ExistingDC' : 3, 'NewDC' : 3 };
Warning: If client applications, including DSE
Search and DSE Analytics, are not properly configured, they may connect to
the new datacenter before the datacenter is ready. This results in
connection exceptions, timeouts, and/or inconsistent data.
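The keyspace updates in this step can be scripted as a loop over every keyspace that must replicate to the new datacenter. In this sketch the keyspace names, datacenter names, and replication factors are placeholders, and the statements are echoed rather than applied.

```shell
# Sketch: add the new datacenter to each keyspace's replication settings.
# Keyspace names, DC names, and replication factors are placeholders.
for ks in sample_ks system_distributed system_traces; do
  stmt="ALTER KEYSPACE \"$ks\" WITH REPLICATION = {'class': 'NetworkTopologyStrategy', 'ExistingDC': 3, 'NewDC': 3};"
  echo "$stmt"
  # cqlsh -e "$stmt"   # uncomment to apply against a live node
done
```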
- Run nodetool rebuild on each node in the new datacenter:
  nodetool rebuild -- name_of_existing_data_center
  CAUTION: If you don't specify the existing datacenter on the command line, the new nodes will appear to rebuild successfully but will not contain any data. If you miss this step, requests to the new datacenter with LOCAL_ONE or ONE consistency levels may fail if the existing datacenters are not completely in sync.
  This step ensures that the new nodes recognize the existing datacenters in the cluster.
  You can run rebuild on one or more nodes at the same time. Run on one node at a time to reduce the impact on the existing cluster; run on multiple nodes when the cluster can handle the extra I/O and network pressure.
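The rebuild step can be sketched as a loop over the nodes in the new datacenter. The addresses and the datacenter name are placeholders, and the commands are only echoed; running one node at a time limits the streaming load on the existing cluster.

```shell
# Sketch: rebuild each new-DC node, streaming data from the existing DC.
SOURCE_DC="ExistingDC"              # placeholder: an existing datacenter name
for host in 10.2.0.1 10.2.0.2; do   # placeholder: nodes in the new datacenter
  cmd="nodetool -h $host rebuild -- $SOURCE_DC"
  echo "$cmd"
  # eval "$cmd"   # uncomment to run; or omit -h and run locally on each node
done
```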
Results
The datacenters in the cluster are now replicating with each other.