Initializing single-token architecture datacenters
Follow these steps only when not using virtual nodes (vnodes).
In most circumstances, each workload type, such as search, analytics, and transactional, should be organized into separate virtual datacenters. Workload segregation avoids contention for resources. However, workloads can be combined in SearchAnalytics nodes when there is not a large demand for analytics, or when analytics queries must use a DSE Search index. Generally, combining transactional (OLTP) and analytics (OLAP) workloads results in decreased performance.
When you create a keyspace using CQL, DataStax Enterprise automatically creates a virtual datacenter for the cluster, even a one-node cluster. You assign nodes that run the same type of workload to the same datacenter. Separate virtual datacenters for different types of nodes segregate the nodes that run DSE Search from the nodes that run other workload types.
Prerequisites
Complete the tasks outlined in Initializing a DataStax Enterprise cluster to prepare the environment.
Procedure
These steps provide information about setting up a cluster having one or more datacenters.
-
Suppose you install DataStax Enterprise on these nodes:
-
node0 10.168.66.41 (seed1)
-
node1 10.176.43.66
-
node2 10.168.247.41
-
node3 10.176.170.59 (seed2)
-
node4 10.169.61.170
-
node5 10.169.30.138
-
-
Calculate the token assignments as described in Calculating tokens for single-token architecture nodes.
The following tables list tokens for a 6 node cluster with a single datacenter or two datacenters.
Single Datacenter
Node   Token
node0  0
node1  21267647932558653966460912964485513216
node2  42535295865117307932921825928971026432
node3  63802943797675961899382738893456539648
node4  85070591730234615865843651857942052864
node5  106338239662793269832304564822427566080
Multiple Datacenters
Node   Token                                    Offset  Datacenter
node0  0                                        NA      DC1
node1  56713727820156410577229101238628035242   NA      DC1
node2  113427455640312821154458202477256070485  NA      DC1
node3  100                                      100     DC2
node4  56713727820156410577229101238628035342   100     DC2
node5  113427455640312821154458202477256070585  100     DC2
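The multiple-datacenter values above can be reproduced with a short calculation. The following is a minimal sketch, not the official procedure: it assumes a token range of 0 to 2**127 (inferred from the table values) and treats each datacenter as its own ring, with the second datacenter shifted by an offset of 100. The function name datacenter_tokens is hypothetical.

```python
# Sketch: evenly spaced single-token assignments, one ring per datacenter.
# Assumption: tokens span 0 to 2**127, as the table values suggest.
TOKEN_RANGE = 2**127


def datacenter_tokens(node_count, offset=0):
    """Return node_count evenly spaced tokens, each shifted by offset."""
    return [i * TOKEN_RANGE // node_count + offset for i in range(node_count)]


dc1 = datacenter_tokens(3)              # DC1: no offset
dc2 = datacenter_tokens(3, offset=100)  # DC2: every token shifted by 100

for name, token in zip(["node0", "node1", "node2"], dc1):
    print(name, token, "DC1")
for name, token in zip(["node3", "node4", "node5"], dc2):
    print(name, token, "DC2")
```

The offset keeps tokens in the second datacenter from colliding with those in the first while preserving even spacing within each datacenter.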
-
If the nodes are behind a firewall, open the required ports for internal/external communication.
-
If DataStax Enterprise is running, stop the node and clear the data:
-
Package installations: To stop DSE:
sudo service dse stop
To remove data from the default directories:
sudo rm -rf /var/lib/cassandra/*
-
Tarball installations:
From the installation location, stop the database:
bin/dse cassandra-stop
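Remove all data. The path in angle brackets is a placeholder: substitute the directory that contains the data, commitlog, saved_caches, and hints subdirectories, and omit the brackets.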
cd </var/lib/cassandra/data> && sudo rm -rf data/* commitlog/* saved_caches/* hints/*
-
-
Configure properties in cassandra.yaml on each new node, following the configuration of the other nodes in the cluster.
Use the yaml_diff tool to review and make appropriate changes to the cassandra.yaml and dse.yaml configuration files.
-
Configure node properties.
-
initial_token: token_value_from_calculation
-
num_tokens: 1
-
-seeds: <internal_IP_address> of each seed node
Include at least one seed node from each datacenter. DataStax recommends more than one seed node per datacenter. Do not make all nodes seed nodes.
-
listen_address: <empty>
If not set, DSE asks the system for the local address, which is associated with its host name. In some cases, DSE does not produce the correct address, which requires specifying the listen_address.
-
auto_bootstrap: <false>
Add the bootstrap setting only when initializing a new cluster with no data.
-
endpoint_snitch: <snitch>
See endpoint_snitch and snitches.
Do not use the DseSimpleSnitch. The DseSimpleSnitch (default) is used only for single-datacenter deployments (or single-zone deployments in public clouds), and does not recognize datacenter or rack information.
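To illustrate how these properties fit together for the six example nodes, here is a minimal sketch that collects the values for each node. It is an illustration only: node_settings is a hypothetical helper, the tokens are taken from the multiple-datacenter table, and the choices to set listen_address to each node's IP and to use GossipingPropertyFileSnitch are assumptions rather than requirements.

```python
# Sketch: gather the per-node cassandra.yaml values described in this step.
# node_settings is a hypothetical helper, not part of DSE.

# One seed per datacenter: node0 (seed1) and node3 (seed2).
SEEDS = ["10.168.66.41", "10.176.170.59"]

# name -> (IP address, initial_token from the multiple-datacenter table)
NODES = {
    "node0": ("10.168.66.41", 0),
    "node1": ("10.176.43.66", 56713727820156410577229101238628035242),
    "node2": ("10.168.247.41", 113427455640312821154458202477256070485),
    "node3": ("10.176.170.59", 100),
    "node4": ("10.169.61.170", 56713727820156410577229101238628035342),
    "node5": ("10.169.30.138", 113427455640312821154458202477256070585),
}


def node_settings(name):
    """Return the property values for one node as a dictionary."""
    address, token = NODES[name]
    return {
        "initial_token": token,
        "num_tokens": 1,                 # single-token architecture
        "seeds": ",".join(SEEDS),
        "listen_address": address,       # assumption: set explicitly
        "auto_bootstrap": "false",       # new cluster with no data
        "endpoint_snitch": "GossipingPropertyFileSnitch",  # assumption
    }


for name in NODES:
    print(name)
    for key, value in node_settings(name).items():
        print(f"  {key}: {value}")
```

The printed values are meant to be copied into the matching properties by hand; the sketch does not write or validate cassandra.yaml.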
-
If using a cassandra.yaml or dse.yaml file from a previous version, check the Upgrade Guide for removed settings.
-
Set the properties in the dse.yaml file as required by your use case.
-
In the cassandra-rackdc.properties (GossipingPropertyFileSnitch) or cassandra-topology.properties (PropertyFileSnitch) file, assign datacenter and rack names to the IP addresses of each node, and assign a default datacenter name and rack name for unknown nodes.
Migration information: The GossipingPropertyFileSnitch always loads cassandra-topology.properties when the file is present. Remove the file from each node on any new datacenter, or any datacenter migrated from the PropertyFileSnitch.
# Transactional Node IP=Datacenter:Rack
110.82.155.0=DC_Transactional:RAC1
110.82.155.1=DC_Transactional:RAC1
110.54.125.1=DC_Transactional:RAC2
110.54.125.2=DC_Analytics:RAC1
110.54.155.2=DC_Analytics:RAC2
110.82.155.3=DC_Analytics:RAC1
110.54.125.3=DC_Search:RAC1
110.82.155.4=DC_Search:RAC2
# default for unknown nodes
default=DC1:RAC1
After making any changes in the configuration files, you must restart the node for the changes to take effect.
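Before restarting, it can help to review which addresses land in which datacenter. The following minimal sketch parses the IP=Datacenter:Rack entries from the example above and groups the addresses by datacenter. The function name parse_topology is hypothetical, and the sketch handles only the simple line format shown here.

```python
# Sketch: group the example topology entries by datacenter.
# parse_topology is a hypothetical helper, not a DSE tool.
from collections import defaultdict

TOPOLOGY = """\
# Transactional Node IP=Datacenter:Rack
110.82.155.0=DC_Transactional:RAC1
110.82.155.1=DC_Transactional:RAC1
110.54.125.1=DC_Transactional:RAC2
110.54.125.2=DC_Analytics:RAC1
110.54.155.2=DC_Analytics:RAC2
110.82.155.3=DC_Analytics:RAC1
110.54.125.3=DC_Search:RAC1
110.82.155.4=DC_Search:RAC2
# default for unknown nodes
default=DC1:RAC1
"""


def parse_topology(text):
    """Map each key (IP address or 'default') to a (datacenter, rack) tuple."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        datacenter, _, rack = value.partition(":")
        mapping[key.strip()] = (datacenter.strip(), rack.strip())
    return mapping


topology = parse_topology(TOPOLOGY)

by_datacenter = defaultdict(list)
for key, (datacenter, rack) in topology.items():
    if key != "default":
        by_datacenter[datacenter].append(key)

for datacenter, addresses in sorted(by_datacenter.items()):
    print(datacenter, addresses)
print("default:", topology["default"])
```

A listing like this makes it easier to spot an address assigned to the wrong datacenter before the nodes are started.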
-
After you have installed and configured DataStax Enterprise on all nodes, start the nodes sequentially, beginning with the seed nodes. After starting each node, allow a delay of at least the value specified in ring_delay_ms before starting the next node, to prevent a cluster imbalance.
Before starting a node, ensure that the previous node is up and running by verifying that it has a nodetool status of UN. Failing to do so results in a cluster imbalance that cannot be fixed later. To check for imbalance, run nodetool status $keyspace and look at the ownership column. A properly set up cluster reports ownership values that are similar to each other (±1%) for keyspaces where the RF per DC is equal to allocate_tokens_for_local_replication_factor. See allocate_tokens_for_local_replication_factor for more information.
-
Package installations: Starting DataStax Enterprise as a service
-
Tarball installations: Starting DataStax Enterprise as a stand-alone process
-
-
Check that the new cluster is up and running:
dsetool status
If DSE has problems starting, see the troubleshooting articles on starting DSE and other articles in the Support Knowledge Center.
-
-
Results
Datacenter: Cassandra
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load      Tokens  Owns   Host ID            Rack
UN  110.82.155.0  21.33 KB  256     33.3%  a9fa31c7-f3c0-...  RAC1
UN  110.82.155.1  21.33 KB  256     33.3%  f5bb416c-db51-...  RAC1
UN  110.82.155.2  21.33 KB  256     16.7%  b836748f-c94f-...  RAC1
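Status output in this layout can be checked programmatically. The following minimal sketch reads the state and ownership of each node from the sample above and reports whether every node is UN and how far apart the ownership values are. The function name parse_status is hypothetical, and the parsing assumes the column layout of this sample only; real output may differ.

```python
# Sketch: read node state and ownership from status text like the sample above.
# parse_status is a hypothetical helper; it assumes this sample's layout.

SAMPLE = """\
Datacenter: Cassandra
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID Rack
UN 110.82.155.0 21.33 KB 256 33.3% a9fa31c7-f3c0-... RAC1
UN 110.82.155.1 21.33 KB 256 33.3% f5bb416c-db51-... RAC1
UN 110.82.155.2 21.33 KB 256 16.7% b836748f-c94f-... RAC1
"""

# Status (Up/Down) combined with State (Normal/Leaving/Joining/Moving).
STATES = {s + t for s in "UD" for t in "NLJM"}


def parse_status(text):
    """Return {address: (state, owns_percent)} for each node row."""
    nodes = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0] not in STATES:
            continue  # header, separator, or blank line
        owns = next((f for f in fields if f.endswith("%")), None)
        if owns is None:
            continue  # no ownership figure on this row
        nodes[fields[1]] = (fields[0], float(owns.rstrip("%")))
    return nodes


nodes = parse_status(SAMPLE)
all_up = all(state == "UN" for state, _ in nodes.values())
owns = [percent for _, percent in nodes.values()]
spread = max(owns) - min(owns)

print("all nodes UN:", all_up)
print("ownership spread (percentage points):", round(spread, 1))
```

A spread larger than about one percentage point is a reason to review the token assignments, following the guideline in the procedure.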
-
Calculating tokens for single-token architecture nodes
When not using vnodes, use these steps to calculate tokens to evenly distribute data across a cluster.