Initialize single-token architecture datacenters

Follow these steps only when not using virtual nodes (vnodes).

In most circumstances, each workload type, such as search, analytics, and transactional, should be organized into separate virtual datacenters. Workload segregation avoids contention for resources. However, workloads can be combined in SearchAnalytics nodes when there is not a large demand for analytics, or when analytics queries must use a DSE Search index. Generally, combining transactional (OLTP) and analytics (OLAP) workloads results in decreased performance.

When you create a keyspace using CQL, DSE automatically creates a virtual datacenter for the cluster, even a one-node cluster. You assign nodes that run the same type of workload to the same datacenter. Separate virtual datacenters keep nodes that run DSE Search apart from nodes that run other workload types.

Prerequisites

Complete the tasks outlined in Initialize a DataStax Enterprise cluster to prepare the environment.

Procedure

These steps provide information about setting up a cluster having one or more datacenters.

Suppose you install DataStax Enterprise (DSE) on these nodes:
- node0 10.168.66.41 (seed1)
- node1 10.176.43.66
- node2 10.168.247.41
- node3 10.176.170.59 (seed2)
- node4 10.169.61.170
- node5 10.169.30.138

Calculate the token assignments as described in Calculating tokens for single-token architecture nodes.

The following tables list tokens for a six-node cluster with a single datacenter or with two datacenters.

Single Datacenter
Node	Token
node0	0
node1	28356863910078205288614550619314017621
node2	56713727820156410577229101238628035242
node3	85070591730234615865843651857942052864
node4	113427455640312821154458202477256070485
node5	141784319550391026443072753096570088106

Multiple Datacenters
Node	Token	Offset	Datacenter
node0	0	NA	DC1
node1	56713727820156410577229101238628035242	NA	DC1
node2	113427455640312821154458202477256070485	NA	DC1
node3	100	100	DC2
node4	56713727820156410577229101238628035342	100	DC2
node5	113427455640312821154458202477256070585	100	DC2
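
The Multiple Datacenters table can be reproduced with a short calculation. The sketch below is an illustration only: the ring size of 2^127 and the per-datacenter offset of 100 are inferred from the table values, and the function name `single_token_assignments` is hypothetical. The referenced token-calculation topic remains the authoritative procedure.

```python
RING_SIZE = 2 ** 127  # assumed token range, inferred from the table values


def single_token_assignments(nodes_per_dc, offset_step=100):
    """Return {dc_index: [tokens]} with evenly spaced tokens per datacenter.

    Each datacenter is spaced as if it were its own ring. Datacenter k is
    shifted by k * offset_step so that no two nodes share a token.
    """
    assignments = {}
    for dc_index, node_count in enumerate(nodes_per_dc):
        offset = dc_index * offset_step
        assignments[dc_index] = [
            (i * RING_SIZE) // node_count + offset for i in range(node_count)
        ]
    return assignments


if __name__ == "__main__":
    # Two datacenters with three nodes each, as in the table above.
    for dc_index, tokens in single_token_assignments([3, 3]).items():
        for token in tokens:
            print(f"DC{dc_index + 1}\t{token}")
```

Integer division keeps the tokens exact. Floating-point arithmetic would lose precision at this magnitude.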

If the nodes are behind a firewall, open the required ports for internal/external communication.
If DSE is running, stop the node and clear the data:
- Package installations: To stop DSE:
  sudo service dse stop
  To remove data from the default directories:
  sudo rm -rf /var/lib/cassandra/*
- Tarball installations:
  
  From the installation location, stop the database:
  bin/dse cassandra-stop
  Remove all data:
  cd </var/lib/cassandra/data> && sudo rm -rf data/* commitlog/* saved_caches/* hints/*
Configure properties in cassandra.yaml on each new node, following the configuration of the other nodes in the cluster.

Use the yaml_diff tool to review and make appropriate changes to the cassandra.yaml and dse.yaml configuration files.

Configure node properties.

initial_token: token_value_from_calculation
num_tokens: 1
-seeds: <internal_IP_address> of each seed node

Include at least one seed node from each datacenter. DataStax recommends more than one seed node per datacenter. Do not make all nodes seed nodes.
listen_address: <empty>

If not set, DSE asks the system for the local address, which is associated with its host name. In some cases, DSE does not produce the correct address, which requires specifying the listen_address.
auto_bootstrap: <false>

Add the bootstrap setting only when initializing a new cluster with no data.

endpoint_snitch: <snitch>

See endpoint_snitch and snitches.

Do not use the DseSimpleSnitch. The DseSimpleSnitch (default) is used only for single-datacenter deployments (or single-zone deployments in public clouds), and does not recognize datacenter or rack information.

Snitch	Configuration file
GossipingPropertyFileSnitch	cassandra-rackdc.properties file
Configuring the Amazon EC2 single-region snitch
Configuring Amazon EC2 multi-region snitch
Configuring the Google Cloud Platform snitch
PropertyFileSnitch	cassandra-topology.properties file
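
To keep the per-node values consistent across a cluster, the single-token properties can be generated from one place. The following is a minimal sketch, not a DSE tool: `node_settings` and `render` are hypothetical helpers, the comma-separated seed string is an assumption about how the seed list is written, and the output is a flat list of property names rather than the exact layout of cassandra.yaml.

```python
def node_settings(initial_token, seed_addresses, new_empty_cluster=True):
    """Build the single-token properties for one node as a plain dict."""
    if not seed_addresses:
        raise ValueError("at least one seed address is required")
    settings = {
        "initial_token": initial_token,
        "num_tokens": 1,  # single-token architecture: vnodes are not used
        "seeds": ",".join(seed_addresses),  # assumed comma-separated form
    }
    if new_empty_cluster:
        # The bootstrap setting is added only for a new cluster with no data.
        settings["auto_bootstrap"] = False
    return settings


def render(settings):
    """Format the settings as 'name: value' lines for review."""
    lines = []
    for name, value in settings.items():
        if isinstance(value, bool):
            value = str(value).lower()
        lines.append(f"{name}: {value}")
    return "\n".join(lines)


# Seeds from the example node list: node0 (seed1) and node3 (seed2),
# one from each datacenter in the Multiple Datacenters table.
SEEDS = ["10.168.66.41", "10.176.170.59"]

# node1 in the Multiple Datacenters table.
node1 = node_settings(56713727820156410577229101238628035242, SEEDS)
print(render(node1))
```

Every node receives the same seed list and a different `initial_token`. The other properties still need to be set per the descriptions above.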

If using a cassandra.yaml or dse.yaml file from a previous version, check the Upgrade Guide for removed settings.

Set the properties in the dse.yaml file as required by your use case.

In the cassandra-rackdc.properties (GossipingPropertyFileSnitch) or cassandra-topology.properties (PropertyFileSnitch) file, assign datacenter and rack names to the IP addresses of each node, and assign a default datacenter name and rack name for unknown nodes.

Migration information: The GossipingPropertyFileSnitch always loads cassandra-topology.properties when the file is present. Remove the file from each node on any new datacenter, or any datacenter migrated from the PropertyFileSnitch.

# Transactional Node IP=Datacenter:Rack
110.82.155.0=DC_Transactional:RAC1
110.82.155.1=DC_Transactional:RAC1
110.54.125.1=DC_Transactional:RAC2
110.54.125.2=DC_Analytics:RAC1
110.54.155.2=DC_Analytics:RAC2
110.82.155.3=DC_Analytics:RAC1
110.54.125.3=DC_Search:RAC1
110.82.155.4=DC_Search:RAC2

# default for unknown nodes
default=DC1:RAC1
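
Before copying a topology file to every node, it can help to confirm that each address maps to the intended datacenter and that a default entry exists. The sketch below parses the `IP=Datacenter:Rack` format shown above. The helper names `parse_topology` and `nodes_per_datacenter` are hypothetical, and the parser handles only the line forms that appear in this example.

```python
from collections import defaultdict


def parse_topology(text):
    """Parse IP=Datacenter:Rack lines into (assignments, default)."""
    assignments = {}
    default = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        datacenter, _, rack = value.partition(":")
        if not datacenter.strip() or not rack.strip():
            raise ValueError(f"malformed entry: {raw_line!r}")
        entry = (datacenter.strip(), rack.strip())
        if key.strip() == "default":
            default = entry
        else:
            assignments[key.strip()] = entry
    return assignments, default


def nodes_per_datacenter(assignments):
    """Count how many node addresses are assigned to each datacenter."""
    counts = defaultdict(int)
    for datacenter, _rack in assignments.values():
        counts[datacenter] += 1
    return dict(counts)


SAMPLE = """\
# Transactional Node IP=Datacenter:Rack
110.82.155.0=DC_Transactional:RAC1
110.82.155.1=DC_Transactional:RAC1
110.54.125.1=DC_Transactional:RAC2
110.54.125.2=DC_Analytics:RAC1
110.54.155.2=DC_Analytics:RAC2
110.82.155.3=DC_Analytics:RAC1
110.54.125.3=DC_Search:RAC1
110.82.155.4=DC_Search:RAC2

# default for unknown nodes
default=DC1:RAC1
"""

assignments, default = parse_topology(SAMPLE)
if default is None:
    print("warning: no default datacenter and rack for unknown nodes")
for datacenter, count in nodes_per_datacenter(assignments).items():
    print(f"{datacenter}: {count} node(s)")
```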

After making any changes in the configuration files, you must restart the node for the changes to take effect.

After you have installed and configured DataStax Enterprise (DSE) on all nodes, start the nodes sequentially, beginning with the seed nodes. After starting each node, allow a delay of at least the value specified in ring_delay_ms before starting the next node, to prevent a cluster imbalance.

Before starting a node, ensure that the previous node is up and running by verifying that it has a nodetool status of UN. Failing to do so results in a cluster imbalance that cannot be fixed later. To check for imbalance, run nodetool status $keyspace and look at the ownership column. A properly set up cluster reports ownership values within about ±1% of each other for keyspaces where the replication factor (RF) per datacenter equals allocate_tokens_for_local_replication_factor.

See allocate_tokens_for_local_replication_factor for more information.

Start each node using the method for your installation type:
- Package installations: Start DataStax Enterprise as a service
- Tarball installations: Start DataStax Enterprise as a stand-alone process
Check that the new cluster is up and running:
```
dsetool status
```
If DSE has problems starting, look for troubleshooting articles about starting DSE in the Support Knowledge Center.
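
The UN and ownership checks described above can be scripted once the status output has been captured as text. The sketch below is one possible approach under stated assumptions: node rows begin with a two-letter state code and contain one value ending in `%`, as in the sample output in Results, and "±1%" is interpreted as each value lying within one percentage point of the mean. Actual output may contain additional columns, and `parse_status` and `check_cluster` are hypothetical names.

```python
def parse_status(text):
    """Extract (state, address, owns_percent) tuples from status output."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue
        state = parts[0]
        # Node rows start with a code such as UN (Up/Normal) or DN.
        if len(state) != 2 or state[0] not in "UD" or state[1] not in "NLJM":
            continue
        owns = next((float(p[:-1]) for p in parts if p.endswith("%")), None)
        rows.append((state, parts[1], owns))
    return rows


def check_cluster(rows, tolerance=1.0):
    """Return a list of problems; an empty list means the checks passed."""
    problems = [
        f"{address} is {state}, expected UN"
        for state, address, _ in rows
        if state != "UN"
    ]
    owns = [value for _, _, value in rows if value is not None]
    if owns:
        mean = sum(owns) / len(owns)
        if any(abs(value - mean) > tolerance for value in owns):
            problems.append(
                f"ownership is uneven: {min(owns)}% to {max(owns)}%"
            )
    return problems


SAMPLE = """\
-- Address         Load        Tokens    Owns    Host ID             Rack
UN 110.82.155.0    21.33 KB    256       33.3%   a9fa31c7-f3c0-...   RAC1
UN 110.82.155.1    21.33 KB    256       33.3%   f5bb416c-db51-...   RAC1
UN 110.82.155.2    21.33 KB    256       16.7%   b836748f-c94f-...   RAC1
"""

for problem in check_cluster(parse_status(SAMPLE)):
    print(problem)
```

Running a check like this after each node starts, and before starting the next one, follows the sequencing requirement described earlier.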

Results

Datacenter: Cassandra
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address         Load        Tokens    Owns    Host ID             Rack
UN 110.82.155.0    21.33 KB    256       33.3%   a9fa31c7-f3c0-...   RAC1
UN 110.82.155.1    21.33 KB    256       33.3%   f5bb416c-db51-...   RAC1
UN 110.82.155.2    21.33 KB    256       16.7%   b836748f-c94f-...   RAC1
