Initialize single-token architecture datacenters

These steps describe how to set up a cluster with one or more datacenters. Follow them only when not using virtual nodes (vnodes).

In most circumstances, each workload type, such as search, analytics, and transactional, should be organized into separate virtual datacenters. Workload segregation avoids contention for resources. However, workloads can be combined in SearchAnalytics nodes when there is not a large demand for analytics, or when analytics queries must use a DSE Search index. Generally, combining transactional (OLTP) and analytics (OLAP) workloads results in decreased performance.

When you create a keyspace using CQL, DSE automatically creates a virtual datacenter for the cluster, even a one-node cluster. Assign nodes that run the same type of workload to the same datacenter. Separate virtual datacenters for different types of nodes segregate the nodes that run DSE Search from the nodes that run other workload types.

Complete the tasks outlined in Initialize a DataStax Enterprise cluster to prepare the environment.

Calculate the token assignments for your single-token architecture nodes.

Token calculation example

For this example, assume you installed DSE on 6 nodes. The following tables list tokens for this hypothetical 6-node cluster, arranged either as a single datacenter or as two datacenters:

Single datacenter
Node	IP	Token
node0 (seed1)	10.168.66.41	0
node1	10.176.43.66	28356863910078205288614550619314017621
node2	10.168.247.41	56713727820156410577229101238628035242
node3 (seed2)	10.176.170.59	85070591730234615865843651857942052864
node4	10.169.61.170	113427455640312821154458202477256070485
node5	10.169.30.138	141784319550391026443072753096570088106

Multiple datacenters
Node	IP	Token	Offset	Datacenter
node0 (seed1)	10.168.66.41	0	NA	Datacenter 1
node1	10.176.43.66	56713727820156410577229101238628035242	NA	Datacenter 1
node2	10.168.247.41	113427455640312821154458202477256070485	NA	Datacenter 1
node3 (seed2)	10.176.170.59	100	100	Datacenter 2
node4	10.169.61.170	56713727820156410577229101238628035342	100	Datacenter 2
node5	10.169.30.138	113427455640312821154458202477256070585	100	Datacenter 2
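
The multiple-datacenter tokens above are consistent with dividing a ring of 2^127 positions evenly among the nodes of each datacenter and then adding the offset for the second datacenter. The following Python sketch reproduces that arithmetic. The ring size (the RandomPartitioner range) is an assumption, because this section does not name the partitioner, and evenly_spaced_tokens is a hypothetical helper name, not a DSE tool.

```python
RING_SIZE = 2 ** 127  # assumed token range (RandomPartitioner); not stated in the text


def evenly_spaced_tokens(node_count, offset=0):
    """Return one initial_token per node, evenly spaced around the ring."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    # Integer floor division keeps full precision for these large values.
    return [i * RING_SIZE // node_count + offset for i in range(node_count)]


if __name__ == "__main__":
    # Three nodes per datacenter; the second datacenter is shifted by 100.
    for datacenter, offset in (("Datacenter 1", 0), ("Datacenter 2", 100)):
        for token in evenly_spaced_tokens(3, offset):
            print(datacenter, token)
```

Calling evenly_spaced_tokens(3) yields the Datacenter 1 tokens from the table, and evenly_spaced_tokens(3, 100) yields the Datacenter 2 tokens.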

If the nodes are behind a firewall, open the required ports for internal/external communication.
If DSE is running, stop the node and clear the data:

- Package installations

To stop DSE:
sudo service dse stop
To remove data from the default directories:
sudo rm -rf /var/lib/cassandra/*

- Tarball installations

From the installation location, stop the database:
bin/dse cassandra-stop
Remove all data:
cd </var/lib/cassandra/data> && sudo rm -rf data/* commitlog/* saved_caches/* hints/*

Configure properties in cassandra.yaml on each new node, following the configuration of the other nodes in the cluster:

Use the yaml_diff tool to review and make appropriate changes to the cassandra.yaml and dse.yaml configuration files.

Configure node properties:

initial_token: token_value_from_calculation

num_tokens: 1

- seeds: <internal_IP_address> of each seed node

Include at least one seed node from each datacenter. DataStax recommends more than one seed node per datacenter. Do not make all nodes seed nodes.

listen_address: <empty>

If not set, DSE asks the system for the local address, which is associated with its host name. In some cases, DSE does not produce the correct address, which requires specifying the listen_address.

auto_bootstrap: <false>

Add the bootstrap setting only when initializing a new cluster with no data.

endpoint_snitch: <snitch>

See endpoint_snitch and snitches.

Do not use the DseSimpleSnitch. The DseSimpleSnitch (default) is used only for single-datacenter deployments (or single-zone deployments in public clouds), and does not recognize datacenter or rack information.

Snitch	Configuration file
GossipingPropertyFileSnitch	cassandra-rackdc.properties file
Configuring the Amazon EC2 single-region snitch
Configuring Amazon EC2 multi-region snitch
Configuring the Google Cloud Platform snitch
PropertyFileSnitch	cassandra-topology.properties file
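
To illustrate how these properties vary from node to node, the following Python sketch builds the values for each node of the multiple-datacenter example. The seeds value is identical on every node, while initial_token is unique per node. NODES and node_properties are hypothetical names used only for illustration, and the result is a plain dictionary rather than a complete cassandra.yaml file.

```python
# (ip, initial_token, is_seed) rows copied from the multiple-datacenter example.
NODES = [
    ("10.168.66.41", 0, True),
    ("10.176.43.66", 56713727820156410577229101238628035242, False),
    ("10.168.247.41", 113427455640312821154458202477256070485, False),
    ("10.176.170.59", 100, True),
    ("10.169.61.170", 56713727820156410577229101238628035342, False),
    ("10.169.30.138", 113427455640312821154458202477256070585, False),
]


def node_properties(nodes):
    """Map each node IP to the property values discussed above."""
    # Every node gets the same comma-separated list of seed addresses.
    seeds = ",".join(ip for ip, _token, is_seed in nodes if is_seed)
    return {
        ip: {
            "initial_token": token,
            "num_tokens": 1,
            "seeds": seeds,
            "auto_bootstrap": False,  # only for a new cluster with no data
        }
        for ip, token, _is_seed in nodes
    }


if __name__ == "__main__":
    for ip, props in node_properties(NODES).items():
        print(ip, props)
```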

If using a cassandra.yaml or dse.yaml file from a previous version, check the upgrade guide for your previous and current version for removed settings.

Set the properties in the dse.yaml file as required by your use case.

Depending on your snitch type, edit the appropriate configuration file to assign datacenter and rack names to the IP addresses of each node, and assign a default datacenter name and rack name for unknown nodes.

# Transactional Node IP=Datacenter:Rack
110.82.155.0=DC_Transactional:RAC1
110.82.155.1=DC_Transactional:RAC1
110.54.125.1=DC_Transactional:RAC2
110.54.125.2=DC_Analytics:RAC1
110.54.155.2=DC_Analytics:RAC2
110.82.155.3=DC_Analytics:RAC1
110.54.125.3=DC_Search:RAC1
110.82.155.4=DC_Search:RAC2

# default for unknown nodes
default=DC1:RAC1

For the PropertyFileSnitch, these are set in the cassandra-topology.properties. For the GossipingPropertyFileSnitch, these are set in the cassandra-rackdc.properties.
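
Before copying a topology file to the nodes, it can help to group its entries by datacenter and rack and check the result by eye. The following Python sketch does this for text in the IP=Datacenter:Rack format shown above. parse_topology is a hypothetical helper, not part of DSE, and SAMPLE is a shortened copy of the example entries.

```python
from collections import defaultdict


def parse_topology(text):
    """Group node IPs by (datacenter, rack); return (groups, default)."""
    groups = defaultdict(list)
    default = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        datacenter, _, rack = value.partition(":")
        location = (datacenter.strip(), rack.strip())
        if key.strip() == "default":
            default = location  # fallback used for unknown nodes
        else:
            groups[location].append(key.strip())
    return dict(groups), default


SAMPLE = """\
# Transactional Node IP=Datacenter:Rack
110.82.155.0=DC_Transactional:RAC1
110.54.125.2=DC_Analytics:RAC1
110.54.125.3=DC_Search:RAC1

# default for unknown nodes
default=DC1:RAC1
"""

if __name__ == "__main__":
    groups, default = parse_topology(SAMPLE)
    for (datacenter, rack), ips in sorted(groups.items()):
        print(datacenter, rack, ips)
    print("default:", default)
```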

The GossipingPropertyFileSnitch always loads cassandra-topology.properties when the file is present. Remove the file from each node on any new datacenter and from any datacenter migrated from the PropertyFileSnitch.
After making any changes in the configuration files, you must restart the node for the changes to take effect.

After you have installed and configured DataStax Enterprise (DSE) on all nodes, start the nodes sequentially, beginning with the seed nodes.

After starting each node, allow a delay of at least the duration of ring_delay_ms before starting the next node to prevent cluster imbalance.

Before starting a node, ensure that the previous node is up and running by verifying that nodetool status reports it as UN (Up and Normal). Failing to do so can result in cluster imbalance that cannot be fixed later.

Cluster imbalance can be visualised by running nodetool status KEYSPACE_NAME and checking the Ownership column in the response. A properly configured cluster reports ownership values similar to each other, within 1 percent, for keyspaces where the replication factor per DC is equal to allocate_tokens_for_local_replication_factor.
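
The ownership comparison can be scripted by parsing saved nodetool status output. The following Python sketch assumes that each node line starts with a two-letter status such as UN, followed by the address, and contains a single percentage field for ownership. ownership_spread is a hypothetical helper, the addresses in SAMPLE are made up, and the 1 percent threshold is taken from the paragraph above.

```python
import re

# Matches node lines such as "UN 10.0.0.1  21.33 KB  1  33.4%  ...".
NODE_LINE = re.compile(r"^[UD][NLJM]\s+(\S+)\s.*?(\d+(?:\.\d+)?)%")


def ownership_spread(status_output):
    """Return (max - min ownership, {address: ownership_percent})."""
    owns = {}
    for line in status_output.splitlines():
        match = NODE_LINE.match(line.strip())
        if match:
            owns[match.group(1)] = float(match.group(2))
    if not owns:
        raise ValueError("no node lines found")
    return max(owns.values()) - min(owns.values()), owns


# Made-up sample output for illustration only.
SAMPLE = """\
UN 10.0.0.1    21.33 KB    1    33.4%   a9fa31c7-f3c0-...   RAC1
UN 10.0.0.2    21.33 KB    1    33.3%   f5bb416c-db51-...   RAC1
UN 10.0.0.3    21.33 KB    1    33.3%   b836748f-c94f-...   RAC1
"""

if __name__ == "__main__":
    spread, owns = ownership_spread(SAMPLE)
    print("balanced" if spread <= 1.0 else "imbalanced", owns)
```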

Package installations: Start DataStax Enterprise as a service
Tarball installations: Start DataStax Enterprise as a standalone process

Check that the new cluster is up and running:

dsetool status

If DSE has problems starting, visit DataStax Support for troubleshooting articles on starting DSE.

Results

Datacenter: Cassandra
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address         Load        Tokens    Owns    Host ID             Rack
UN 110.82.155.0    21.33 KB    256       33.3%   a9fa31c7-f3c0-...   RAC1
UN 110.82.155.1    21.33 KB    256       33.3%   f5bb416c-db51-...   RAC1
UN 110.82.155.2    21.33 KB    256       16.7%   b836748f-c94f-...   RAC1
