Initializing single-token architecture datacenters

Follow these steps only when not using virtual nodes (vnodes).

In most circumstances, each workload type, such as search, analytics, and transactional, should be organized into separate virtual datacenters. Workload segregation avoids contention for resources. However, workloads can be combined in SearchAnalytics nodes when there is not a large demand for analytics, or when analytics queries must use a DSE Search index. Generally, combining transactional (OLTP) and analytics (OLAP) workloads results in decreased performance.

When you create a keyspace using CQL, DataStax Enterprise automatically creates a virtual datacenter for the cluster, even a one-node cluster. Assign nodes that run the same type of workload to the same datacenter. Separate virtual datacenters for different node types keep nodes that run DSE Search apart from nodes that run other workload types.

Where is the cassandra-topology.properties file?

The location of the cassandra-topology.properties file depends on the type of installation:

Installation Type                                            Location
Package installations + Installer-Services installations     /etc/dse/cassandra/cassandra-topology.properties
Tarball installations + Installer-No Services installations  <installation_location>/resources/cassandra/conf/cassandra-topology.properties

Where is the cassandra.yaml file?

The location of the cassandra.yaml file depends on the type of installation:

Installation Type                                            Location
Package installations + Installer-Services installations     /etc/dse/cassandra/cassandra.yaml
Tarball installations + Installer-No Services installations  <installation_location>/resources/cassandra/conf/cassandra.yaml

Where is the dse.yaml file?

The location of the dse.yaml file depends on the type of installation:

Installation Type                                            Location
Package installations + Installer-Services installations     /etc/dse/dse.yaml
Tarball installations + Installer-No Services installations  <installation_location>/resources/dse/conf/dse.yaml

Where is the cassandra-rackdc.properties file?

The location of the cassandra-rackdc.properties file depends on the type of installation:

Installation Type                                            Location
Package installations + Installer-Services installations     /etc/dse/cassandra/cassandra-rackdc.properties
Tarball installations + Installer-No Services installations  <installation_location>/resources/cassandra/conf/cassandra-rackdc.properties

Prerequisites

Complete the tasks outlined in Initializing a DataStax Enterprise cluster to prepare the environment.

Procedure

These steps provide information about setting up a cluster having one or more datacenters.

  1. Suppose you install DataStax Enterprise (DSE) on these nodes:

    • node0 10.168.66.41 (seed1)

    • node1 10.176.43.66

    • node2 10.168.247.41

    • node3 10.176.170.59 (seed2)

    • node4 10.169.61.170

    • node5 10.169.30.138

  2. Calculate the token assignments as described in Calculating tokens for single-token architecture nodes.

    The following tables list tokens for a 6-node cluster with a single datacenter or two datacenters.

    Single Datacenter
    Node   Token
    node0  0
    node1  21267647932558653966460912964485513216
    node2  42535295865117307932921825928971026432
    node3  63802943797675961899382738893456539648
    node4  85070591730234615865843651857942052864
    node5  106338239662793269832304564822427566080

    Multiple Datacenters
    Node   Token                                    Offset  Datacenter
    node0  0                                        NA      DC1
    node1  56713727820156410577229101238628035242   NA      DC1
    node2  113427455640312821154458202477256070485  NA      DC1
    node3  100                                      100     DC2
    node4  56713727820156410577229101238628035342   100     DC2
    node5  113427455640312821154458202477256070585  100     DC2
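
The values in the Multiple Datacenters table can be reproduced with a short calculation. The following is a minimal sketch, not a DSE tool. It assumes a token range of 0 to 2^127 (which the table values imply), evenly spaced tokens within each datacenter, and the offset of 100 for the second datacenter shown in the table. The function name dc_tokens is hypothetical.

```python
RING_SIZE = 2**127  # assumed token range (0 .. 2**127), inferred from the table values


def dc_tokens(node_count, offset=0):
    """Return evenly spaced tokens for one datacenter, shifted by offset."""
    # Integer division keeps full precision; floats cannot hold 39-digit tokens exactly.
    return [i * RING_SIZE // node_count + offset for i in range(node_count)]


dc1 = dc_tokens(3)              # node0, node1, node2 in DC1
dc2 = dc_tokens(3, offset=100)  # node3, node4, node5 in DC2

names = ["node0", "node1", "node2", "node3", "node4", "node5"]
for name, token in zip(names, dc1 + dc2):
    print(name, token)
```

Each printed value is the number to use as initial_token for the corresponding node in the multiple-datacenter layout.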

  3. If the nodes are behind a firewall, open the required ports for internal/external communication.

  4. If DataStax Enterprise is running, stop the node and clear the data:

    • Package installations:

      To stop DSE:

      sudo service dse stop

      To remove data from the default directories:

      sudo rm -rf /var/lib/cassandra/*

    • Tarball installations:

      From the installation location, stop the database:

      bin/dse cassandra-stop

      Remove all data:

      cd /var/lib/cassandra &&
      sudo rm -rf data/* commitlog/* saved_caches/* hints/*

  5. Configure properties in cassandra.yaml on each new node, following the configuration of the other nodes in the cluster.

    Use the yaml_diff tool to review and make appropriate changes to the cassandra.yaml and dse.yaml configuration files.

    1. Configure node properties.

      • initial_token: token_value_from_calculation

      • num_tokens: 1

      • -seeds: internal_IP_address of each seed node

        Include at least one seed node from each datacenter. DataStax recommends more than one seed node per datacenter. Do not make all nodes seed nodes.

      • listen_address: empty

        If not set, DSE asks the system for the local address, which is associated with its host name. In some cases, DSE does not produce the correct address, which requires specifying the listen_address.

      • auto_bootstrap: false

        Add the bootstrap setting only when initializing a new cluster with no data.

      • endpoint_snitch: snitch

        Do not use the DseSimpleSnitch (default). The DseSimpleSnitch is used only for single-datacenter deployments (or single-zone deployments in public clouds), and does not recognize datacenter or rack information.

        Snitch configuration files
        Snitch                       Configuration file
        GossipingPropertyFileSnitch  cassandra-rackdc.properties file
        Ec2Snitch                    cassandra-rackdc.properties file
        Ec2MultiRegionSnitch         cassandra-rackdc.properties file
        GoogleCloudSnitch            cassandra-rackdc.properties file
        PropertyFileSnitch           cassandra-topology.properties file

      • If using a cassandra.yaml or dse.yaml file from a previous version, check the Upgrade Guide for removed settings.
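
Across the nodes, the properties listed above differ mainly in initial_token, so it can help to generate the per-node fragment rather than type it by hand. The following is a minimal sketch under several assumptions: the helper name node_settings is hypothetical, GossipingPropertyFileSnitch is picked only as an example snitch, and the seed_provider layout follows the stock cassandra.yaml template. Compare the result with your own cassandra.yaml before using it.

```python
# Seed addresses taken from step 1: node0 (seed1) and node3 (seed2).
SEEDS = ["10.168.66.41", "10.176.170.59"]


def node_settings(initial_token, listen_address="",
                  snitch="GossipingPropertyFileSnitch"):
    """Render the per-node cassandra.yaml properties from this step as text."""
    seeds = ",".join(SEEDS)
    lines = [
        f"initial_token: {initial_token}",
        "num_tokens: 1",
        "seed_provider:",
        "    - class_name: org.apache.cassandra.locator.SimpleSeedProvider",
        "      parameters:",
        f'          - seeds: "{seeds}"',
        # An empty listen_address lets DSE ask the system for the local address.
        f"listen_address: {listen_address}".rstrip(),
        # Only appropriate when initializing a new cluster with no data.
        "auto_bootstrap: false",
        f"endpoint_snitch: {snitch}",
    ]
    return "\n".join(lines)


# Example: the fragment for node4 in the multiple-datacenter layout.
print(node_settings(56713727820156410577229101238628035342))
```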

  6. Set the properties in the dse.yaml file as required by your use case.

  7. In the cassandra-rackdc.properties (GossipingPropertyFileSnitch) or cassandra-topology.properties (PropertyFileSnitch) file, assign datacenter and rack names to the IP addresses of each node, and assign a default datacenter name and rack name for unknown nodes.

    Migration information: The GossipingPropertyFileSnitch always loads cassandra-topology.properties when the file is present. Remove the file from each node on any new cluster, or any cluster migrated from the PropertyFileSnitch.

    # Transactional Node IP=Datacenter:Rack
    110.82.155.0=DC_Transactional:RAC1
    110.82.155.1=DC_Transactional:RAC1
    110.54.125.1=DC_Transactional:RAC2
    110.54.125.2=DC_Analytics:RAC1
    110.54.155.2=DC_Analytics:RAC2
    110.82.155.3=DC_Analytics:RAC1
    110.54.125.3=DC_Search:RAC1
    110.82.155.4=DC_Search:RAC2
    
    # default for unknown nodes
    default=DC1:RAC1

    After making any changes in the configuration files, you must restart the node for the changes to take effect.
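
The IP=Datacenter:Rack format above, with its default entry for unknown nodes, can be checked before a restart. The following is a minimal sketch of that lookup in Python, not the snitch's own parsing code. The function names parse_topology and locate are hypothetical, and the sample uses a few of the entries shown above.

```python
TOPOLOGY = """\
# Transactional Node IP=Datacenter:Rack
110.82.155.0=DC_Transactional:RAC1
110.54.125.2=DC_Analytics:RAC1
110.54.125.3=DC_Search:RAC1

# default for unknown nodes
default=DC1:RAC1
"""


def parse_topology(text):
    """Map each key (an IP address or 'default') to a (datacenter, rack) pair."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        datacenter, _, rack = value.partition(":")
        mapping[key.strip()] = (datacenter.strip(), rack.strip())
    return mapping


def locate(mapping, ip):
    """Return the (datacenter, rack) for ip, falling back to the default entry."""
    return mapping.get(ip, mapping["default"])


topology = parse_topology(TOPOLOGY)
print(locate(topology, "110.54.125.2"))  # a listed node
print(locate(topology, "10.0.0.9"))      # an unknown node uses the default entry
```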

  8. After you have installed and configured DSE on all nodes, start the seed nodes one at a time, and then start the rest of the nodes.

  9. Check that your cluster is up and running:

    dsetool status

    If the cluster has problems starting, look for starting DSE troubleshooting and other articles in the Support Knowledge Center.

Results

Datacenter: Cassandra
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address         Load        Tokens    Owns    Host ID             Rack
UN 110.82.155.0    21.33 KB    256       33.3%   a9fa31c7-f3c0-...   RAC1
UN 110.82.155.1    21.33 KB    256       33.3%   f5bb416c-db51-...   RAC1
UN 110.82.155.2    21.33 KB    256       16.7%   b836748f-c94f-...   RAC1
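
In this output, the two-letter code at the start of each node line combines status (Up/Down) and state (Normal/Leaving/Joining/Moving), so UN means the node is up and normal. The following is a minimal sketch that scans status text for nodes that are not UN. It assumes the column layout shown above; the function name node_states is hypothetical, and the sample text is copied from the output above rather than captured from a live cluster.

```python
import re

SAMPLE = """\
Datacenter: Cassandra
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address         Load        Tokens    Owns    Host ID             Rack
UN 110.82.155.0    21.33 KB    256       33.3%   a9fa31c7-f3c0-...   RAC1
UN 110.82.155.1    21.33 KB    256       33.3%   f5bb416c-db51-...   RAC1
UN 110.82.155.2    21.33 KB    256       16.7%   b836748f-c94f-...   RAC1
"""

# Status letter (U or D), state letter (N, L, J, or M), then an IPv4 address.
NODE_LINE = re.compile(r"^([UD][NLJM])\s+(\d{1,3}(?:\.\d{1,3}){3})\s")


def node_states(status_text):
    """Return a dict of node address to its two-letter status/state code."""
    states = {}
    for raw in status_text.splitlines():
        match = NODE_LINE.match(raw.strip())
        if match:
            states[match.group(2)] = match.group(1)
    return states


states = node_states(SAMPLE)
unhealthy = [ip for ip, code in states.items() if code != "UN"]
print("nodes not Up/Normal:", unhealthy)
```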

Calculating tokens for single-token architecture nodes

When not using vnodes, use these steps to calculate tokens to evenly distribute data across a cluster.
