Configuring replication

Choose a data partitioner and replica placement strategy.

Cassandra replicates data, storing multiple copies on multiple nodes for reliability and fault tolerance. To configure replication, you need to choose a data partitioner and a replica placement strategy. Data partitioning determines how data is placed across the nodes in the cluster. For information about how this works, see Data distribution and replication. Nodes exchange information about replication and cluster state using the gossip protocol. Be sure to configure gossip, as described in About internode communications (gossip).
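
The gossip-related settings live in the cassandra.yaml file. The following is a minimal sketch only, using hypothetical addresses; see About internode communications (gossip) for the full list of properties.

# cassandra.yaml (sketch): minimal gossip-related settings
cluster_name: 'MyDemoCluster'          # nodes must share the same cluster name to gossip
listen_address: 10.0.0.3               # address this node gossips from (hypothetical)
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "10.0.0.1,10.0.0.2" # hypothetical seed nodes contacted at startup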

Virtual nodes 

Virtual nodes simplify many tasks in Cassandra: they eliminate the need to calculate and assign tokens (determine partition ranges), make rebalancing the cluster when adding or removing nodes automatic, and speed up replacing dead nodes. For a complete description of virtual nodes and how they work, see About virtual nodes, and the Virtual nodes in Cassandra 1.2 blog.

Attention: DataStax Enterprise 4.0 turns off virtual nodes (vnodes) by default. DataStax does not recommend turning on vnodes for Hadoop or Solr nodes, but you can use vnodes for any Cassandra-only cluster, or a Cassandra-only data center in a mixed Hadoop/Solr/Cassandra deployment.

Using virtual nodes

In the cassandra.yaml file, uncomment num_tokens and leave the initial_token parameter unset (see the sketch after this list). Guidelines for using virtual nodes include:

  • Determining the num_tokens value

    The initial recommended value for num_tokens is 256. For more guidance, see Setting up virtual nodes.

  • Migrating existing clusters

    To upgrade existing clusters to virtual nodes, see Enabling virtual nodes on an existing production cluster.

  • Using a mixed architecture

    Cassandra supports mixing virtual node-enabled and non-virtual node data centers in the same cluster. For example, a single cluster could have a Cassandra-only data center with vnodes enabled and a search data center without vnodes.
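
For example, the relevant cassandra.yaml settings for a vnode-enabled node look roughly like the following sketch (256 is the recommended starting value, and initial_token stays commented out):

# cassandra.yaml (sketch): enable virtual nodes
num_tokens: 256
# initial_token:    (leave unset when using vnodes)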

Disabling virtual nodes 

To disable virtual nodes:

  1. In the cassandra.yaml file, set num_tokens to 1.
    num_tokens: 1
  2. Uncomment the initial_token property and set it to 1 or to the value of a generated token for a multi-node cluster.
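
Putting both steps together, the cassandra.yaml settings for a node without vnodes look roughly like this sketch:

num_tokens: 1
initial_token: 1   # or a generated token for a multi-node cluster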

Using the single-token-per-node architecture in DSE 3.1 and above 

If you don't use virtual nodes, you must make sure that each node is responsible for roughly an equal amount of data. To do this, assign each node an initial_token value and calculate the tokens for each data center as described in Generating tokens in the DataStax Enterprise 3.0 documentation. You can also use the Murmur3Partitioner and calculate the tokens as described in Cassandra 1.2 Generating tokens, or with a short script such as the sketch below.
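
For example, you can compute evenly spaced Murmur3Partitioner tokens with a few lines of Python. This is a minimal sketch, assuming a hypothetical four-node data center; the Murmur3 token range runs from -2**63 to 2**63 - 1.

# Sketch: evenly spaced initial_token values for the Murmur3Partitioner
node_count = 4  # assumption: number of nodes in the data center

# Divide the full token range (-2**63 .. 2**63 - 1) into equal slices and
# use the start of each slice as a node's initial_token.
tokens = [i * (2**64 // node_count) - 2**63 for i in range(node_count)]
for token in tokens:
    print(token)

# Output:
# -9223372036854775808
# -4611686018427387904
# 0
# 4611686018427387904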

Partitioner settings 

You can use either the Murmur3Partitioner or RandomPartitioner with virtual nodes.

The Murmur3Partitioner (org.apache.cassandra.dht.Murmur3Partitioner) is the default partitioning strategy for new Cassandra clusters (1.2 and above) and the right choice for new clusters in almost all cases. You can only use Murmur3Partitioner for new clusters; you cannot change the partitioner in existing clusters. If you are switching to the 1.2 cassandra.yaml, be sure to change the partitioner setting to match the previous partitioner.

The RandomPartitioner (org.apache.cassandra.dht.RandomPartitioner) was the default partitioner prior to Cassandra 1.2. You can continue to use this partitioner when migrating to virtual nodes.
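
The partitioner is set in the cassandra.yaml file. For example (a sketch; keep whichever class your cluster already uses):

partitioner: org.apache.cassandra.dht.Murmur3Partitioner
# partitioner: org.apache.cassandra.dht.RandomPartitioner   # clusters created before 1.2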

Snitch settings 

A snitch determines which data centers and racks are written to and read from. It informs Cassandra about the network topology so that requests are routed efficiently and allows Cassandra to distribute replicas by grouping machines into data centers and racks. All nodes must have exactly the same snitch configuration.

The following sections describe three commonly used snitches. All available snitches are described in the Cassandra documentation. The default endpoint_snitch is the DseDelegateSnitch, which by default delegates to the DseSimpleSnitch (org.apache.cassandra.locator.DseSimpleSnitch). You set the snitch used by the DseDelegateSnitch in the dse.yaml file:

  • Packaged installs: /etc/dse/dse.yaml
  • Tarball installs: install_location/resources/dse/conf/dse.yaml
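
For example, to delegate to the PropertyFileSnitch described below, the dse.yaml entry looks roughly like the following. This is a sketch that assumes the delegated_snitch property; check your dse.yaml for the exact property name in your release.

delegated_snitch: org.apache.cassandra.locator.PropertyFileSnitch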

DseSimpleSnitch

Use DseSimpleSnitch only for development in DataStax Enterprise deployments. This snitch logically configures each type of node in separate data centers to segregate the analytics, real-time, and search workloads.

When defining your keyspace, use Analytics, Cassandra, or Search for your data center names.

SimpleSnitch

For a single data center (or single node) cluster, the SimpleSnitch is usually sufficient. However, if you plan to expand your cluster at a later time to multiple racks and data centers, it is easier if you use a rack and data center aware snitch from the start, such as the RackInferringSnitch.

PropertyFileSnitch

The PropertyFileSnitch allows you to define your data center and rack names to be whatever you want. Using this snitch requires that you define network details for each node in the cluster in the cassandra-topology.properties configuration file.

  • Packaged installs: /etc/dse/cassandra/cassandra-topology.properties
  • Tarball installs: install_location/resources/cassandra/conf/cassandra-topology.properties

Every node in the cluster should be described in this file, and the file should be exactly the same on every node in the cluster.

For example, suppose a cluster has non-uniform IPs, two physical data centers with two racks in each, and a third logical data center for replicating analytics data. You would specify them as follows:

# Data Center One

175.56.12.105=DC1:RAC1
175.50.13.200=DC1:RAC1
175.54.35.197=DC1:RAC1

120.53.24.101=DC1:RAC2
120.55.16.200=DC1:RAC2
120.57.102.103=DC1:RAC2

# Data Center Two

110.56.12.120=DC2:RAC1
110.50.13.201=DC2:RAC1
110.54.35.184=DC2:RAC1

50.33.23.120=DC2:RAC2
50.45.14.220=DC2:RAC2
50.17.10.203=DC2:RAC2

# Analytics Replication Group

172.106.12.120=DC3:RAC1
172.106.12.121=DC3:RAC1
172.106.12.122=DC3:RAC1

# default for unknown nodes 
default=DC3:RAC1

Make sure the data center names defined in the cassandra-topology.properties file match the data center names used in your keyspace definitions.
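
For example, a keyspace that places replicas in all three data centers defined above could be created as follows (the keyspace name and replica counts are illustrative only):

CREATE KEYSPACE test3
    WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'DC1' : 3, 'DC2' : 3, 'DC3' : 2 };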

Choosing keyspace replication options 

When you create a keyspace, you must define the replica placement strategy class and the number of replicas you want. DataStax recommends choosing NetworkTopologyStrategy for single and multiple data center clusters. This strategy is as easy to use as the SimpleStrategy and allows for expansion to multiple data centers in the future. It is much easier to configure the most flexible replication strategy up front than to reconfigure replication after you have already loaded data into your cluster.

NetworkTopologyStrategy takes as options the number of replicas you want per data center. Even for single data center clusters, you can use this replica placement strategy and just define the number of replicas for one data center. For example:

CREATE KEYSPACE test
    WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'us-east' : 6 };

For a single-node cluster, use the default data center name (Cassandra, Solr, or Analytics).

CREATE KEYSPACE test
    WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'Analytics' : 1 };

To define the number of replicas for a multiple data center cluster:

CREATE KEYSPACE test2
    WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'dc1' : 3, 'dc2' : 3 };

When creating the keyspace, the data center names you use depend on the snitch you have chosen for your cluster. The data center names must match the names assigned by the snitch in order for replicas to be placed in the correct locations.

As a general rule, the number of replicas should not exceed the number of nodes in a replication group. However, it is possible to increase the number of replicas, and then add the desired number of nodes afterwards. When the replication factor exceeds the number of nodes, writes will be rejected, but reads will still be served as long as the desired consistency level can be met.

The default consistency level is QUORUM.

Changing replication settings 

The default replication factor of 1 for keyspaces is suitable only for development and testing on a single node. For production environments, it is important to change the replication factor of keyspaces from 1 to a higher number. To avoid operational problems, changing the replication factor is especially important for these system keyspaces:

  • HiveMetaStore
  • cfs
  • cfs_archive

If the node is an Analytics node that uses Hive, increase the HiveMetaStore and cfs keyspace replication factors to 2 or higher to be resilient to single-node failures. If you use cfs_archive, increase it accordingly.

Procedure

To change the replication of keyspaces:

  1. Check the name of the data center of the node:
    • Packaged installs: nodetool status
    • Tarball installs: install_location/bin/nodetool status

    The output tells you the name of the data center for the node, for example, datacenter1.

  2. Change the replication of the cfs and cfs_archive keyspaces from 1 to 3, for example:
    ALTER KEYSPACE cfs
        WITH REPLICATION= {'class' : 'NetworkTopologyStrategy', 'dc1' : 3};
    ALTER KEYSPACE cfs_archive
        WITH REPLICATION= {'class' : 'NetworkTopologyStrategy', 'dc1' : 3};
    How high you increase the replication depends on the number of nodes in the cluster, as discussed in the previous section.
  3. If you use Hive, update the HiveMetaStore keyspace to increase its replication from 1 to 3, for example (see the example after this procedure).
  4. If the keyspaces you changed contain any data, run nodetool repair to avoid missing data problems or data unavailable exceptions.
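
For example, the HiveMetaStore update in step 3 might look like the following, assuming the data center is named dc1 as in step 2 (the quotes preserve the keyspace's mixed-case name):

ALTER KEYSPACE "HiveMetaStore"
    WITH REPLICATION = {'class' : 'NetworkTopologyStrategy', 'dc1' : 3};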