Single data center deployment per workload type
A deployment scenario with a mixed workload cluster has only one data center for each type of workload.
For example, a cluster with 3 Hadoop nodes, 3 Cassandra nodes, and 2 Solr nodes has 3 data centers, one for each type of workload. In contrast, a multiple data center cluster has more than one data center for each type of workload.
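As a small illustration of the example above, the following Python sketch groups nodes by workload to derive the data center layout. The node names and the `plan_data_centers` helper are hypothetical; only the workload counts come from the example.

```python
from collections import defaultdict

# Hypothetical inventory: 3 Hadoop, 3 Cassandra, and 2 Solr nodes.
NODES = {
    "node0": "Cassandra",
    "node1": "Cassandra",
    "node2": "Cassandra",
    "node3": "Hadoop",
    "node4": "Hadoop",
    "node5": "Hadoop",
    "node6": "Solr",
    "node7": "Solr",
}


def plan_data_centers(nodes):
    """Group node names by workload so each workload maps to one data center."""
    data_centers = defaultdict(list)
    for name, workload in sorted(nodes.items()):
        data_centers[workload].append(name)
    return dict(data_centers)


layout = plan_data_centers(NODES)
for data_center, members in layout.items():
    print(data_center, members)
```

Because every workload type maps to exactly one group, the number of data centers equals the number of distinct workloads.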
In Cassandra, a data center can be either a physical data center or a
virtual data center. Different workloads should use separate data centers,
either physical or virtual. Using separate data centers prevents Cassandra
transactions from being impacted by other workloads and keeps requests close to
each other for lower latency. Replication is set by data center and, depending on
the replication factor, data can be written to multiple data centers. However,
data centers should never span physical locations. In a single data center
deployment, data is replicated within its data center. For more information about
replication, see:
- Data replication
- Choosing keyspace replication options
- Replication in a physical or virtual data center (Applies only to the single-token-per-node architecture.)
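Because replication is set per data center, a keyspace definition names each data center together with its replication factor. The sketch below builds such a CQL statement in Python. The keyspace name `demo_ks`, the replication factor of 3, and the `create_keyspace_cql` helper are assumptions for illustration; the data center name `Cassandra` is taken from the sample output in Results.

```python
def create_keyspace_cql(keyspace, replication_by_dc):
    """Build a CREATE KEYSPACE statement that uses NetworkTopologyStrategy.

    replication_by_dc maps a data center name to its replication factor.
    """
    if not replication_by_dc:
        raise ValueError("at least one data center is required")
    options = ["'class': 'NetworkTopologyStrategy'"]
    for data_center, factor in replication_by_dc.items():
        if factor < 1:
            raise ValueError(f"replication factor for {data_center} must be >= 1")
        options.append(f"'{data_center}': {factor}")
    body = ", ".join(options)
    return "CREATE KEYSPACE " + keyspace + " WITH REPLICATION = {" + body + "};"


# Data is replicated only within the (hypothetical) real-time data center.
statement = create_keyspace_cql("demo_ks", {"Cassandra": 3})
print(statement)
```

Listing a single data center in the replication map keeps all replicas of that keyspace inside it, which matches the single data center case described above.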
Prerequisites
- A good understanding of how Cassandra works. Be sure to read at least Understanding the architecture, Data Replication, and Cassandra's rack feature.
- DataStax Enterprise is installed on each node.
- Choose a name for the cluster.
- For a mixed-workload cluster, determine the purpose of each node.
- Determine the snitch and replication strategy. The GossipingPropertyFileSnitch and NetworkTopologyStrategy are recommended for production environments.
- Get the IP address of each node.
- Determine which nodes are seed nodes. Do not make all nodes seed nodes. See Internode communications (gossip).
- Other possible configuration settings are described in the cassandra.yaml configuration file and property files such as cassandra-rackdc.properties.
- Set virtual nodes correctly for the type of data center. DataStax recommends using virtual nodes only on data centers running Cassandra real-time workloads. See Virtual nodes.
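The planning steps above can be checked with a short script before any configuration file is edited. This Python sketch renders the `dc` and `rack` entries that the GossipingPropertyFileSnitch reads from cassandra-rackdc.properties, and validates a seed list against the rule that not all nodes should be seeds. The IP addresses and data center names come from the sample output in Results; the choice of seeds and both helper functions are assumptions for illustration.

```python
# Planned nodes as (IP address, data center, rack).
NODES = [
    ("110.82.155.0", "Cassandra", "RAC1"),
    ("110.82.155.1", "Cassandra", "RAC1"),
    ("110.82.155.2", "Cassandra", "RAC1"),
    ("110.82.155.3", "Analytics", "RAC1"),
    ("110.82.155.4", "Analytics", "RAC1"),
    ("110.82.155.5", "Analytics", "RAC1"),
    ("110.82.155.6", "Solr", "RAC1"),
    ("110.82.155.7", "Solr", "RAC1"),
]
SEEDS = ["110.82.155.0", "110.82.155.3"]  # assumed choice, not a recommendation


def rackdc_properties(data_center, rack):
    """Render the cassandra-rackdc.properties content for one node."""
    return f"dc={data_center}\nrack={rack}\n"


def seeds_value(node_ips, seed_ips):
    """Validate the seed list and return it as a comma-delimited string."""
    nodes, seeds = set(node_ips), set(seed_ips)
    if not seeds:
        raise ValueError("at least one seed node is required")
    unknown = seeds - nodes
    if unknown:
        raise ValueError(f"seeds are not cluster nodes: {sorted(unknown)}")
    if len(nodes) > 1 and seeds == nodes:
        raise ValueError("do not make all nodes seed nodes")
    # Keep the order in which the nodes were planned.
    return ",".join(ip for ip in node_ips if ip in seeds)


all_ips = [ip for ip, _, _ in NODES]
print("seeds:", seeds_value(all_ips, SEEDS))
for ip, data_center, rack in NODES:
    print(f"# {ip}")
    print(rackdc_properties(data_center, rack), end="")
```

The comma-delimited string is the form the seed list takes in cassandra.yaml, and every node in the cluster should receive the same seed list.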
Procedure
This configuration example describes installing an 8-node cluster spanning 2 racks in a single data center. The default consistency level is QUORUM.
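To see what QUORUM means for a given replication factor, the sketch below applies the usual Cassandra definition: a quorum is a majority of replicas, calculated as the sum of the replication factors divided by 2, rounded down, plus 1. The replication factor of 3 used in the example is an assumption, not a value from this procedure.

```python
def quorum(replication_factor_sum):
    """Return the number of replicas that must respond at QUORUM."""
    if replication_factor_sum < 1:
        raise ValueError("replication factor sum must be >= 1")
    return replication_factor_sum // 2 + 1


def tolerated_failures(replication_factor_sum):
    """Return how many replicas can be down while QUORUM still succeeds."""
    return replication_factor_sum - quorum(replication_factor_sum)


rf = 3  # assumed replication factor for illustration
print(f"RF={rf}: quorum={quorum(rf)}, tolerated failures={tolerated_failures(rf)}")
```

With a replication factor of 3, two replicas must acknowledge each QUORUM request, so one replica can be unavailable without failing the request.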
Results
Datacenter: Cassandra
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load      Tokens  Owns   Host ID            Rack
UN  110.82.155.0  21.33 KB  256     33.3%  a9fa31c7-f3c0-...  RAC1
UN  110.82.155.1  21.33 KB  256     33.3%  f5bb416c-db51-...  RAC1
UN  110.82.155.2  21.33 KB  256     16.7%  b836748f-c94f-...  RAC1

Datacenter: Analytics
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load      Owns   Host ID             Tokens       Rack
UN  110.82.155.3  28.44 KB  13.0%  e2451cdf-f070- ...  -922337....  RAC1
UN  110.82.155.4  44.47 KB  16.7%  f9fa427c-a2c5- ...  30745512...  RAC1
UN  110.82.155.5  54.33 KB  23.6%  b9fc31c7-3bc0- ..-  45674488...  RAC1

Datacenter: Solr
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load      Owns   Host ID             Tokens       Rack
UN  110.82.155.6  15.44 KB  50.2%  e2451cdf-f070- ...  9243578....  RAC1
UN  110.82.155.7  18.78 KB  49.8%  e2451cdf-f070- ...  10000        RAC1