A deployment scenario for a Cassandra cluster with a single data center.
This topic describes how to deploy a Cassandra cluster with a single data center.
Prerequisites
Each node must be correctly configured before starting the cluster.
You must determine or perform the following before starting the cluster:
- Install Cassandra on each node.
- Choose a name for the cluster.
- Get the IP address of each node.
- Determine which nodes will be seed nodes. (Cassandra nodes use the seed node list to find each other and learn the topology of the ring.)
- Determine the snitch.
- If using multiple data centers, determine a naming convention for each data
center and rack, for example: DC1, DC2 or 100, 200 and RAC1, RAC2 or R101,
R102.
- Other possible configuration settings are described in The cassandra.yaml configuration file.
This example describes installing a six-node cluster spanning two racks in a single
data center. Each node is configured to use the RackInferringSnitch (multiple-rack
aware) and 256 virtual nodes (vnodes). It is recommended to have more than one
seed node per data center.
In Cassandra, the term data center refers to a grouping of nodes. Data
center is synonymous with replication group, that is, a grouping of nodes configured
together for replication purposes.
Procedure
- Suppose you install Cassandra on these nodes:
node0 110.82.155.0 (seed1)
node1 110.82.155.1
node2 110.82.155.2
node3 110.82.156.3 (seed2)
node4 110.82.156.4
node5 110.82.156.5
It is a best practice to have more than one seed node
per data center.
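The RackInferringSnitch infers the data center from the second octet of each node's IP address and the rack from the third octet. As an illustration only (Cassandra performs this mapping internally), the following sketch shows the topology that results from the example addresses above:

```shell
# RackInferringSnitch treats the 2nd octet as the data center and the
# 3rd octet as the rack. Print the resulting topology for the example nodes.
for ip in 110.82.155.0 110.82.155.1 110.82.155.2 \
          110.82.156.3 110.82.156.4 110.82.156.5; do
  dc=$(echo "$ip" | cut -d. -f2)
  rack=$(echo "$ip" | cut -d. -f3)
  echo "$ip -> data center $dc, rack $rack"
done
```

All six nodes land in data center 82, with nodes 0-2 in rack 155 and nodes 3-5 in rack 156, matching the two-rack, single-data-center layout this example describes.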
- If you have a firewall running on the nodes in your cluster, you must open
certain ports to allow communication between the nodes. See Configuring firewall port access.
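The ports involved are 7000 (inter-node cluster communication), 7001 (SSL inter-node communication), 7199 (JMX), 9042 (CQL native transport), and 9160 (Thrift clients). As a sketch, assuming an iptables-based firewall and this example's 110.82.0.0/16 network, the following prints the rules so you can review them before applying each one with sudo:

```shell
# Generate iptables rules that open Cassandra's ports to the cluster
# network (assumption: 110.82.0.0/16 covers all nodes in this example).
# Review the output, then apply each rule on every node.
for port in 7000 7001 7199 9042 9160; do
  echo "iptables -A INPUT -p tcp --dport $port -s 110.82.0.0/16 -j ACCEPT"
done
```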
- If Cassandra is running, stop the server and clear the data:
  Packaged installs:
  - Stop Cassandra:
    $ sudo service cassandra stop
  - Clear the data:
    $ sudo rm -rf /var/lib/cassandra/*
  Tarball installs:
  - Stop Cassandra:
    $ ps auwx | grep cassandra
    $ sudo kill <pid>
  - Clear the data:
    $ cd <install_location>
    $ sudo rm -rf /var/lib/cassandra/*
- Modify the following property settings in the cassandra.yaml file for each node:
  - num_tokens: <recommended value: 256>
  - -seeds: <internal IP address of each seed node>
  - listen_address: <IP address of the node> (the address other nodes use to connect to this node)
  - endpoint_snitch: <name of snitch> (See endpoint_snitch.)
  - auto_bootstrap: false (Add this setting only when initializing a fresh cluster with no data.)
node0
cluster_name: 'MyDemoCluster'
num_tokens: 256
seed_provider:
  - class_name: org.apache.cassandra.locator.SimpleSeedProvider
    parameters:
      - seeds: "110.82.155.0,110.82.156.3"
listen_address: 110.82.155.0
rpc_address: 0.0.0.0
endpoint_snitch: RackInferringSnitch
node1 to node5
The properties for these nodes are the same as node0's except for
listen_address, which must be set to each node's own IP address.
- After you have installed and configured Cassandra on all nodes, start the seed
nodes one at a time, and then start the rest of the nodes.
If a node has restarted because of automatic restart, you must stop the
node and clear the data directories, as described above.
For packaged installs, run the following command:
$ sudo service cassandra start
For tarball installs, run the following commands:
$ cd <install_location>
$ bin/cassandra
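The start order above can be sketched as a script that prints the commands in sequence, seed nodes first. This assumes the example hostnames and packaged installs; adapt it to ssh directly or to your configuration management tooling:

```shell
# Print start commands in the required order: seed nodes first
# (node0 and node3 in this example), then the remaining nodes.
seeds="node0 node3"
others="node1 node2 node4 node5"
for host in $seeds $others; do
  echo "ssh $host 'sudo service cassandra start'"
done
```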
- To check that the ring is up and running, run the nodetool status command.
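Output similar to the following (abridged and illustrative; your values will differ) indicates a healthy ring. UN means Up/Normal; with the RackInferringSnitch, the data center and rack names are the inferred IP octets:

```
Datacenter: 82
==============
Status=Up/Down |/ State=Normal/Leaving/Joining/Moving
--  Address        Load  Tokens  Owns  Host ID  Rack
UN  110.82.155.0   ...   256     ...   ...      155
UN  110.82.155.1   ...   256     ...   ...      155
...
```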