Use Cassandra or DSE in Kubernetes with Kubernetes Operator for Apache Cassandra
This topic explains how to use your configured and provisioned Apache Cassandra® or DSE cluster in Kubernetes.
Prerequisites
-
An existing Kubernetes (K8s) cluster.
If you have not already done so, create a Kubernetes cluster.
-
Application of the example operator configuration, as shown in Configure Cassandra or DSE in Kubernetes with Kubernetes Operator for Apache Cassandra.
-
Successful completion of the steps to provision and deploy a Cassandra or DSE cluster in your existing Kubernetes environment.
Connecting from inside the Kubernetes cluster
For an example of invoking cqlsh from inside a Kubernetes cluster, refer to Connect to Cassandra with cqlsh within Kubernetes cluster.
Kubernetes Operator for Apache Cassandra makes a Kubernetes headless service available at <clusterName>-<datacenterName>-service. Any client app that submits CQL commands inside the Kubernetes cluster needs to connect to this service and use the nodes in a round-robin fashion as contact points.
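As a minimal sketch of the naming pattern above, the following Java helper builds the service name from a cluster name and a datacenter name. The class and method names are hypothetical, and the fully qualified form assumes the default Kubernetes cluster domain (cluster.local), which your cluster may override.

```java
public final class ServiceNames {
    private ServiceNames() {}

    // Short name, following the pattern <clusterName>-<datacenterName>-service.
    static String headlessService(String clusterName, String datacenterName) {
        return clusterName + "-" + datacenterName + "-service";
    }

    // Fully qualified DNS name. Assumption: the cluster uses the default
    // "cluster.local" domain; adjust if your cluster is configured differently.
    static String qualified(String clusterName, String datacenterName, String namespace) {
        return headlessService(clusterName, datacenterName)
                + "." + namespace + ".svc.cluster.local";
    }

    public static void main(String[] args) {
        // Example values taken from the commands used elsewhere in this topic.
        System.out.println(headlessService("cluster1", "dc1"));
        System.out.println(qualified("cluster1", "dc1", "my-db-ns"));
    }
}
```

A client running in the same namespace as the datacenter can typically use the short name. A client in a different namespace would need the qualified form.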
When you use the DataStax Java driver in an app, and prepare to connect the driver to a DSE cluster from within the Kubernetes cluster, there are two overall choices. The Java driver accepts either:
-
multiple Inet[Socket]Address parameters to connect.
-
one Inet[Socket]Address parameter to connect. In this case the Java driver uses it as the control connection, and talks directly to the cluster to discover the other nodes and to connect to them.
However, with version 4.x of the DataStax Java driver, when you specify a hostname in the contact points definition in the config file, the Java driver first resolves every host associated with the hostname. In the case of the cluster deployed by the Kubernetes Operator for Apache Cassandra, using the hostname of the Kubernetes service (example: cluster1-dc1-service) resolves to every IP address associated with the DSE cluster; that is, all of the IPs of the DSE nodes. The DataStax Java driver chooses one of those nodes as the control connection, connects to the other resolved nodes, and performs cluster verification to all of its connected local DSE nodes.
For example, if you are programmatically configuring the DataStax Java driver, your app can use one of the following:
-
InetAddress.getByName("cluster1-dc1-service") to resolve only one host, which the driver uses at init time to connect to DSE and discover the rest of the nodes.
-
InetAddress.getAllByName("cluster1-dc1-service"), which resolves all the nodes directly. The driver uses this setting as if you specified the multiple IP addresses of the nodes in the contact points.
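The two resolution choices can be sketched with the standard java.net classes alone. The class name, method names, and the use of port 9042 (the default CQL native port) are assumptions for illustration. The resulting list of InetSocketAddress values is what you would then hand to the driver as contact points.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

public final class ContactPoints {
    // Assumption: the cluster listens on the default CQL native port.
    static final int CQL_PORT = 9042;

    private ContactPoints() {}

    // Resolves a single address; the driver would discover the other nodes itself.
    static List<InetSocketAddress> resolveOne(String host, int port)
            throws UnknownHostException {
        List<InetSocketAddress> points = new ArrayList<>();
        points.add(new InetSocketAddress(InetAddress.getByName(host), port));
        return points;
    }

    // Resolves every address behind the name; for a headless service this
    // is one address per pod backing the service.
    static List<InetSocketAddress> resolveAll(String host, int port)
            throws UnknownHostException {
        List<InetSocketAddress> points = new ArrayList<>();
        for (InetAddress address : InetAddress.getAllByName(host)) {
            points.add(new InetSocketAddress(address, port));
        }
        return points;
    }

    public static void main(String[] args) throws UnknownHostException {
        // Inside the Kubernetes cluster you would pass cluster1-dc1-service;
        // "localhost" is only a stand-in so the sketch runs anywhere.
        String host = args.length > 0 ? args[0] : "localhost";
        System.out.println("single: " + resolveOne(host, CQL_PORT));
        System.out.println("all:    " + resolveAll(host, CQL_PORT));
    }
}
```

Because pod IP addresses can change when pods restart, resolving the service name at connection time is generally safer than caching the resolved addresses.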
Connecting from outside the Kubernetes cluster
When Cassandra or DSE runs within a Kubernetes cluster, client applications may need to access it from outside the cluster. Connecting to a Cassandra cluster running within Kubernetes can range from trivial to complex, depending on where the client is running, on latency requirements, and on security requirements. See Connect to Cassandra and apps from outside the Kubernetes cluster.
Scaling up the datacenter
The size parameter on the CassandraDatacenter determines how many Cassandra or DSE instances are present in the datacenter. To add more nodes, edit the YAML file as described in the steps of the provisioning topic. Then reapply the CassandraDatacenter configuration using the same command as shown in that topic:
kubectl -n my-db-ns apply -f ./cluster1-dc1.yaml
When you reapply the YAML configuration file with the additional nodes defined, Kubernetes Operator for Apache Cassandra restarts and Kubernetes adds the pods to your datacenter, provided there are sufficient Kubernetes worker nodes available.
As part of the scaling up process, each rack in the Kubernetes cluster must contain the same number of server instances.
Changing the server configuration
To change the Cassandra or DSE configuration, update the CassandraDatacenter resource by editing the config section under the spec key. Then reapply the CassandraDatacenter configuration using this command:
kubectl -n my-db-ns apply -f ./cluster1-dc1.yaml
Kubernetes Operator for Apache Cassandra updates the configuration and restarts one node at a time in a rolling fashion.
Establishing a multi-datacenter cluster
To make a multi-datacenter cluster, create two CassandraDatacenter resources and give them the same clusterName in the spec.
However, multi-region clusters and advanced workloads are not supported, which makes many multi-datacenter use cases inappropriate for Kubernetes Operator for Apache Cassandra.
Using kubectl to monitor resources in the Kubernetes cluster
Use kubectl commands to get more information about the Cassandra or DSE pods running in the Kubernetes cluster.
-
To get information about ongoing or recent events:
kubectl get event --all-namespaces
By default, Kubernetes gives each event a Time to Live (TTL) of only one hour.
-
To check for errors in the Kubernetes log for your operator’s instance, use kubectl logs. First, get the instance name by using the kubectl get pod command and specifying your namespace. For example:
kubectl -n my-db-ns get pod
Sample output:
NAME                            READY   STATUS    RESTARTS   AGE
cass-operator-f74447c57-kdf2p   1/1     Running   0          13m
gke-cluster1-dc1-r1-sts-0       1/1     Running   0          5m38s
gke-cluster1-dc1-r2-sts-0       1/1     Running   0          42s
gke-cluster1-dc1-r3-sts-0       1/1     Running   0          6m7s
-
Then use kubectl logs. The log entries may be large; consider writing the output to a file. For example:
kubectl -n my-db-ns logs cass-operator-f74447c57-kdf2p > ~/cass-operator-log.txt
To tail the Cassandra or DSE logs, use a command such as:
kubectl -n my-db-ns logs --container server-system-logger --follow gke-cluster1-dc1-r1-sts-0
-
You can also use the kubectl describe pod command to get identifying information about your pod. For example:
kubectl -n my-db-ns describe pod cass-operator-f74447c57-kdf2p
Sample output:
Name:           cass-operator-f74447c57-kdf2p
Namespace:      my-db-ns
Priority:       0
Node:           ip-10-101-34-70.srv101.myinternal.org/10.101.34.70
Start Time:     Wed, 26 May 2021 23:39:42 -0600
Labels:         name=cass-operator
                pod-template-hash=f74447c57
Annotations:    <none>
Status:         Running
IP:             10.244.2.2
IPs:
  IP:           10.244.2.2
Controlled By:  ReplicaSet/cass-operator-f74447c57
Containers:
  dse-operator:
    Container ID:   docker://bacfba382ed6be8893a0c344089d40fbb6c36db93a3e3677464390dd358fef35
    Image:          datastax/cass-operator:1.7.1-20210526
    Image ID:       docker-pullable://datastax/cass-operator@sha256:4e80f26c54594133a99adefc9e2e7e9b2b5915788d8c6b24457407e2d470a36a
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Wed, 26 May 2021 23:39:51 -0600
    Ready:          True
    Restart Count:  0
    Environment:
      WATCH_NAMESPACE:  my-db-ns (v1:metadata.namespace)
      POD_NAME:         cass-operator-f74447c57-kdf2p (v1:metadata.name)
      OPERATOR_NAME:    cass-operator
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from cass-operator-token-q9hq5 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  cass-operator-token-q9hq5:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  cass-operator-token-q9hq5
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:          <none>
What’s next
Learn how to use the metric reporter dashboards for Cassandra or DSE clusters in Kubernetes.