Connect to Cassandra and apps from outside the Kubernetes cluster

When applications run within a Kubernetes cluster, you often need to access those services from outside the cluster.

The process to connect to an Apache Cassandra® cluster running within Kubernetes can range from trivial to complex depending on where the client is running, latency requirements, and security requirements.

This topic describes how to connect to Cassandra resources and applications running in Kubernetes from outside the cluster. It is assumed that your Cassandra cluster is already up and reported as running.

Pod access

Any pod running within a Kubernetes cluster may communicate with any other pod, provided the container network policies permit it. Most communication and service discovery within a Kubernetes cluster is not an issue.

Network supported direct access

One method for communicating with Cassandra pods involves having Kubernetes run in an environment where the pod network address space is known and advertised, with routes at the network layer. In these types of environments, Border Gateway Protocol (BGP) and static routes may be defined at layer 3 in the Open Systems Interconnection (OSI) model.

This allows IP traffic to be routed directly to pods and services running within Kubernetes from both inside and outside the cluster. Additionally, this approach allows for the consumption of service addresses externally. Unfortunately, this requires an advanced understanding of both Kubernetes networking and the infrastructure available within the enterprise or cloud where it is hosted.

- Pros: Zero additional configuration within the application; works inside and outside of the Kubernetes network.
- Cons: Requires configuration at the networking layer within the cloud or enterprise environment; not all environments can support this approach. Some cloud environments do not have the tooling exposed for users to enable this functionality.

Host network configuration

Host Network configuration exposes all of the host’s network interfaces to the pod instead of a single virtual interface.

This allows Cassandra to bind on the worker’s interface with an externally accessible IP. Any container that is launched as part of the pod has access to the host’s interface; it cannot be fenced off to a specific container.

To enable this behavior, pass hostNetwork: true in the podTemplateSpec at the top level.
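
As a rough illustration of where that key sits, the fragment below sketches a CassandraDatacenter resource with host networking enabled. The resource name `dc1` is hypothetical, the other required fields are omitted, and the empty `containers` list is an assumption about what the schema expects when a podTemplateSpec is supplied; check the CassandraDatacenter reference for your operator version before using it.

```yaml
# Sketch only: a partial CassandraDatacenter, not a complete manifest.
apiVersion: cassandra.datastax.com/v1beta1
kind: CassandraDatacenter
metadata:
  name: dc1                 # hypothetical datacenter name
spec:
  # clusterName, size, storageConfig, and other required fields omitted
  podTemplateSpec:
    spec:
      hostNetwork: true     # pod shares the worker's network interfaces
      containers: []        # assumption: the pod spec schema requires this key
```
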
- Pros: External connectivity is possible as the service is available at the node’s IP instead of an IP internal to the Kubernetes cluster.
- Cons:
  - If a pod is rescheduled, then the IP address of the pod can change.
  - In some Kubernetes distributions this configuration is a privileged operation.
  - Additional automation would be required to identify the appropriate IP and set it for listen_address and broadcast_address.
  - Only one Cassandra pod may be started per worker, regardless of the allowMultiplePodsPerWorker setting.

Host Port configuration

Host Port configuration is similar to host network, but instead of being applied at the pod level, Host Port is applied to specified containers within the pod. For each port listed in the container’s ports block, a hostPort: external_port key-value pair is included. The external_port is the port number on the Kubernetes worker that should be forwarded to this container’s port. Cassandra Operator doesn’t allow modifying the cassandra container with podTemplateSpec, so configuring this value isn’t possible without patching each rack’s stateful set.

- Pros: External connectivity is possible as the service is available at the node’s IP instead of an IP internal to the Kubernetes cluster. It is easier to configure because you don’t need a separate container to determine the appropriate IP.
- Cons:
  - If a pod is rescheduled then the IP address of the pod can change.
  - In some Kubernetes distributions this configuration is a privileged operation.
  - Only one Cassandra pod may be started per worker, regardless of the allowMultiplePodsPerWorker setting.
  - Not recommended according to Kubernetes Configuration Best Practices.
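
To make the hostPort key concrete, here is a minimal sketch of the container section of a pod template, as it might appear after patching a rack’s stateful set. The port name `native` is hypothetical, and 9042 is used because it is Cassandra’s default CQL native transport port; your cluster may expose different ports.

```yaml
# Sketch only: the containers section of a pod template, not a full StatefulSet.
containers:
  - name: cassandra
    ports:
      - name: native          # hypothetical port name
        containerPort: 9042   # default CQL native transport port
        hostPort: 9042        # port opened on the Kubernetes worker
```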

Services exposed by the Kubernetes Operator for Apache Cassandra

If the application is running within the same Kubernetes cluster as the Cassandra cluster, connectivity is straightforward. The Cassandra Operator exposes a number of services representing a Cassandra cluster, datacenters, and seeds. Applications running within the same Kubernetes cluster may leverage these services to discover and identify pods within the target Cassandra cluster.

Unlike internal apps, external apps do not have access to this information through DNS. It is possible to forward DNS requests to Kubernetes from outside the cluster and resolve configured services. Unfortunately, this approach returns the internal pod IP addresses, which are not routable from outside the cluster unless Network Supported Direct Access is possible within the environment. In most scenarios, external applications are not able to leverage the exposed services from the Cassandra Operator.

Exposing a Load Balancer service

It is possible to configure a service within Kubernetes, separate from those provided by the Cassandra Operator, that is accessible from outside of the Kubernetes cluster. These services have a type: LoadBalancer key in the spec: block. In most cloud environments, this configuration results in a native cloud load balancer being provisioned to point at the appropriate pods with an external IP. Once the load balancer is provisioned and running, kubectl get svc displays the external IP address that is pointed at the Cassandra nodes.

Pros: Available from outside the Kubernetes cluster.
Cons:
- Requires use of an AddressTranslator client side to restrict attempts by the drivers to connect directly with pods, and instead to direct connections to the load balancer.
- Removes the possibility of a TokenAwarePolicy Load Balancing Policy (LBP).
- Does not support Transport Layer Security (TLS) termination at the service layer, but rather within the application.
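
A Service of this kind might look like the sketch below. The service name and the cluster name `cluster1` are hypothetical, and the selector assumes the operator labels Cassandra pods with `cassandra.datastax.com/cluster`; confirm the labels on your pods (for example with kubectl get pods --show-labels) before relying on it.

```yaml
# Sketch only: an externally reachable Service in front of the Cassandra pods.
apiVersion: v1
kind: Service
metadata:
  name: cassandra-external                     # hypothetical service name
spec:
  type: LoadBalancer
  selector:
    cassandra.datastax.com/cluster: cluster1   # assumed pod label and value
  ports:
    - name: native
      port: 9042                               # port exposed by the load balancer
      targetPort: 9042                         # default CQL native transport port
```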

Ingress

For Ingress examples, see the Cassandra Operator GitHub repository.

Ingress is a feature that forwards requests to services running within a Kubernetes cluster based on rules. These rules may include specifying the protocol, the port, or even the path. Rules may provide additional functionality such as termination of SSL or TLS traffic, load balancing across a number of protocols, and name-based virtual hosting.

Behind the Ingress Kubernetes type is an Ingress Controller. There are a number of controllers available with varying features to service the defined ingress rules. Think of Ingress as an interface for routing and an Ingress Controller as the implementation of that interface. In this way, any number of Ingress Controllers may be used based on the workload requirements.

Ingress Controllers function at Layer 4 and 7 of the OSI model. At creation, the Ingress specification focused specifically on HTTP or HTTPS workloads. From the documentation: "An Ingress does not expose arbitrary ports or protocols. Exposing services other than HTTP and HTTPS to the internet typically uses a service of type Service.Type=NodePort or Service.Type=LoadBalancer."

Cassandra workloads do not use HTTP as a protocol, but rather a specific TCP protocol. Leveraging Ingress Controllers requires support for TCP load balancing. This approach provides routing semantics similar to those of LoadBalancer Services.

If the Ingress Controller also supports SSL termination with Server Name Indication (SNI), then secure access is possible from outside the cluster while keeping Token Aware routing support. Additionally, operators should consider whether the chosen Ingress Controller supports client SSL certificates, allowing for Mutual TLS to restrict access from unauthorized clients.

Pros:
- Highly-available entry point into the cluster.
- Some implementations support TCP load balancing.
- Some implementations support Mutual TLS (mTLS).
- Some implementations support SNI.

Cons:
- No standard implementation. Requires careful selection.
- Initially designed for HTTP or HTTPS only workloads.
  - Many ingresses support pure TCP workloads, but it is not defined in the original design specification.
  - Some configurations require intensive templating of base configuration files. This can make it more difficult to upgrade those components.
- Only some implementations support TCP load balancing.
- Only some implementations support mTLS.
- Only some implementations support SNI with TCP workloads.

Kong as an Ingress

Kong is an open-source API gateway. Built for multi-cloud and hybrid, Kong is optimized for microservices and distributed architectures. Kong does not have to be deployed on Kubernetes; it supports a multitude of environments. The DataStax GitHub-hosted sample installs Kong as an Ingress for a Kubernetes cluster.

For examples, see the following:

Traefik as an Ingress Controller

Traefik is an open-source Edge Router that is designed to work in a number of environments, and not just Kubernetes. When running on Kubernetes, Traefik is generally installed as an Ingress Controller. Traefik supports TCP load balancing along with SSL termination and SNI. It is automatically included as the default Ingress Controller of K3s and K3d.
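
As a hedged sketch of what TCP routing through Traefik can look like, the fragment below uses Traefik’s IngressRouteTCP custom resource to forward plain TCP traffic to a Cassandra service. The route name, the `cassandra` entry point, and the service name `cluster1-dc1-service` are assumptions for illustration, and the apiVersion depends on the Traefik release installed.

```yaml
# Sketch only: plain TCP routing without TLS; names are illustrative.
apiVersion: traefik.containo.us/v1alpha1   # newer Traefik releases use traefik.io/v1alpha1
kind: IngressRouteTCP
metadata:
  name: cassandra-native                   # hypothetical route name
spec:
  entryPoints:
    - cassandra                            # hypothetical entry point from Traefik's static configuration
  routes:
    - match: HostSNI(`*`)                  # catch-all match, required for non-TLS TCP
      services:
        - name: cluster1-dc1-service       # assumed name of the datacenter service
          port: 9042                       # default CQL native transport port
```

With TLS enabled, the HostSNI matcher can name a specific host instead of the catch-all, which is what allows SNI-based routing to individual nodes.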

For examples, see the following:
