Control plane and data plane operators

This topic explains how operators interact with the control plane and data plane in Mission Control, including their roles, responsibilities, and operational boundaries.

Mission Control uses a multi-operator architecture, where different operators manage distinct aspects of the platform and database lifecycle. It’s helpful to understand the relationship between these operators and the control/data plane separation for effective platform management, troubleshooting, and architectural planning.

Architecture layers

Mission Control operates across two distinct architectural layers:

  • Control plane: This centralized management layer orchestrates database operations across all managed clusters. The control plane includes the Mission Control UI, API, and core operators that manage cluster lifecycle, configuration, and operations.

  • Data plane: This distributed execution layer runs database workloads. Each data plane consists of Kubernetes clusters that host database nodes, along with local operators that manage the database resources within those clusters.

Both the control plane and data plane include operators that provide automation capabilities. Operators listen for changes in custom resources and reconcile the desired state with the actual state of the system.
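As a rough illustration of the reconciliation pattern described above, the following Python sketch models a single operator loop. It is not Mission Control code: the `CustomResource` class, the `size` and `readyReplicas` fields, and the one-node-per-step behavior are all assumptions chosen to keep the example small.

```python
from dataclasses import dataclass, field


@dataclass
class CustomResource:
    """Toy stand-in for a Kubernetes custom resource (hypothetical shape)."""
    spec: dict
    status: dict = field(default_factory=dict)


def reconcile(resource: CustomResource, actual_replicas: int) -> int:
    """Move one step toward the desired replica count and record the result.

    Returns the new actual replica count.
    """
    desired = resource.spec["size"]
    if actual_replicas < desired:
        actual_replicas += 1  # assumption: scale up one node per pass
    elif actual_replicas > desired:
        actual_replicas -= 1  # assumption: scale down one node per pass
    # Report the observed state back through the status section.
    resource.status["readyReplicas"] = actual_replicas
    resource.status["ready"] = actual_replicas == desired
    return actual_replicas


# Desired state: three nodes. Actual state starts at zero.
cr = CustomResource(spec={"size": 3})
actual = 0
while not cr.status.get("ready"):
    actual = reconcile(cr, actual)

print(cr.status)
```

The loop only ever compares `spec` with what it observes and writes the outcome to `status`, which is the same division of labor that the custom resources in this topic rely on.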

Operator types and responsibilities

Mission Control uses three primary operators, each with distinct responsibilities:

mission-control-operator

Location: Control plane

Primary responsibilities:

  • Issues and rotates certificates

  • Manages MissionControlCluster custom resources

  • Orchestrates cluster lifecycle operations across multiple Kubernetes clusters

  • Coordinates multi-datacenter operations

  • Manages cluster-level configuration and policies

  • Interfaces with the Mission Control UI and API

Scope: Global across all managed clusters

Key characteristics:

  • Runs in the Mission Control control plane namespace

  • Provides visibility across all managed data planes

  • Orchestrates operations but delegates execution to data plane operators

  • Maintains the desired state for all managed database clusters

cass-operator

Location: Data plane (one per managed Kubernetes cluster)

Primary responsibilities:

  • Manages CassandraDatacenter custom resources

  • Controls the lifecycle of individual database nodes

  • Handles pod creation, scaling, and deletion

  • Manages StatefulSets for database racks

  • Performs node-level operations (bootstrap, decommission, replace)

  • Monitors node health and status

Scope: Local to a single Kubernetes cluster

Key characteristics:

  • Runs in each data plane cluster

  • Directly manages database pods and resources

  • Executes operations that the control plane initiates

  • Reports status back to the control plane through custom resource status fields

k8ssandra-operator

Location: Data plane (one per managed Kubernetes cluster)

Primary responsibilities:

  • Manages K8ssandraCluster custom resources

  • Coordinates multi-datacenter cluster configurations

  • Manages Medusa (backup/restore) components

  • Manages Reaper (repair) components

  • Handles Stargate API gateway deployments

  • Orchestrates datacenter-level operations

Scope: Local to a single Kubernetes cluster, with awareness of multi-datacenter topology

Key characteristics:

  • Runs in each data plane cluster

  • Bridges cluster-level and datacenter-level operations

  • Manages auxiliary services like Medusa, Reaper, and Stargate

  • Coordinates with cass-operator for datacenter management

Control plane and data plane interaction

The control plane and data plane use the same request-response pattern, mediated by custom resources (CRs), for all operations, whether creating clusters, running backups, performing repairs, or scaling nodes. For example, when you trigger a backup through the UI, the control plane creates a K8ssandraTask CR in the data plane. The data plane operators then execute the backup using Medusa and report status back to the control plane through the CR. Cluster creation follows a similar but more complex flow because it creates the foundational K8ssandraCluster CR that defines the entire database topology.

Request flow

The request flow consists of the following steps:

  1. Control plane: A user initiates an operation through the Mission Control UI or API.

  2. Control plane: The mission-control-operator validates and plans the operation.

  3. Control plane: The mission-control-operator creates or updates CRs, writing the desired state to custom resources in the data plane.

  4. Data plane: Data plane operators, k8ssandra-operator and cass-operator, listen for CR updates from the control plane operator.

  5. Data plane: Local operators execute the operation.

  6. Data plane: Data plane operators update CR status fields, propagating the status back to the control plane.

  7. Control plane: The mission-control-operator tracks operation status until the operation ends (success or failure).
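The steps above can be sketched as a small in-memory simulation. This is only an illustration under stated assumptions: the `store` dictionary stands in for the Kubernetes API, and the function names and the `Pending`, `Running`, and `Succeeded` phase values are hypothetical rather than taken from Mission Control.

```python
TERMINAL_PHASES = {"Succeeded", "Failed"}


def submit(store: dict, name: str, spec: dict) -> None:
    """Steps 1-3: validate the request and write desired state to a CR."""
    if "operation" not in spec:
        raise ValueError("spec must name an operation")
    store[name] = {"spec": spec, "status": {"phase": "Pending"}}


def data_plane_step(store: dict, name: str) -> None:
    """Steps 4-6: notice the CR, do some work, and update its status."""
    status = store[name]["status"]
    if status["phase"] == "Pending":
        status["phase"] = "Running"
    elif status["phase"] == "Running":
        status["phase"] = "Succeeded"


def observe(store: dict, name: str) -> str:
    """Step 7: the control plane reads status; it never calls the data plane."""
    return store[name]["status"]["phase"]


store = {}
submit(store, "backup-1", {"operation": "backup", "datacenter": "dc1"})

history = [observe(store, "backup-1")]
while history[-1] not in TERMINAL_PHASES:
    data_plane_step(store, "backup-1")  # stands in for the data plane operators
    history.append(observe(store, "backup-1"))

print(history)
```

In this sketch the two sides share nothing except the stored resource: one writes `spec`, the other writes `status`.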

Example cluster creation flow

The cluster creation flow consists of the following steps:

  1. Control plane: A user initiates cluster creation through the UI.

  2. Control plane: The mission-control-operator receives the request.

  3. Control plane: The mission-control-operator creates a K8ssandraCluster CR in the data plane.

  4. Data plane: The k8ssandra-operator detects the new K8ssandraCluster.

  5. Data plane: The k8ssandra-operator creates CassandraDatacenter CRs.

  6. Data plane: The cass-operator detects the new CassandraDatacenter.

  7. Data plane: The cass-operator creates StatefulSets for database racks.

  8. Kubernetes: Kubernetes creates pods and storage.

  9. Data plane: The cass-operator bootstraps database nodes.

  10. Data plane: Status updates flow back through CRs to the control plane.

  11. Control plane: The mission-control-operator updates the cluster status as the operation progresses.

  12. Control plane: When the operation completes successfully, the UI displays the cluster as ready.
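To make the cascade in steps 4 through 7 more concrete, here is a hedged Python sketch of how one resource fans out into the next. The dictionaries are simplified stand-ins, not real manifests: racks are plain strings, the `default` rack fallback and the `<cluster>-<datacenter>-<rack>-sts` naming are assumptions, and the even node distribution is illustrative only.

```python
def datacenters_from_cluster(cluster: dict) -> list:
    """Steps 4-5: derive one CassandraDatacenter per datacenter entry."""
    cassandra = cluster["spec"]["cassandra"]
    return [
        {
            "kind": "CassandraDatacenter",
            "metadata": {"name": dc["metadata"]["name"]},
            "spec": {
                "clusterName": cluster["metadata"]["name"],
                "serverVersion": cassandra["serverVersion"],
                "size": dc["size"],
                "racks": dc.get("racks", ["default"]),  # assumed fallback
            },
        }
        for dc in cassandra["datacenters"]
    ]


def statefulsets_from_datacenter(dc: dict) -> list:
    """Steps 6-7: derive one StatefulSet per rack, spreading nodes evenly."""
    racks = dc["spec"]["racks"]
    base, extra = divmod(dc["spec"]["size"], len(racks))
    return [
        {
            "kind": "StatefulSet",
            # Assumed naming scheme, shown for illustration only.
            "name": f'{dc["spec"]["clusterName"]}-{dc["metadata"]["name"]}-{rack}-sts',
            "replicas": base + (1 if index < extra else 0),
        }
        for index, rack in enumerate(racks)
    ]


cluster = {
    "kind": "K8ssandraCluster",
    "metadata": {"name": "demo-cluster"},
    "spec": {
        "cassandra": {
            "serverVersion": "4.0.7",
            "datacenters": [
                {"metadata": {"name": "dc1"}, "size": 3, "racks": ["rack1", "rack2"]}
            ],
        }
    },
}

for dc in datacenters_from_cluster(cluster):
    for sts in statefulsets_from_datacenter(dc):
        print(sts["name"], sts["replicas"])
```

Each function reads only the resource one level above it, which mirrors how each operator in the flow reacts to the resource created by the previous step.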

Operational boundaries

Understand the boundaries between operators to clarify responsibilities and troubleshooting paths.

Control plane boundaries

The control plane manages the following:

  • Cluster-level configuration and policies

  • Multi-datacenter coordination

  • User authentication and authorization

  • Observability aggregation, including metrics, logs, and alerts

  • Backup and restore scheduling and coordination

  • Cluster lifecycle orchestration

The control plane does not manage the following:

  • Individual pod creation or deletion

  • Direct database node operations

  • Local Kubernetes resource management

  • Node-level health monitoring

  • Direct database configuration changes

Data plane boundaries

The data plane manages the following:

  • Database pod lifecycle

  • StatefulSets

  • Node-level operations such as bootstrap, decommission, and replace

  • Local resource allocation

  • Application of database configuration

  • Node health monitoring

  • Local backup and restore execution

The data plane does not manage the following:

  • Cross-cluster coordination

  • Global policy enforcement

  • User authentication

  • Centralized observability

  • Multi-datacenter orchestration

Communication patterns

Mission Control operators communicate through a declarative model, with custom resources as the API boundary between the control plane and data planes.

The control plane declares desired state in custom resources, and data plane operators continuously reconcile actual state to match it. Status information flows back through custom resource status fields, and no direct RPC or API calls occur between control plane and data plane operators.

In the following example, the control plane writes the spec section and the data plane reports through the status section:

apiVersion: k8ssandra.io/v1alpha1
kind: K8ssandraCluster
metadata:
  name: demo-cluster
spec:
  # Control plane sets desired state
  cassandra:
    serverVersion: "4.0.7"
    datacenters:
      - metadata:
          name: dc1
        size: 3
status:
  # Data plane reports actual state
  datacenters:
    - name: dc1
      readyReplicas: 3
      conditions:
        - type: Ready
          status: "True"
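As a minimal sketch of how a consumer might interpret the status section shown above, the following Python function checks whether a datacenter reports a Ready condition. The dictionary mirrors the example manifest by hand rather than being read from a live cluster, and the `datacenter_ready` helper is hypothetical.

```python
def datacenter_ready(cluster: dict, dc_name: str) -> bool:
    """Return True if the named datacenter reports a Ready condition of "True"."""
    for dc in cluster.get("status", {}).get("datacenters", []):
        if dc.get("name") != dc_name:
            continue
        return any(
            # Kubernetes condition statuses are strings, not booleans.
            cond.get("type") == "Ready" and cond.get("status") == "True"
            for cond in dc.get("conditions", [])
        )
    return False  # the datacenter has not reported any status yet


cluster = {
    "spec": {
        "cassandra": {
            "serverVersion": "4.0.7",
            "datacenters": [{"metadata": {"name": "dc1"}, "size": 3}],
        }
    },
    "status": {
        "datacenters": [
            {
                "name": "dc1",
                "readyReplicas": 3,
                "conditions": [{"type": "Ready", "status": "True"}],
            }
        ]
    },
}

print(datacenter_ready(cluster, "dc1"))
print(datacenter_ready(cluster, "dc2"))
```

A datacenter that is declared in the spec but absent from the status is treated as not ready, which matches the idea that the data plane reports actual state only after it has acted.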

Multi-region considerations

In multi-region deployments, each region has its own data plane, but you can only deploy the control plane in a single region. Operators in each region operate independently, and cross-region coordination occurs through the control plane. Network partitions between regions do not affect local operations.

Failure scenarios and isolation

Understanding how failures affect different layers helps you plan for high availability and disaster recovery.

Control plane failure

During a control plane failure, you cannot initiate new operations, but existing clusters continue to run normally. Data plane operators continue reconciliation and database operations remain unaffected.

To recover, restore the control plane independently. No data loss occurs in database clusters, and operations resume when the control plane becomes available.

Data plane failure

During a data plane failure, only the specific data plane cluster experiences issues while other data planes continue operating normally. The control plane remains operational and you can initiate operations on healthy data planes.

To recover, restore the data plane independently. The control plane maintains desired state and operators reconcile to desired state after recovery.

Operator failure

During an operator failure, the specific operator instance is unavailable, but the cluster continues to operate using the remaining operators.

  • mission-control-operator failure: You cannot initiate new cluster operations, but existing clusters remain unaffected and data plane operators continue normal operation.

  • cass-operator failure: You cannot perform node-level operations in the affected cluster through the API. You can interact with nodes using kubectl exec and nodetool. Existing nodes continue running and other clusters remain unaffected.

  • k8ssandra-operator failure: You cannot perform datacenter-level operations in the affected cluster, but existing datacenters continue running and other clusters remain unaffected.

Troubleshoot operator issues

Use these troubleshooting techniques to diagnose and resolve issues across operator layers.

Identify the operator layer

When you troubleshoot issues, identify which operator layer is involved, and then investigate the logs and other details related to that operator and failure.

The following list is a starting point for investigating some common issues:

  • Symptom: Cannot create new clusters through UI
    Likely operator: mission-control-operator
    Investigation path: Check control plane operator logs and CR status

  • Symptom: Cluster creation stuck
    Likely operator: k8ssandra-operator
    Investigation path: Check the data plane K8ssandraCluster CR and operator logs

  • Symptom: Pods not starting
    Likely operator: cass-operator
    Investigation path: Check CassandraDatacenter CR and StatefulSet status

  • Symptom: Node operations failing
    Likely operator: cass-operator
    Investigation path: Check node-level logs and operator reconciliation

  • Symptom: Backup or restore issues
    Likely operator: k8ssandra-operator and Medusa
    Investigation path: Check the K8ssandraCluster CR and Medusa logs

  • Symptom: Multi-DC coordination issues
    Likely operator: mission-control-operator and k8ssandra-operator
    Investigation path: Check both control and data plane operator logs

Check custom resources

Custom resources provide the most direct insight into operator state:

Check the MissionControlCluster in the control plane
kubectl get missioncontrolcluster -n NAMESPACE

Replace NAMESPACE with the namespace of your control plane.

kubectl describe missioncontrolcluster CLUSTER_NAME -n NAMESPACE

Replace the following:

  • CLUSTER_NAME: The name of your cluster

  • NAMESPACE: The namespace of your control plane

Check the K8ssandraCluster in the data plane
kubectl get k8ssandracluster -n NAMESPACE

Replace NAMESPACE with the namespace of your data plane.

kubectl describe k8ssandracluster CLUSTER_NAME -n NAMESPACE

Replace the following:

  • CLUSTER_NAME: The name of your cluster

  • NAMESPACE: The namespace of your data plane

Check the CassandraDatacenter in the data plane
kubectl get cassandradatacenter -n NAMESPACE

Replace NAMESPACE with the namespace of your data plane.

kubectl describe cassandradatacenter DC_NAME -n NAMESPACE

Replace the following:

  • DC_NAME: The name of your datacenter

  • NAMESPACE: The namespace of your data plane

Check operator logs

Operator logs show reconciliation activity and errors:

Control plane operator logs
kubectl logs -n NAMESPACE deployment/mission-control-operator

Replace NAMESPACE with the namespace of your control plane.

Data plane operator logs
kubectl logs -n NAMESPACE deployment/cass-operator

Replace NAMESPACE with the namespace of your data plane.

kubectl logs -n NAMESPACE deployment/k8ssandra-operator

Replace NAMESPACE with the namespace of your data plane.
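If you run these checks often, a small script can assemble them in a consistent top-down order. The following Python sketch only builds and prints the kubectl commands from this topic; it does not run them. The `diagnostic_commands` helper and the example namespace and cluster names are hypothetical, and the deployment names are the ones shown above, which may differ in your installation.

```python
import shlex


def diagnostic_commands(control_ns: str, data_ns: str, cluster_name: str) -> list:
    """Return kubectl argument lists, ordered from control plane to data plane."""
    return [
        ["kubectl", "describe", "missioncontrolcluster", cluster_name, "-n", control_ns],
        ["kubectl", "logs", "-n", control_ns, "deployment/mission-control-operator"],
        ["kubectl", "describe", "k8ssandracluster", cluster_name, "-n", data_ns],
        ["kubectl", "logs", "-n", data_ns, "deployment/k8ssandra-operator"],
        ["kubectl", "get", "cassandradatacenter", "-n", data_ns],
        ["kubectl", "logs", "-n", data_ns, "deployment/cass-operator"],
    ]


# Example values only; replace them with your own namespaces and cluster name.
for command in diagnostic_commands("control-plane-ns", "data-plane-ns", "demo-cluster"):
    print(shlex.join(command))
```

Keeping each command as an argument list means it can later be passed to `subprocess.run` without shell quoting concerns, should you choose to execute the commands from the script.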

Best practices

Follow these best practices to ensure reliable operation of Mission Control across the control plane and data planes.

  • Separation of concerns: Use the control plane for orchestration and policy while letting data plane operators handle execution. Avoid direct manipulation of data plane resources from the control plane, and instead use custom resources as the interface between layers. This separation ensures clear boundaries and maintainable operations.

  • Monitoring and observability: Monitor operator health in both control and data planes, and track custom resource status conditions to understand system state. Set up alerts for operator failures to enable quick response to issues, and monitor reconciliation loops for stuck operations that may indicate problems requiring intervention.

  • Resource management: Ensure adequate resources for operators in both planes to maintain reliable operation. Scale operator replicas based on cluster count to handle increased load, and monitor operator memory and CPU usage to identify resource constraints. Plan for operator overhead in capacity planning to avoid resource exhaustion.

  • Upgrade coordination: Upgrade control plane operators first to ensure compatibility with existing data planes. Verify control plane stability before you upgrade data plane operators to minimize risk of cascading failures. Upgrade data plane operators one cluster at a time to limit blast radius, and test upgrades in non-production environments first to identify potential issues before production deployment.
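The upgrade ordering described in the last practice can be written down as a simple plan generator. This Python sketch is illustrative only: the `upgrade_plan` helper and cluster names are hypothetical, and the per-cluster verification step is an assumption that follows from upgrading one cluster at a time.

```python
def upgrade_plan(data_plane_clusters: list) -> list:
    """Return ordered upgrade steps: control plane first, then one data plane at a time."""
    steps = [
        "upgrade control plane operators",
        "verify control plane stability",
    ]
    for cluster in data_plane_clusters:
        steps.append(f"upgrade data plane operators in {cluster}")
        # Assumed step: confirm health before moving on, to limit blast radius.
        steps.append(f"verify {cluster}")
    return steps


for number, step in enumerate(upgrade_plan(["east-1", "west-1"]), start=1):
    print(f"{number}. {step}")
```

Running the same plan against a non-production environment first gives you a rehearsal of the exact sequence before you apply it in production.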
