Control plane and data plane operators

This topic explains how operators interact with the control plane and data plane in Mission Control, including their roles, responsibilities, and operational boundaries.

Mission Control uses a multi-operator architecture, where different operators manage distinct aspects of the platform and database lifecycle. It’s helpful to understand the relationship between these operators and the control/data plane separation for effective platform management, troubleshooting, and architectural planning.

Architecture layers

Mission Control operates across two distinct architectural layers:

Control plane: This centralized management layer orchestrates database operations across all managed clusters. The control plane includes the Mission Control UI, API, and core operators that manage cluster lifecycle, configuration, and operations.
Data plane: This distributed execution layer runs database workloads. Each data plane consists of Kubernetes clusters that host database nodes, along with local operators that manage the database resources within those clusters.

Both the control plane and data plane include operators that provide automation capabilities. Operators listen for changes in custom resources and reconcile the desired state with the actual state of the system.
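
The reconciliation idea can be illustrated with a minimal Python sketch. This is a toy model, not Mission Control operator code: the `Resource` class, its replica fields, and the one-step-per-pass behavior are assumptions chosen only to show how an operator converges the actual state toward the desired state.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Resource:
    """Hypothetical stand-in for a custom resource with a spec and a status."""
    desired_replicas: int  # what the spec asks for
    actual_replicas: int   # what the status currently reports


def reconcile(resource: Resource) -> str:
    """Move the actual state one step toward the desired state."""
    if resource.actual_replicas < resource.desired_replicas:
        resource.actual_replicas += 1
        return "scaled up"
    if resource.actual_replicas > resource.desired_replicas:
        resource.actual_replicas -= 1
        return "scaled down"
    return "in sync"


def run_until_synced(resource: Resource, max_passes: int = 100) -> List[str]:
    """Call reconcile repeatedly until nothing is left to change."""
    actions = []
    for _ in range(max_passes):
        action = reconcile(resource)
        actions.append(action)
        if action == "in sync":
            break
    return actions


dc = Resource(desired_replicas=3, actual_replicas=1)
actions = run_until_synced(dc)
print(actions)  # ['scaled up', 'scaled up', 'in sync']
```

A real operator reacts to change events from the Kubernetes API rather than looping on a local object, but the comparison of desired and actual state is the same.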

Operator types and responsibilities

Mission Control uses three primary operators, each with distinct responsibilities. These operators are bundled with Mission Control and work together as a unified system. You don’t need to install or manage them separately.

Operator	Location	Primary responsibilities	Scope	Key characteristics
`mission-control-operator`	Control plane	Issues and rotates certificates; manages `MissionControlCluster` custom resources; orchestrates cluster lifecycle operations across multiple Kubernetes clusters; coordinates multi-datacenter operations; manages cluster-level configuration and policies; interfaces with the Mission Control UI and API	Global across all managed clusters	Runs in the Mission Control control plane namespace; provides visibility across all managed data planes; orchestrates operations but delegates execution to data plane operators; maintains the desired state for all managed database clusters
`cass-operator`	Data plane (one per managed Kubernetes cluster)	Manages `CassandraDatacenter` custom resources; controls the lifecycle of individual database nodes; handles pod creation, scaling, and deletion; manages `StatefulSets` for database racks; performs node-level operations (bootstrap, decommission, replace); monitors node health and status	Local to a single Kubernetes cluster	Runs in each data plane cluster; directly manages database pods and resources; executes operations that the control plane initiates; reports status back to the control plane through custom resource status fields
`k8ssandra-operator`	Data plane (one per managed Kubernetes cluster)	Manages `K8ssandraCluster` custom resources; coordinates multi-datacenter cluster configurations; manages Medusa (backup/restore) components; manages Reaper (repair) components; handles Stargate API gateway deployments; orchestrates datacenter-level operations	Local to a single Kubernetes cluster, with awareness of multi-datacenter topology	Runs in each data plane cluster; bridges cluster-level and datacenter-level operations; manages auxiliary services like Medusa, Reaper, and Stargate; coordinates with `cass-operator` for datacenter management

Control plane and data plane interaction

The control plane and data plane follow the same request-response pattern for every operation: creating clusters, running backups, performing repairs, and scaling nodes. For example, when you trigger a backup through the UI, the control plane creates a K8ssandraTask custom resource (CR) in the data plane. The data plane operators run the backup with Medusa and report status back to the control plane through the CR. Cluster creation follows a similar but more involved flow because it creates the foundational K8ssandraCluster CR that defines the entire database topology.

Request flow

The request flow consists of the following steps:

Control plane: A user initiates an operation through the Mission Control UI or API.
Control plane: The mission-control-operator validates and plans the operation.
Control plane: The mission-control-operator creates or updates CRs in the data plane to record the desired state.
Data plane: The data plane operators, k8ssandra-operator and cass-operator, detect the CR changes made by the control plane operator.
Data plane: Local operators execute the operation.
Data plane: Data plane operators update CR status fields, propagating the status back to the control plane.
Control plane: The mission-control-operator tracks operation status until the operation ends (success or failure).

Example cluster creation flow

The cluster creation flow consists of the following steps:

Control plane: A user initiates cluster creation through the UI.
Control plane: The mission-control-operator receives the request.
Control plane: The mission-control-operator creates a K8ssandraCluster CR in the data plane.
Data plane: The k8ssandra-operator detects the new K8ssandraCluster.
Data plane: The k8ssandra-operator creates CassandraDatacenter CRs.
Data plane: The cass-operator detects the new CassandraDatacenter.
Data plane: The cass-operator creates StatefulSets for database racks.
Kubernetes: Kubernetes creates pods and storage.
Data plane: The cass-operator bootstraps database nodes.
Data plane: Status updates flow back through CRs to the control plane.
Control plane: The mission-control-operator updates the cluster status as the operation progresses.
Control plane: When the operation completes successfully, the UI displays the cluster as ready.
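
The chain of resources in the steps above can be modeled in a few lines of Python. This is an illustrative sketch only: the `WATCHERS` table and the `creation_chain` function are hypothetical names, and the model ignores status propagation, retries, and bootstrapping.

```python
# Each entry maps a resource kind to the component that reacts to it
# and the kind of resource that component creates next.
# This mirrors the steps above; it is a model, not real operator code.
WATCHERS = {
    "K8ssandraCluster": ("k8ssandra-operator", "CassandraDatacenter"),
    "CassandraDatacenter": ("cass-operator", "StatefulSet"),
    "StatefulSet": ("Kubernetes", "Pod"),
}


def creation_chain(first_kind):
    """Follow the chain of created resources, starting from the first CR."""
    steps = []
    kind = first_kind
    while kind in WATCHERS:
        actor, created = WATCHERS[kind]
        steps.append(f"{actor} sees {kind} and creates {created}")
        kind = created
    return steps


for step in creation_chain("K8ssandraCluster"):
    print(step)
```

Reading the chain from the top shows why a stalled creation is usually investigated layer by layer: each component acts only after the resource above it exists.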

Operational boundaries

Understand the boundaries between operators to clarify responsibilities and troubleshooting paths.

Control plane boundaries: The control plane manages cluster-level configuration and policies, multi-datacenter coordination, user authentication and authorization, observability aggregation including metrics, logs, and alerts, backup and restore scheduling and coordination, and cluster lifecycle orchestration.

The control plane does not manage individual pod creation or deletion, direct database node operations, local Kubernetes resource management, node-level health monitoring, or direct database configuration changes.

Data plane boundaries: The data plane manages database pod lifecycle, StatefulSet management, node-level operations such as bootstrap, decommission, and replace, local resource allocation, database configuration application, node health monitoring, and local backup and restore execution.

The data plane does not manage cross-cluster coordination, global policy enforcement, user authentication, centralized observability, or multi-datacenter orchestration.

Communication patterns

Mission Control operators communicate through a declarative model: the control plane declares the desired state in custom resources, and data plane operators continuously reconcile the actual state to match it. Status information flows back through custom resource status fields. No direct RPC or API calls occur between control plane and data plane operators.

Custom resources serve as the API boundary between the control plane and data planes, as in the following example:

apiVersion: k8ssandra.io/v1alpha1
kind: K8ssandraCluster
metadata:
  name: demo-cluster
spec:
  # Control plane sets desired state
  cassandra:
    serverVersion: "4.0.7"
    datacenters:
      - metadata:
          name: dc1
        size: 3
status:
  # Data plane reports actual state
  datacenters:
    - name: dc1
      readyReplicas: 3
      conditions:
        - type: Ready
          status: "True"
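
To make the spec and status split concrete, the following Python sketch compares the two halves of the resource shown above. It assumes the field layout of that example, which may not match the full schema of a live resource, and `reconciled_datacenters` is a hypothetical helper name.

```python
# Plain-dict copy of the example resource above. In practice this data
# would come from the Kubernetes API rather than being written by hand.
cluster = {
    "spec": {
        "cassandra": {
            "datacenters": [{"metadata": {"name": "dc1"}, "size": 3}],
        },
    },
    "status": {
        "datacenters": [
            {
                "name": "dc1",
                "readyReplicas": 3,
                "conditions": [{"type": "Ready", "status": "True"}],
            },
        ],
    },
}


def reconciled_datacenters(cr):
    """Return {datacenter name: bool} by comparing spec (desired) with status (actual)."""
    reported = {dc["name"]: dc for dc in cr.get("status", {}).get("datacenters", [])}
    result = {}
    for dc in cr["spec"]["cassandra"]["datacenters"]:
        name = dc["metadata"]["name"]
        status = reported.get(name, {})
        is_ready = any(
            c.get("type") == "Ready" and c.get("status") == "True"
            for c in status.get("conditions", [])
        )
        # Reconciled means: Ready condition is true and replica counts match.
        result[name] = is_ready and status.get("readyReplicas", 0) == dc["size"]
    return result


print(reconciled_datacenters(cluster))  # {'dc1': True}
```

A datacenter that appears in the spec but not yet in the status is reported as not reconciled, which is the expected state right after the control plane writes a new spec.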

Multi-region considerations

In multi-region deployments, each region has its own data plane, but you can deploy the control plane in only one region. Operators in each region operate independently, and cross-region coordination occurs through the control plane. Network partitions between regions do not affect local operations.

For more information, see Configure a multi-region Mission Control environment.

Failure scenarios and isolation

Understanding how failures affect different layers helps you plan for high availability and disaster recovery.

Control plane failure

During a control plane failure, you cannot initiate new operations, but existing clusters continue to run normally. Data plane operators continue reconciliation and database operations remain unaffected.

To recover, restore the control plane independently. No data loss occurs in database clusters, and operations resume when the control plane becomes available.

Data plane failure

During a data plane failure, only the specific data plane cluster experiences issues while other data planes continue operating normally. The control plane remains operational and you can initiate operations on healthy data planes.

To recover, restore the data plane independently. The control plane maintains the desired state, and the data plane operators reconcile to it after recovery.

Operator failure

During an operator failure, the specific operator instance is unavailable, but the cluster continues to operate using the remaining operators.

mission-control-operator failure: You cannot initiate new cluster operations, but existing clusters remain unaffected and data plane operators continue normal operation.
cass-operator failure: You cannot perform node-level operations in the affected cluster through the API. You can interact with nodes using kubectl exec and nodetool. Existing nodes continue running and other clusters remain unaffected.
k8ssandra-operator failure: You cannot perform datacenter-level operations in the affected cluster, but existing datacenters continue running and other clusters remain unaffected.

Troubleshoot operator issues

Use these troubleshooting techniques to diagnose and resolve issues across operator layers.

Identify the operator layer

When you troubleshoot issues, identify which operator layer is involved, and then investigate the logs and other details related to that operator and failure.

The following table is a starting point for investigating some common issues:


Symptom	Likely operator	Investigation path
Cannot create new clusters through UI	`mission-control-operator`	Check control plane operator logs and CR status
Cluster creation stuck	`k8ssandra-operator`	Check the data plane `K8ssandraCluster` CR and operator logs
Pods not starting	`cass-operator`	Check the `CassandraDatacenter` CR and `StatefulSet` status
Node operations failing	`cass-operator`	Check node-level logs and operator reconciliation
Backup or restore issues	`k8ssandra-operator` and Medusa	Check the `K8ssandraCluster` CR and Medusa logs
Multi-DC coordination issues	`mission-control-operator` and `k8ssandra-operator`	Check both control and data plane operator logs
Operator version mismatch	All operators	Verify operator versions match the Mission Control release
Migration issues with existing clusters	`k8ssandra-operator` and `cass-operator`	Check that operators only manage assigned resources
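
If you keep triage notes in a script or runbook, the table above can be encoded as a simple lookup. The following Python sketch copies the rows as they appear in the table. The `TRIAGE` mapping and `triage` function are hypothetical names, and the lookup matches the symptom text exactly (ignoring case and surrounding spaces).

```python
# Rows from the table above: symptom -> (likely operators, investigation path).
TRIAGE = {
    "cannot create new clusters through ui": (
        ["mission-control-operator"],
        "Check control plane operator logs and CR status",
    ),
    "cluster creation stuck": (
        ["k8ssandra-operator"],
        "Check the data plane K8ssandraCluster CR and operator logs",
    ),
    "pods not starting": (
        ["cass-operator"],
        "Check CassandraDatacenter CR and StatefulSet status",
    ),
    "node operations failing": (
        ["cass-operator"],
        "Check node-level logs and operator reconciliation",
    ),
    "backup or restore issues": (
        ["k8ssandra-operator", "Medusa"],
        "Check the K8ssandraCluster CR and Medusa logs",
    ),
    "multi-dc coordination issues": (
        ["mission-control-operator", "k8ssandra-operator"],
        "Check both control and data plane operator logs",
    ),
}


def triage(symptom):
    """Return (operators, investigation path) for a known symptom, else None."""
    return TRIAGE.get(symptom.strip().lower())


operators, path = triage("Pods not starting")
print(operators, "->", path)
```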

Check custom resources

Custom resources provide the most direct insight into operator state:

Check the MissionControlCluster in the control plane

kubectl get missioncontrolcluster -n NAMESPACE

Replace NAMESPACE with the namespace of your control plane.

kubectl describe missioncontrolcluster CLUSTER_NAME -n NAMESPACE

Replace the following:

CLUSTER_NAME: The name of your cluster
NAMESPACE: The namespace of your control plane

Check the K8ssandraCluster in the data plane

kubectl get k8ssandracluster -n NAMESPACE

Replace NAMESPACE with the namespace of your data plane.

kubectl describe k8ssandracluster CLUSTER_NAME -n NAMESPACE

Replace the following:

CLUSTER_NAME: The name of your cluster
NAMESPACE: The namespace of your data plane

Check the CassandraDatacenter in the data plane

kubectl get cassandradatacenter -n NAMESPACE

Replace NAMESPACE with the namespace of your data plane.

kubectl describe cassandradatacenter DC_NAME -n NAMESPACE

Replace the following:

DC_NAME: The name of your datacenter
NAMESPACE: The namespace of your data plane

Check operator logs

Operator logs show reconciliation activity and errors:

Control plane operator logs

kubectl logs -n NAMESPACE deployment/mission-control-operator

Replace NAMESPACE with the namespace of your control plane.

Data plane operator logs

kubectl logs -n NAMESPACE deployment/cass-operator

Replace NAMESPACE with the namespace of your data plane.

kubectl logs -n NAMESPACE deployment/k8ssandra-operator

Replace NAMESPACE with the namespace of your data plane.
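
When you collect logs from several operators, it can help to build the commands programmatically. The following Python sketch only constructs the same `kubectl logs` invocations shown above and does not run them. The `logs_command` helper is a hypothetical name, and the optional `--tail` flag is an addition that is not part of the commands above.

```python
import shlex

# Deployment names as used in the commands above.
OPERATORS = {"mission-control-operator", "cass-operator", "k8ssandra-operator"}


def logs_command(operator, namespace, tail=None):
    """Build the argument list for `kubectl logs` against an operator deployment."""
    if operator not in OPERATORS:
        raise ValueError(f"unknown operator: {operator}")
    argv = ["kubectl", "logs", "-n", namespace, f"deployment/{operator}"]
    if tail is not None:
        # Limit output to the most recent lines.
        argv.append(f"--tail={tail}")
    return argv


command = logs_command("cass-operator", "NAMESPACE", tail=200)
print(shlex.join(command))
```

To run a command, pass the list to `subprocess.run(command, check=True)` on a machine where `kubectl` is configured for the target cluster.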

Best practices

Follow these best practices to ensure reliable operation of Mission Control across the control plane and data planes.

Separation of concerns: Use the control plane for orchestration and policy while letting data plane operators handle execution. Avoid direct manipulation of data plane resources from the control plane, and instead use custom resources as the interface between layers. This separation ensures clear boundaries and maintainable operations.
Monitoring and observability: Monitor operator health in both control and data planes, and track custom resource status conditions to understand system state. Set up alerts for operator failures to enable quick response to issues, and monitor reconciliation loops for stuck operations that may indicate problems requiring intervention.
Resource management: Ensure adequate resources for operators in both planes to maintain reliable operation. Scale operator replicas based on cluster count to handle increased load, and monitor operator memory and CPU usage to identify resource constraints. Plan for operator overhead in capacity planning to avoid resource exhaustion.
Upgrade coordination: Upgrade control plane operators first to ensure compatibility with existing data planes. Verify control plane stability before you upgrade data plane operators to minimize risk of cascading failures. Upgrade data plane operators one cluster at a time to limit blast radius, and test upgrades in non-production environments first to identify potential issues before production deployment.

Operator versioning and compatibility

Mission Control bundles the K8ssandra operator, which includes the cass-operator for managing Cassandra, DSE, and HCD datacenters. The Mission Control operator extends K8ssandra with additional capabilities including certificate rotation, ingress management, and multi-cluster orchestration.

Operator compatibility matrix

The following table shows the operator versions included in recent Mission Control releases:

Mission Control version	`mission-control-operator`	`k8ssandra-operator`	`cass-operator`
1.19.1	1.19.1	1.32.3	v1.30.2
1.19.0	1.19.0	1.32.2	v1.30.2
1.18.2	1.18.2	1.32.1	v1.30.0
1.18.0	1.18.0	1.31.0	v1.30.0
1.17.2	1.17.2	1.30.2	v1.28.1
1.17.1	1.17.1	1.30.2	v1.28.1

For a complete list of operator versions across all releases, see the Mission Control release notes.
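
The matrix above can also be used to check an installation for version drift. The following Python sketch copies the table into a dictionary and compares it with a set of observed versions. The `version_mismatches` helper is a hypothetical name, how you obtain the installed versions is left open, and the leading `v` on cass-operator versions is stripped before comparison.

```python
# Versions copied from the table above: Mission Control release -> bundled operators.
MATRIX = {
    "1.19.1": {"mission-control-operator": "1.19.1", "k8ssandra-operator": "1.32.3", "cass-operator": "1.30.2"},
    "1.19.0": {"mission-control-operator": "1.19.0", "k8ssandra-operator": "1.32.2", "cass-operator": "1.30.2"},
    "1.18.2": {"mission-control-operator": "1.18.2", "k8ssandra-operator": "1.32.1", "cass-operator": "1.30.0"},
    "1.18.0": {"mission-control-operator": "1.18.0", "k8ssandra-operator": "1.31.0", "cass-operator": "1.30.0"},
    "1.17.2": {"mission-control-operator": "1.17.2", "k8ssandra-operator": "1.30.2", "cass-operator": "1.28.1"},
    "1.17.1": {"mission-control-operator": "1.17.1", "k8ssandra-operator": "1.30.2", "cass-operator": "1.28.1"},
}


def normalize(version):
    """Drop a leading 'v' so that 'v1.30.2' and '1.30.2' compare equal."""
    return version[1:] if version.startswith("v") else version


def version_mismatches(release, installed):
    """Return {operator: (expected, found)} for operators that differ from the matrix."""
    mismatches = {}
    for name, expected in MATRIX[release].items():
        found = installed.get(name)
        if found is None or normalize(found) != expected:
            mismatches[name] = (expected, found)
    return mismatches


observed = {
    "mission-control-operator": "1.19.1",
    "k8ssandra-operator": "1.32.2",
    "cass-operator": "v1.30.2",
}
print(version_mismatches("1.19.1", observed))
# {'k8ssandra-operator': ('1.32.3', '1.32.2')}
```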

Operator upgrade behavior

Operator upgrades do not trigger rolling restarts of your database clusters. When you upgrade Mission Control, the operators update independently of the database nodes. Your database clusters continue running with their existing configuration until you explicitly modify cluster specifications or perform operations that require node changes.

This separation ensures that:

Platform upgrades don’t cause database downtime.
You control when database changes occur.
Operator improvements deploy without affecting running workloads.
Database operations remain stable during platform maintenance.

Version compatibility during migration

When you migrate existing Cassandra, DSE, or HCD clusters to Mission Control, the operators manage the new infrastructure while your existing clusters continue running. The operators don’t interfere with clusters outside their management scope.

During migration:

Existing clusters run independently until you explicitly add them to Mission Control.
The operators only manage resources they create or that you explicitly assign to them.
You control the migration timeline and can run both environments in parallel.
No automatic changes occur to existing database configurations.

For migration guidance, see Migrate clusters to Mission Control.
