Configure pod resources for Mission Control components
This guide explains how to configure resource requests and limits for Mission Control platform components to ensure optimal performance and stability in your deployment. It provides baseline recommendations and scaling considerations for each component.
Resource requirements for Mission Control components vary based on several factors, including the number of managed database clusters, cluster sizes, concurrent operations, and workload characteristics. Start with the baseline recommendations in this guide, monitor actual resource usage through the Mission Control dashboards, and adjust based on observed patterns. Plan for peak load scenarios and consider growth over time when configuring your resources.
Note: The recommendations in this guide assume production workloads. You can reduce these values for development or testing environments.
Control plane operators
The control plane operators manage the lifecycle of database clusters and their components. Each operator has specific resource requirements based on the number and complexity of managed resources.
mission-control-operator
The mission-control-operator manages custom resources and cluster operations.
Configure the operator with baseline resources for normal operations and allow burst capacity during cluster operations like upgrades and scaling.
| Resource | Value | Notes |
|---|---|---|
| CPU request | 100m | Baseline for normal operations |
| CPU limit | 1000m | Allows burst during cluster operations |
| Memory request | 256Mi | Baseline for normal operations |
| Memory limit | 512Mi | Prevents excessive memory usage |
| Replicas | 1 | Single instance sufficient for most deployments |
Increase resources when managing more than 10 database clusters, performing frequent cluster operations like upgrades or scaling, or experiencing reconciliation delays. For deployments with 20 or more clusters, increase the CPU limit to 2000m and memory limit to 1Gi.
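As a sketch, the 20-cluster guidance above could be expressed as a Helm values override. This assumes the `missionControlOperator` values key used elsewhere in this guide; verify the key against your chart version.

```yaml
# Scaled-up mission-control-operator resources for 20 or more managed clusters.
# The missionControlOperator key is assumed from this guide's Helm example.
missionControlOperator:
  resources:
    requests:
      cpu: 100m
      memory: 256Mi
    limits:
      cpu: 2000m   # raised from the 1000m baseline
      memory: 1Gi  # raised from the 512Mi baseline
```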
cass-operator
The cass-operator manages the lifecycle of Apache Cassandra®, DataStax Enterprise (DSE), and Hyper-Converged Database (HCD) logical datacenters.
This operator requires more resources than other operators due to its direct management of database node lifecycles.
| Resource | Value | Notes |
|---|---|---|
| CPU request | 200m | Baseline for normal operations |
| CPU limit | 1000m | Allows burst during reconciliation |
| Memory request | 64Mi | Baseline for normal operations |
| Memory limit | 256Mi | Provides overhead beyond the 128Mi kustomize default |
| Replicas | 1 | Single instance sufficient for most deployments |
The operator detects new custom resources, generates StatefulSets, creates pods and storage resources, and performs bootstrap operations to reconcile database resources. The operator typically uses 200-500MB of memory in production environments. Increase resources when managing more than 5 database clusters, managing clusters with more than 10 nodes each, or performing frequent node operations like replacements or scaling. For deployments with 10 or more clusters, increase the CPU limit to 2000m and memory limit to 512Mi.
k8ssandra-operator
The k8ssandra-operator manages K8ssandra resources including cluster definitions, Reaper, and Medusa components.
| Resource | Value | Notes |
|---|---|---|
| CPU request | 100m | Baseline for normal operations |
| CPU limit | 1000m | Allows burst during cluster operations |
| Memory request | 64Mi | Baseline for normal operations |
| Memory limit | 256Mi | Provides overhead beyond the 128Mi kustomize default |
| Replicas | 1 | Single instance sufficient for most deployments |
The operator typically uses 200-500MB of memory in production environments. Increase resources when managing more than 10 database clusters, performing frequent backup and restore operations, or running multiple concurrent repair operations. For deployments with 20 or more clusters, increase the CPU limit to 2000m and memory limit to 512Mi.
kube-state-metrics
The kube-state-metrics component monitors the state of Kubernetes resources and makes them available for monitoring and alerting.
| Resource | Value | Notes |
|---|---|---|
| CPU request | 10m | Minimal baseline |
| CPU limit | 200m | Sufficient for most deployments |
| Memory request | 64Mi | Minimal baseline |
| Memory limit | 128Mi | Sufficient for most deployments |
| Replicas | 1 | Single instance sufficient |
Increase resources when managing very large clusters with more than 1000 pods or experiencing metrics collection delays. For very large clusters, increase the CPU limit to 500m and memory limit to 256Mi.
User interface
The UI component provides a web interface and REST API for Mission Control. Configure the UI with sufficient resources to handle concurrent user requests and API operations.
| Resource | Value | Notes |
|---|---|---|
| CPU request | 100m | Baseline for normal operations |
| CPU limit | 1000m | Allows burst during API operations |
| Memory request | 256Mi | Baseline for normal operations |
| Memory limit | 512Mi | Prevents excessive memory usage |
| Replicas | 2 | High availability recommended |
The UI service handles user authentication, cluster management, and monitoring through a web-based interface while exposing a REST API for programmatic access.
Scale UI resources based on your deployment’s usage patterns. Increase resources when supporting more than 50 concurrent users, experiencing slow API response times, or managing large numbers of clusters. For high user concurrency, increase the CPU limit to 2000m, memory limit to 1Gi, and consider deploying three or more replicas for high availability requirements.
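The high-concurrency guidance above could look like this in your Helm values file. This is a sketch that assumes the `ui` values key used elsewhere in this guide.

```yaml
# Scaled-up UI resources for more than 50 concurrent users.
# The ui key is assumed from this guide's Helm example.
ui:
  replicas: 3      # three or more replicas for high availability
  resources:
    requests:
      cpu: 100m
      memory: 256Mi
    limits:
      cpu: 2000m   # raised from the 1000m baseline
      memory: 1Gi  # raised from the 512Mi baseline
```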
Authentication services
The dex component provides authentication services for Mission Control, handling user authentication and authorization through various methods including static credentials, LDAP, OAuth2, and OpenID Connect.
| Resource | Value | Notes |
|---|---|---|
| CPU request | 50m | Minimal baseline |
| CPU limit | 500m | Sufficient for most deployments |
| Memory request | 128Mi | Minimal baseline |
| Memory limit | 256Mi | Sufficient for most deployments |
| Replicas | 2 | High availability recommended |
Increase resources when supporting more than 100 concurrent users, using external authentication providers like LDAP or OAuth, or experiencing authentication delays. For high user concurrency, increase the CPU limit to 1000m, memory limit to 512Mi, and consider deploying three or more replicas for very high availability requirements.
Note: The KOTS Admin Console and Mission Control UI use separate authentication systems. The KOTS Admin Console (port 8800) manages the Mission Control installation, configuration, licensing, and upgrades, while the Mission Control UI (port 30880) provides the database management interface for cluster operations. You can change Mission Control UI passwords through KOTS. After you make changes, restart the Dex pod to apply them.
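One way to perform the restart described above is to delete the Dex pod so its Deployment recreates it. This sketch assumes Dex runs in the `mission-control` namespace with the label shown; confirm the label on your installation before running it.

```shell
# Restart Dex after changing Mission Control UI passwords through KOTS.
# The label selector is an assumption; confirm it with:
#   kubectl get pods -n mission-control --show-labels
kubectl delete pod -n mission-control -l app.kubernetes.io/name=dex
```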
Database management components
The following components handle database operations such as backup, restore, and cluster management tasks.
Medusa
Medusa provides backup and restore capabilities for database clusters. Medusa runs as a sidecar container in each database pod, so configure these resources per database node rather than per cluster.
| Resource | Value | Notes |
|---|---|---|
| CPU request | 100m | Baseline for normal operations |
| CPU limit | 1000m | Allows burst during backup operations |
| Memory request | 256Mi | Baseline for normal operations |
| Memory limit | 512Mi | Prevents excessive memory usage |
Increase resources when managing large databases with more than 1TB per node, performing frequent backups, or using slow object storage backends. For large databases, increase the CPU limit to 2000m and memory limit to 1Gi. Consider backup scheduling to avoid resource contention during peak usage periods.
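The large-database guidance above could look like this in the Medusa section of your cluster definition; field placement follows the cluster example later in this guide.

```yaml
# Scaled-up Medusa sidecar resources for nodes holding more than 1TB.
medusa:
  resources:
    requests:
      cpu: 100m
      memory: 256Mi
    limits:
      cpu: 2000m   # raised from the 1000m baseline
      memory: 1Gi  # raised from the 512Mi baseline
```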
For detailed information about Medusa configuration and usage, see Back up data.
Reaper
Reaper provides repair management for database clusters. Deploy one Reaper instance per database cluster to manage repair operations.
| Resource | Value | Notes |
|---|---|---|
| CPU request | 200m | Baseline for normal operations |
| CPU limit | 1000m | Allows burst during repair operations |
| Memory request | 512Mi | Baseline for normal operations |
| Memory limit | 1Gi | Prevents excessive memory usage |
| Replicas | 1 | One instance per database cluster |
Increase resources when managing large clusters with 10 or more nodes, running intensive repair schedules, or managing clusters with large data volumes. For large clusters, increase the CPU limit to 2000m and memory limit to 2Gi. Consider repair scheduling to avoid resource contention during peak usage periods.
For detailed information about Reaper configuration and usage, see Clean up data.
Stargate
Stargate provides API gateway functionality for database access, including REST, GraphQL, and Document APIs. Deploy Stargate when you need HTTP-based access to your database without configuring CQL drivers.
| Resource | Value | Notes |
|---|---|---|
| CPU request | 500m | Baseline for API operations |
| CPU limit | 2000m | Allows burst during high traffic |
| Memory request | 1Gi | Baseline for API operations |
| Memory limit | 2Gi | Prevents excessive memory usage |
| Replicas | 2 | High availability recommended |
Stargate resource requirements depend heavily on API traffic patterns and the number of concurrent connections. Increase resources when supporting more than 100 concurrent API connections, experiencing slow API response times, or handling large result sets. For high-traffic deployments, increase the CPU limit to 4000m, memory limit to 4Gi, and deploy three or more replicas.
Configure Stargate through the MissionControlCluster specification under the stargate section.
For more information about Stargate configuration, see the CRD reference documentation.
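The high-traffic guidance above could look like this in the `stargate` section of the MissionControlCluster specification; field placement follows the cluster example later in this guide.

```yaml
# High-traffic Stargate settings for more than 100 concurrent API connections.
stargate:
  size: 3          # three or more replicas for high-traffic deployments
  resources:
    requests:
      cpu: 500m
      memory: 1Gi
    limits:
      cpu: 4000m   # raised from the 2000m baseline
      memory: 4Gi  # raised from the 2Gi baseline
```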
Sidecar containers
Sidecar containers run alongside database pods to provide additional functionality such as observability and monitoring.
Vector sidecar
Vector runs as a sidecar container in database pods to collect and forward logs and metrics. Configure Vector sidecar resources separately from the observability stack’s Vector aggregator.
| Resource | Value | Notes |
|---|---|---|
| CPU request | 50m | Minimal baseline for log collection |
| CPU limit | 500m | Allows burst during high log volume |
| Memory request | 128Mi | Minimal baseline for log buffering |
| Memory limit | 256Mi | Prevents excessive memory usage |
Increase resources when database pods generate high log volumes, experiencing log collection delays, or running complex log transformation pipelines. For high-volume logging, increase the CPU limit to 1000m and memory limit to 512Mi.
Configure Vector sidecar resources through the telemetry.vector.resources section in your MissionControlCluster specification.
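The high-volume logging guidance above could look like this in the `telemetry.vector` section; field placement follows the cluster example later in this guide.

```yaml
# Scaled-up Vector sidecar resources for high log volumes.
telemetry:
  vector:
    enabled: true
    resources:
      requests:
        cpu: 50m
        memory: 128Mi
      limits:
        cpu: 1000m    # raised from the 500m baseline
        memory: 512Mi # raised from the 256Mi baseline
```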
cqlsh pod
The cqlsh pod provides an ephemeral CQL shell for interactive database access through the Mission Control UI.
These pods are created on-demand and terminated after use.
| Resource | Value | Notes |
|---|---|---|
| CPU request | 100m | Minimal baseline for shell operations |
| CPU limit | 500m | Sufficient for most queries |
| Memory request | 256Mi | Minimal baseline for query results |
| Memory limit | 512Mi | Prevents excessive memory usage |
Increase resources when running complex queries that return large result sets or experiencing slow query execution. For complex analytical queries, increase the CPU limit to 1000m and memory limit to 1Gi.
The cqlsh pod configuration is managed automatically by Mission Control and does not require manual resource configuration in most cases.
cert-manager
Cert-manager is an external dependency that Mission Control uses to manage TLS certificates for secure communication between components. While cert-manager is not part of Mission Control, you must install and configure it before deploying Mission Control.
| Resource | Value | Notes |
|---|---|---|
| CPU request | 10m | Minimal baseline |
| CPU limit | 100m | Sufficient for most deployments |
| Memory request | 32Mi | Minimal baseline |
| Memory limit | 128Mi | Sufficient for most deployments |
| Replicas | 1 | Single instance sufficient |
Cert-manager resource requirements are typically minimal as it primarily watches for certificate resources and renews them before expiration. Increase resources when managing more than 100 certificates or experiencing certificate renewal delays.
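Because cert-manager is installed separately, its resources are set in its own Helm values rather than in Mission Control's. The following sketch assumes the upstream cert-manager chart's top-level `resources` key, which applies to the controller; verify the key against your chart version.

```yaml
# Controller resources for the upstream cert-manager Helm chart.
# The top-level resources key is an assumption; check your chart's values schema.
resources:
  requests:
    cpu: 10m
    memory: 32Mi
  limits:
    cpu: 100m
    memory: 128Mi
```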
For information about cert-manager configuration and proper secret cleanup, see Configure cert-manager for proper secret cleanup.
Configure resources with Helm
Configure resource limits in your Helm values file to apply the recommendations in this guide. The following example shows how to configure resources for control plane operators, user interface components, authentication services, and metrics collection.
# Control plane operators
missionControlOperator:
  resources:
    requests:
      cpu: 100m
      memory: 256Mi
    limits:
      cpu: 1000m
      memory: 512Mi
cassOperator:
  resources:
    requests:
      cpu: 200m
      memory: 64Mi
    limits:
      cpu: 1000m
      memory: 256Mi
k8ssandraOperator:
  resources:
    requests:
      cpu: 100m
      memory: 64Mi
    limits:
      cpu: 1000m
      memory: 256Mi
# User interface
ui:
  replicas: 2
  resources:
    requests:
      cpu: 100m
      memory: 256Mi
    limits:
      cpu: 1000m
      memory: 512Mi
# Authentication
dex:
  replicas: 2
  resources:
    requests:
      cpu: 50m
      memory: 128Mi
    limits:
      cpu: 500m
      memory: 256Mi
# Metrics
kubeStateMetrics:
  resources:
    requests:
      cpu: 10m
      memory: 64Mi
    limits:
      cpu: 200m
      memory: 128Mi
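After editing the values file, apply it with a Helm upgrade. The release name, chart reference, and values file name below are placeholders for your installation.

```shell
# Apply the resource overrides to an existing installation.
# RELEASE_NAME and CHART_REFERENCE are placeholders for your deployment.
helm upgrade RELEASE_NAME CHART_REFERENCE \
  -n mission-control \
  -f values.yaml
```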
Configure database cluster resources
Configure Medusa, Reaper, Vector sidecar, and optional Stargate resources in your cluster definition to apply the recommendations for database management components. The following example shows how to configure resources for a database cluster.
apiVersion: missioncontrol.datastax.com/v1beta2
kind: MissionControlCluster
metadata:
  name: my-cluster
spec:
  k8ssandra:
    cassandra:
      datacenters:
        - metadata:
            name: dc1
          k8sContext: my-context
          size: 3
      # Vector sidecar configuration (per node)
      telemetry:
        vector:
          enabled: true
          resources:
            requests:
              cpu: 50m
              memory: 128Mi
            limits:
              cpu: 500m
              memory: 256Mi
    # Medusa configuration (per node)
    medusa:
      resources:
        requests:
          cpu: 100m
          memory: 256Mi
        limits:
          cpu: 1000m
          memory: 512Mi
    # Optional Stargate configuration
    stargate:
      size: 2
      resources:
        requests:
          cpu: 500m
          memory: 1Gi
        limits:
          cpu: 2000m
          memory: 2Gi
    # Reaper configuration (per cluster)
    reaper:
      resources:
        requests:
          cpu: 200m
          memory: 512Mi
        limits:
          cpu: 1000m
          memory: 1Gi
Monitor and adjust resources
Monitor resource usage to validate and adjust your configuration based on actual workload patterns. Use the following commands to check current resource usage and identify resource constraints.
Check operator resource usage:
kubectl top pod -n mission-control -l app.kubernetes.io/name=mission-control
Check kube-state-metrics resource usage:
kubectl top pod -n mission-control -l app.kubernetes.io/name=kube-state-metrics
Check Medusa resource usage in the database namespace:
kubectl top pod -n DATABASE_NAMESPACE -l app.kubernetes.io/name=medusa
Check Reaper resource usage in the database namespace:
kubectl top pod -n DATABASE_NAMESPACE -l app.kubernetes.io/name=reaper
Check Stargate resource usage in the database namespace:
kubectl top pod -n DATABASE_NAMESPACE -l app.kubernetes.io/name=stargate
Replace DATABASE_NAMESPACE with the namespace of the database you want to inspect.
Check for pods hitting resource limits:
kubectl get events -n mission-control --field-selector reason=OOMKilled
Check a pod's configured CPU and memory limits to compare against observed usage:
kubectl describe pod -n mission-control POD_NAME | grep -A 5 "Limits"
Replace POD_NAME with the name of the pod you want to inspect.
Adjust resources based on observed patterns. Increase CPU limits when you observe high CPU usage, increase memory limits when you observe memory pressure, increase both CPU and memory when you observe slow operations, and significantly increase memory limits when you observe OOMKilled events.
Best practices
Follow these best practices when configuring resources for Mission Control components:
- Start with the recommended baseline values in this guide and adjust based on monitoring data.
- Always configure both requests and limits to ensure predictable scheduling and resource allocation.
- Monitor resource usage continuously through the Mission Control dashboards to identify trends and potential issues.
- Plan for expected growth over the next 6-12 months when sizing resources.
- Validate sizing with realistic workloads before deploying to production.
- Document resource adjustments and their impact for future reference.
- Review and reassess sizing quarterly or after significant changes to your deployment.
Troubleshoot resource issues
Use the following guidance to diagnose and resolve common resource-related problems in your Mission Control deployment.
Pods stuck in pending state
Pods remain in the Pending state when the scheduler cannot find nodes with sufficient resources to place them.
Check node resources to verify available capacity, verify that resource requests are not too high for available nodes, ensure sufficient nodes are available in the cluster, and check for resource quotas in the namespace that might prevent scheduling.
Check node resources:
kubectl describe node NODE_NAME
Replace NODE_NAME with the name of the node you’re investigating.
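To narrow down the scheduling checks described above, you can also list Pending pods and any namespace quotas directly:

```shell
# List pods stuck in Pending; describe one to see the scheduler's reason.
kubectl get pods -n mission-control --field-selector=status.phase=Pending
# Check for namespace resource quotas that might block scheduling.
kubectl get resourcequota -n mission-control
```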
Performance issues
Components become slow or unresponsive when resource limits are too low or when experiencing resource contention. Check current resource usage to identify components approaching their limits, review pod logs for errors or warnings related to resource constraints, check for CPU throttling that might slow down operations, and increase resource limits if usage is consistently high.
Check current resource usage:
kubectl top pod -n mission-control
Review pod logs:
kubectl logs -n mission-control POD_NAME
Replace POD_NAME with the name of the pod you’re investigating.
Check a pod's configured limits and recent events for signs of CPU throttling:
kubectl describe pod -n mission-control POD_NAME
Replace POD_NAME with the name of the pod you’re investigating.
Out of memory errors
Kubernetes terminates pods with OOMKilled status when they exceed their memory limits. Check memory usage patterns to identify components with high memory consumption, review pod events to confirm OOMKilled status, increase memory limits by 50-100%, and monitor for memory leaks if OOMKilled events persist after increasing limits.
Check memory usage:
kubectl top pod -n mission-control --containers
Review pod events:
kubectl get events -n mission-control --field-selector involvedObject.name=POD_NAME
Replace POD_NAME with the name of the pod you’re investigating.
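Once you've confirmed OOMKilled events, the 50-100% limit increase described above can be applied in place with kubectl. The deployment name and new limit below are placeholders; the 1Gi value assumes a 512Mi baseline.

```shell
# Double the memory limit on a component that keeps getting OOMKilled.
# DEPLOYMENT_NAME is a placeholder for the affected component's Deployment.
kubectl set resources deployment/DEPLOYMENT_NAME -n mission-control \
  --limits=memory=1Gi
```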