Platform components and operations
This reference guide provides detailed information about the various components and operations that make up Mission Control. Use this guide to understand the purpose of each component and to troubleshoot issues when they arise.
Platform architecture
Mission Control consists of several key components deployed as Kubernetes pods, organized into deployments and stateful sets. Each component pod has a specific role in the platform’s operation.
The platform components work together to provide a complete database management solution:
- The mission-control-operator, cass-operator, and k8ssandra-operator pods form the control plane, managing the lifecycle of database clusters and their components. For information about required service account permissions, see Service account permissions.
- The ui and dex pods provide the user interface and authentication services, allowing users to interact with the platform.
- The observability stack includes loki, mimir, and aggregator pods for collecting and storing metrics and logs from all platform components and database nodes.
- The database management components include medusa and reaper, which provide backup, restore, and repair capabilities for database clusters.
Mission Control deploys these components in its namespace, while it deploys database clusters in separate namespaces. The platform components communicate with each other and with database clusters through the Kubernetes API.
Platform components
The platform components form the foundation of the Mission Control platform, with each component running as one or more Kubernetes pods. Understanding their roles and requirements is essential for effective platform management and troubleshooting.
Quick reference
The following table provides a quick reference for each platform component, including its purpose, common issues, and documentation.
Component | Purpose | Common issues | Documentation |
---|---|---|---|
mission-control-operator | Custom resources | | |
cass-operator | Apache Cassandra® operations | | |
k8ssandra-operator | K8ssandra operations | | |
kube-state-metrics | Metrics collection | | |
ui | Web interface and REST API | | |
dex | Authentication service | | |
loki | Log aggregation | | |
mimir | Metrics storage | | |
aggregator | Metrics aggregation and forwarding with Vector | | |
medusa | Backup and restore | | |
reaper | Repair management | | |
Detailed component information
The following sections provide detailed information about each platform component, including its role, configuration options, and common issues.
Control plane components
- mission-control-operator
- cass-operator
- k8ssandra-operator
The Mission Control operator manages custom resources and cluster operations, specifically handling the creation and management of database clusters across various hosting options. Additional resources might be needed for managing complex cluster configurations.
The Mission Control operator reconciles custom resources automatically, particularly during cluster creation or configuration changes like adding nodes or modifying cluster settings.
Review the operator logs in the mission-control-operator pod for specific errors and check resource creation.
Verify that the service account has the necessary permissions as described in the Service account permissions documentation.
Two main types of issues may affect the mission-control-operator: reconciliation failures and resource quota problems.
Resource creation issues may stop the mission-control-operator from functioning properly, especially when deploying new clusters or performing operations like node replacements.
Review the resource creation logs in the mission-control-operator pod and check resource quotas in the target namespace.
Verify that the service account has the necessary permissions to create resources in the target namespace.
Monitor the cluster creation process through the Mission Control UI or API to identify any issues early, particularly during operations like backup/restore or rolling restarts.
If issues persist, contact DataStax support for assistance.
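As a starting point for the log review described above, the following minimal sketch builds a `kubectl logs` command and filters the output for lines that look like errors. The Deployment name `mission-control-operator`, the namespace `mission-control`, and the keyword list are assumptions for illustration; substitute the names used in your installation.

```python
import subprocess


def kubectl_logs_command(workload, namespace, tail=200):
    """Build the argv for `kubectl logs` against a Deployment."""
    return ["kubectl", "logs", f"deployment/{workload}",
            "--namespace", namespace, f"--tail={tail}"]


def error_lines(log_text):
    """Return log lines that mention an error, failure, or denied request."""
    keywords = ("error", "failed", "forbidden")  # assumed keywords, adjust as needed
    return [line for line in log_text.splitlines()
            if any(keyword in line.lower() for keyword in keywords)]


def fetch_operator_errors(workload="mission-control-operator",
                          namespace="mission-control"):
    """Run kubectl and return only the suspicious lines (requires cluster access)."""
    result = subprocess.run(kubectl_logs_command(workload, namespace),
                            capture_output=True, text=True, check=True)
    return error_lines(result.stdout)
```

Lines containing `forbidden` often point to missing service account permissions, which is why that keyword is included in the filter.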
The cass-operator manages the lifecycle of all Apache Cassandra®, DataStax Enterprise (DSE), and Hyper-Converged Database (HCD) logical datacenters.
Additional resources might be needed for managing multiple clusters.
The operator reconciles database resources through these automated steps:
- Cluster-level operators detect new custom resources through the Kubernetes API.
- Datacenter-level operators detect new datacenter-level custom resources.
- Datacenter-level operators generate and submit rack-level StatefulSets to their local Kubernetes API.
- Kubernetes reconciliation loops create pods and storage resources for the database nodes.
- Operators at the datacenter and cluster levels receive the status of resource creation.
- When all pods run successfully, the operator begins bootstrap operations.
- The operator continues operations until all nodes run properly and the Kubernetes API can discover their services.
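The steps above follow the general reconciliation pattern: compare the desired state with the observed state, act on the difference, and repeat until nothing is left to do. The sketch below is a toy model of that pattern only, not the cass-operator implementation. The rack names, the action names, and the `apply` function that stands in for Kubernetes are all hypothetical.

```python
def reconcile(desired, observed):
    """Return the actions that move observed state toward desired state.

    Both arguments map a rack name to a node count.
    """
    actions = []
    for rack, want in sorted(desired.items()):
        have = observed.get(rack, 0)
        if have < want:
            actions.append(("scale_up", rack, want - have))
        elif have > want:
            actions.append(("scale_down", rack, have - want))
    for rack in sorted(set(observed) - set(desired)):
        actions.append(("delete", rack, observed[rack]))
    return actions


def apply(observed, actions):
    """Apply actions to a copy of the observed state (stand-in for Kubernetes)."""
    state = dict(observed)
    for kind, rack, count in actions:
        if kind == "scale_up":
            state[rack] = state.get(rack, 0) + count
        elif kind == "scale_down":
            state[rack] -= count
        else:  # delete
            state.pop(rack)
    return state


def run_until_converged(desired, observed, max_rounds=10):
    """Repeat reconcile/apply until no actions remain."""
    for _ in range(max_rounds):
        actions = reconcile(desired, observed)
        if not actions:
            return observed
        observed = apply(observed, actions)
    raise RuntimeError("state did not converge")
```

A reconciliation failure in this model corresponds to `apply` being unable to carry out an action, for example because a quota blocks a scale-up, so the loop never reaches an empty action list.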
The operator may experience two main types of issues: reconciliation failures and resource quota problems.
Reconciliation failures occur due to resource constraints or configuration issues during automated operations like rolling restarts, node replacements, or datacenter additions.
When this happens, review the operator logs in the cass-operator pod for specific errors and check the resource quotas.
Verify that the service account has the necessary permissions as described in the Service account permissions documentation.
You may encounter resource quota issues when you manage multiple clusters across different environments. To resolve these issues, review and adjust the resource quotas in the Kubernetes namespace where Mission Control is deployed. Look for resource leaks after operations like backup/restore or node replacements fail. Monitor resource utilization regularly through the Mission Control dashboards to identify these issues early.
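To make the quota review concrete, the following sketch computes how much of each quota is in use from the `hard` and `used` fields of a Kubernetes ResourceQuota status. It handles only a subset of Kubernetes quantity suffixes, and the 80% threshold and sample values are assumptions chosen for illustration.

```python
BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def parse_quantity(value):
    """Convert a quantity such as '500m', '2', or '4Gi' to a float.

    Only plain numbers, the milli suffix 'm', and the binary suffixes above
    are handled; other suffixes raise ValueError.
    """
    value = str(value)
    for suffix, factor in BINARY_SUFFIXES.items():
        if value.endswith(suffix):
            return float(value[:-len(suffix)]) * factor
    if value.endswith("m"):
        return float(value[:-1]) / 1000
    return float(value)


def quota_utilization(status):
    """Return {resource: used / hard} for a ResourceQuota status dict."""
    hard = status.get("hard", {})
    used = status.get("used", {})
    result = {}
    for name, limit in hard.items():
        limit_value = parse_quantity(limit)
        if limit_value > 0:
            result[name] = parse_quantity(used.get(name, "0")) / limit_value
    return result


def near_limit(status, threshold=0.8):
    """Return the resources whose utilization is at or above the threshold."""
    return sorted(name for name, ratio in quota_utilization(status).items()
                  if ratio >= threshold)
```

The status dict can be taken from the `status` field of `kubectl get resourcequota -o json` output for the namespace in question.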
The k8ssandra-operator manages the lifecycle of all K8ssandra resources including cluster definitions, Reaper, and Medusa components.
Additional resources might be needed for managing complex cluster configurations.
The operator reconciles K8ssandra resources automatically, particularly during cluster creation or configuration changes like adding nodes or modifying cluster settings.
Review the operator logs in the k8ssandra-operator pod for specific errors and check resource creation.
Verify that the service account has the necessary permissions as described in the Service account permissions documentation.
Common reconciliation issues include failed cluster provisioning and configuration validation errors during operations like cluster upgrades or datacenter modifications.
Resource creation issues can prevent the operator from functioning properly, especially when deploying new clusters or performing operations like node replacements. Review the resource creation logs in the operator pod and check resource quotas in the target namespace. Verify that the service account has the necessary permissions to create resources in the target namespace. Monitor the cluster creation process through the Mission Control UI or API to identify any issues early, particularly during operations like backup/restore or rolling restarts. If issues persist, contact DataStax support for assistance.
User interface components
The ui component provides the web interface and REST API for Mission Control.
The UI service handles user authentication, cluster management, and monitoring through a web-based interface. It also exposes a REST API for programmatic access to Mission Control functionality.
Common issues include service availability problems and resource quota issues. Review the UI service logs for specific errors and check resource quotas. Verify that the service account has the necessary permissions as described in the Service account permissions documentation.
If you cannot access the UI service, check the pod status and logs for errors. Ensure that you properly configured the service and that all dependencies are available. Monitor the service through the Mission Control dashboards to identify any issues early. If issues persist, contact DataStax support for assistance.
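The pod status check can be scripted against the JSON that `kubectl get pods -o json` returns. The sketch below lists pods that are not running or that have containers that are not ready. The pod names in the usage example are hypothetical.

```python
def unhealthy_pods(pod_list):
    """Return (pod name, reason) pairs for pods that need attention.

    `pod_list` is the parsed output of `kubectl get pods -o json`.
    """
    problems = []
    for pod in pod_list.get("items", []):
        name = pod["metadata"]["name"]
        status = pod.get("status", {})
        phase = status.get("phase", "Unknown")
        if phase == "Succeeded":
            continue  # completed Job pods are expected to stop
        if phase != "Running":
            problems.append((name, f"phase={phase}"))
            continue
        not_ready = [container["name"]
                     for container in status.get("containerStatuses", [])
                     if not container.get("ready", False)]
        if not_ready:
            problems.append((name, "not ready: " + ", ".join(not_ready)))
    return problems
```

For example, a pod list with one pending pod and one running pod whose container is not ready yields one entry for each, which narrows down which logs to read first.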
Authentication components
The dex component provides authentication services for Mission Control.
Dex handles user authentication and authorization, supporting various authentication methods including:
- Static credentials
- LDAP
- OAuth2
- OpenID Connect
Common issues include authentication failures and resource quota problems. Review the Dex service logs for specific errors and check resource quotas. Verify that the service account has the necessary permissions as described in the Service account permissions documentation.
If you encounter authentication issues, check the Dex configuration and ensure that you properly configured all authentication providers. Monitor the service through the Mission Control dashboards to identify any issues early. If issues persist, contact DataStax support for assistance.
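One quick check when authentication fails is whether Dex serves a consistent OpenID Connect discovery document. The sketch below builds the standard discovery URL and inspects an already-fetched document. The issuer URL in the usage example is hypothetical, and the list of expected fields is a choice made for this sketch rather than a complete validation.

```python
# Fields this sketch expects to find in the discovery document.
EXPECTED_FIELDS = ("issuer", "authorization_endpoint", "token_endpoint", "jwks_uri")


def discovery_url(issuer):
    """Return the OpenID Connect discovery URL for an issuer."""
    return issuer.rstrip("/") + "/.well-known/openid-configuration"


def check_discovery(issuer, document):
    """Return a list of problems found in a parsed discovery document."""
    problems = [f"missing field: {field}"
                for field in EXPECTED_FIELDS if field not in document]
    if document.get("issuer") != issuer:
        # Clients compare the configured issuer with the published one exactly.
        problems.append("issuer mismatch")
    return problems
```

An issuer mismatch usually means that the URL clients use to reach Dex differs from the issuer in the Dex configuration.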
Observability components
- kube-state-metrics
- loki
- mimir
- aggregator
The kube-state-metrics component monitors the state of various Kubernetes resources and exposes that state as metrics for monitoring and alerting.
Review the metrics logs in the kube-state-metrics pod for specific errors and check resource quotas.
Verify that the service account has the necessary permissions as described in the Service account permissions documentation.
Common issues include collection failures and resource quota problems during high-volume operations.
Resource quota issues can prevent the component from functioning properly, especially when collecting metrics from large clusters. Review the resource quotas in the target namespace and adjust them if necessary. Monitor the metrics collection process through the Mission Control dashboards to identify any issues early. If issues persist, contact DataStax support for assistance.
Loki is the log aggregation system for Mission Control. It collects and stores logs from all platform components and database nodes.
Mimir is the metrics storage and query system for Mission Control. It provides long-term storage and efficient querying of metrics collected from all platform components and database nodes.
The aggregator component controls Vector for metrics aggregation in Mission Control.
It collects metrics from all platform components and database nodes and forwards them to Prometheus.
Database management components
- Medusa
- Reaper
Medusa is the backup and restore system for Mission Control. It provides point-in-time backup and restore capabilities for database clusters.
For detailed information about Medusa configuration and usage, see Backup and restore.
Reaper is the repair management system for Mission Control. It provides automated repair scheduling and execution for database clusters.
For detailed information about Reaper configuration and usage, see Repair management.
Deployment considerations
This section covers important considerations for deploying and managing Mission Control in various environments.
Security context
Configure workloads to comply with security measures:
- Use non-root user settings
- Implement read-only file systems where appropriate
- Manage capabilities according to security requirements
- Follow the principle of least privilege for service accounts
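The first three items map onto fields of the Kubernetes container `securityContext`. As a rough aid, the sketch below reports which of those fields a container spec leaves unset. It is a generic check on a container dict, not a Mission Control feature, and a finding is not always an error: some workloads legitimately need a writable root file system, and `runAsNonRoot` may be set at the pod level instead.

```python
def security_findings(container):
    """Return findings for one container spec (a dict shaped like a
    Kubernetes container)."""
    context = container.get("securityContext", {})
    findings = []
    if context.get("runAsNonRoot") is not True:
        findings.append("runAsNonRoot is not set to true")
    if context.get("readOnlyRootFilesystem") is not True:
        findings.append("readOnlyRootFilesystem is not set to true")
    if context.get("allowPrivilegeEscalation") is not False:
        findings.append("allowPrivilegeEscalation is not set to false")
    dropped = context.get("capabilities", {}).get("drop", [])
    if "ALL" not in dropped:
        findings.append("capabilities.drop does not include ALL")
    return findings
```

Running the check over every container in a rendered manifest gives a short list of workloads to review against your security policy.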
Performance best practices
Follow these recommendations for optimal performance:
- Utilize external object storage solutions for long-term data retention
- Deploy a minimum of two nodes for platform workloads
- Monitor and adjust resource allocations based on workload
- Implement proper resource limits and requests
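To check the last recommendation, the following sketch scans a pod spec for containers that lack CPU or memory requests or limits. It works on a plain dict shaped like a Kubernetes pod spec; the container names in any input are whatever your manifests define.

```python
def missing_resources(pod_spec):
    """Return {container name: [missing entries]} for containers that lack
    CPU or memory requests or limits."""
    report = {}
    for container in pod_spec.get("containers", []):
        resources = container.get("resources", {})
        missing = [f"{kind}.{name}"
                   for kind in ("requests", "limits")
                   for name in ("cpu", "memory")
                   if name not in resources.get(kind, {})]
        if missing:
            report[container["name"]] = missing
    return report
```

Some teams deliberately omit CPU limits to avoid throttling, so treat the report as a prompt for review rather than a list of defects.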
Kubernetes compatibility
Mission Control supports various deployment environments:
- Bare-metal Kubernetes clusters
- Cloud-managed Kubernetes services:
  - Amazon EKS
  - Google GKE
  - Azure AKS
- On-premises infrastructure
- Cloud-based infrastructure
The system is packaged as a Helm chart, allowing for customizable deployments through:
- KOTS admin console
- Helm-based tooling
Configuration management
Manage Mission Control configuration through:
- values.yaml file for chart and sub-chart settings
- Version-controlled configuration changes
- Iterative updates and modifications
- Environment-specific customizations
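Environment-specific customizations are commonly kept as small override files layered on top of a shared base. The sketch below shows the idea with a recursive merge that is similar in spirit to how Helm layers multiple values files: nested maps are merged and later values win. The keys used in the usage example are hypothetical and are not taken from the Mission Control chart.

```python
import copy


def merge_values(base, override):
    """Return base with override layered on top.

    Nested dicts are merged key by key; any other value (including a list)
    in override replaces the value in base. Neither input is modified.
    """
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_values(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged
```

For example, merging a production override of `{"loki": {"backend": {"replicas": 3}}}` into a base that also sets other `loki` keys changes only the replica count and leaves the rest of the base intact.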
Component instances and deployment types
The following table lists all component instances and their deployment types.
Components with a -n suffix, where n is a number, are deployed as multiple instances for high availability and scalability.
For example, loki-backend-0, loki-backend-1, and loki-backend-2 represent three instances of the Loki backend component.
Component instance | Purpose | Deployment type |
---|---|---|
 | Platform operator | Deployment |
 | Apache Cassandra® operator | Deployment |
 | K8ssandra operator | Deployment |
 | Log storage backend | StatefulSet |
 | Log query gateway | Deployment |
 | Log read operations | Deployment |
 | Log write operations | Deployment |
 | Metrics ingestion | StatefulSet |
 | Metrics compaction | StatefulSet |
 | Metrics distribution | Deployment |
 | Alert management | StatefulSet |
 | Metrics storage gateway | StatefulSet |
 | Backup cleanup | Job |
The number of instances for StatefulSet components is determined by the configuration in the Helm values. For more information about configuring these components, see Observability metrics.
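The -n suffix convention described earlier in this section makes it easy to count instances per component from a list of pod names. The sketch below groups names by their numeric suffix. Only `loki-backend` comes from this guide; any other names you pass in are your own. Pods whose names end in a non-numeric generated suffix are ignored.

```python
import re
from collections import defaultdict

# A component name followed by a dash and a purely numeric ordinal.
ORDINAL = re.compile(r"^(?P<component>.+)-(?P<ordinal>\d+)$")


def group_instances(pod_names):
    """Return {component: sorted ordinals} for names ending in -<number>."""
    groups = defaultdict(list)
    for name in pod_names:
        match = ORDINAL.match(name)
        if match:
            groups[match["component"]].append(int(match["ordinal"]))
    return {component: sorted(ordinals) for component, ordinals in groups.items()}
```

Comparing the grouped ordinals with the replica count configured in the Helm values shows at a glance whether an instance is missing.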