Astra DB Architecture FAQ

Astra DB is a globally distributed, serverless, multi-model database service built by DataStax that runs on the cloud provider of your choice. It’s the first and only serverless, multi-region database service based on an open-source NoSQL database, specifically Apache Cassandra®.

The following are frequently asked questions that summarize how Astra DB and Cassandra work together.

How is Astra DB different from Cassandra?

DataStax has adapted Cassandra into a multi-tenant database for the serverless needs of Astra users. Our design enables fine-grained, elastic scalability of individual components to meet the capacity demands of modern application workloads.

Cassandra has a proven track record of successfully managing data workloads for some of the most mission-critical, high-throughput, global-scale, always-on, real-time applications. However, Cassandra scales by adding or removing whole nodes, each of which handles reads, writes, compaction, and repair together.

This scale-all-or-nothing approach means that scaling reads and writes independently from compaction is not an option, and scaling down presents a unique set of challenges to operators.

How is Astra DB different from DSE?

While you can run DataStax Enterprise (DSE) both on premises and in the cloud, Astra DB allows you to create databases in minutes, with reduced complexity, fewer operational concerns, and an approachable entry point for developing your applications. Astra DB is a true database-as-a-service (DBaaS) offering that provides simplified cluster management with flexible pricing. DSE includes Advanced Workloads (DSE Search, Analytics, and Graph), which are not available in Astra DB. Read this whitepaper for an in-depth understanding of the cloud-native architecture of Astra DB.

What is the Microservices Architecture?

Our new Microservices Architecture is designed to be cloud-native and multi-cloud. These services run on top of Kubernetes, and fully managed object storage services in AWS, Google Cloud, and Azure provide the data storage, which enables Astra DB to operate in any of the three public clouds.

Which cloud vendors does Astra DB support?

Astra DB supports creating databases on Amazon Web Services (AWS), Google Cloud, and Microsoft Azure.

How does the control plane work with Astra DB?

Astra DB is composed of many smaller independent services that belong to the control plane, data plane, infrastructure services, and object storage services. Kubernetes is used to scale and orchestrate all services in the data plane and infrastructure services.

The control plane is responsible for maintaining and configuring the data plane based on customer-specified settings and operational status reported by the data plane. Data plane services are responsible for executing user and application data requests and for performing required database maintenance operations in the background. These services include:

  • Astra DB K8s Operator

  • Commit Log Replayer Service

  • Compaction Service

  • Controller Service for AIOps

  • Coordination Service

  • Data Service

  • IAM Service

  • Object Storage Services

What is multi-tenancy?

To have a proper multi-tenant system, there are several factors you must consider:

  • A balance between cost efficiency and performance guarantees

  • Performance isolation

  • Fault isolation

A tenant that experiences a sudden increase in traffic, a software bug, or a denial-of-service attack should not have any substantial effect on other tenants that share the same underlying infrastructure.

In Astra DB, there can be multiple Kubernetes clusters, each serving thousands of tenants. Each tenant is assigned to a single shared-tenancy Kubernetes cluster, a number of services that depends on its workload requirements, and one S3, GCS, or ABS bucket.
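For illustration only, that placement could be modeled as a simple record. None of these class or field names come from Astra DB; this is a hypothetical sketch of how one tenant maps to one cluster, a workload-sized set of services, and one bucket:

```python
from dataclasses import dataclass

# Hypothetical illustration of the placement described above: each tenant
# maps to exactly one shared-tenancy Kubernetes cluster, a workload-sized
# set of services, and exactly one object storage bucket.
@dataclass(frozen=True)
class TenantPlacement:
    tenant_id: str
    k8s_cluster: str       # the single shared-tenancy cluster
    data_services: tuple   # size depends on workload requirements
    bucket: str            # one S3, GCS, or ABS bucket per tenant

placement = TenantPlacement(
    tenant_id="tenant-42",
    k8s_cluster="astra-cluster-us-east1-01",
    data_services=("data-svc-3", "data-svc-7"),
    bucket="s3://astra-tenant-42-data",
)
print(placement)
```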

How does multi-tenancy work with metadata?

A multi-tenant coordinator or data service is responsible for one or more Cassandra keyspaces and sets of tokens per tenant. Tenants also share the same metadata service, with each tenant’s metadata stored separately. Each tenant has a separate bucket in the object storage service where all of that tenant’s keyspaces, tables, and data reside.

What is shuffle sharding?

To assign tenants to different services, Astra DB uses a shuffle sharding algorithm. Because shuffle shards may overlap (one or more services can belong to two different shuffle shards), additional guarantees about the overlap are required. In particular, for any two shuffle shards, the overlap should be at most one data service.

Shuffle sharding helps with tenant performance and fault isolation. Different clusters can have distinct multi-tenancy policies to accommodate customers with different requirements.
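As a rough sketch of the idea, and not Astra DB’s actual implementation, the following picks a small random subset of services for each tenant and rejects any candidate shard that shares more than one data service with an existing shard. The service names, shard size, and tenant count are made up for illustration:

```python
import random
from itertools import combinations

# Sketch of shuffle sharding: each tenant gets a small random subset
# ("shard") of data services, and any two shards may overlap in at most
# one service, so a noisy tenant can affect at most one service that any
# other tenant depends on.
def assign_shard(services, existing_shards, shard_size=3, max_tries=1000):
    for _ in range(max_tries):
        candidate = frozenset(random.sample(services, shard_size))
        if all(len(candidate & shard) <= 1 for shard in existing_shards):
            return candidate
    raise RuntimeError("could not find a shard with overlap <= 1")

services = [f"data-svc-{i}" for i in range(12)]
shards = []
for tenant in range(6):
    shard = assign_shard(services, shards)
    shards.append(shard)
    print(f"tenant-{tenant}: {sorted(shard)}")

# Sanity check: every pair of shards overlaps in at most one service.
assert all(len(a & b) <= 1 for a, b in combinations(shards, 2))
```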

How does Astra DB provide smart auto-scaling?

Auto scaling is a hard optimization problem that aims to minimize the total cost of computational resources while meeting the continuously changing demand of every tenant. Auto scaling enables database service elasticity, which ultimately boils down to how many operations per second a tenant currently needs.

Astra DB initially assigns some default rate limits per service per tenant. It then dynamically adjusts those limits up to some predefined maximum allowed values and automatically adds or removes services according to the demand. Auto scaling decisions are made and applied by the controller service for artificial intelligence for IT operations (AIOps) and the Astra DB K8s operator service.
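A minimal sketch of that adjustment loop, assuming hypothetical default and maximum limits and simulated demand readings (none of these names or numbers come from Astra DB):

```python
# Hypothetical sketch of the rate-limit adjustment described above:
# start each tenant at a default limit, then move the limit toward the
# observed demand without exceeding a predefined maximum.
DEFAULT_LIMIT = 1_000   # ops/second granted to a new tenant (assumed)
MAX_LIMIT = 20_000      # predefined ceiling per tenant (assumed)
STEP = 0.25             # fraction of the gap closed per adjustment cycle

def adjust_limit(current_limit, observed_demand):
    """Nudge the tenant's rate limit toward demand, capped at MAX_LIMIT."""
    target = min(observed_demand, MAX_LIMIT)
    return current_limit + STEP * (target - current_limit)

limit = DEFAULT_LIMIT
for demand in (800, 2_500, 9_000, 30_000, 4_000):  # simulated readings
    limit = adjust_limit(limit, demand)
    print(f"demand={demand:>6} ops/s -> limit={limit:,.0f} ops/s")
```

In Astra DB, decisions like these also add or remove services as demand changes; the sketch only shows the per-tenant limit side of the loop.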

How do Astra DB improvements relate to Cassandra’s compaction, repair, and more?

  • Compaction

    In Astra DB, compaction services use the unified compaction strategy (UCS). It’s a hybrid of Cassandra’s size tiered compaction strategy (STCS), leveled compaction strategy (LCS), and time window compaction strategy (TWCS). Sharding is automatically configurable for different scenarios, including using TWCS to shard time series workloads.

  • Repair

    When using Astra DB, compaction services compact data in SSTables and simultaneously repair any inconsistencies among replicas. This is possible because a compaction service has access to every replica’s data files through an object storage service. Repairing during compaction is ideal because the cost of the added repair is very low in this scenario, and it is significantly faster than running compaction and repair separately (see the sketch after this list).

  • Bootstrapping

    In Astra DB, object storage services are responsible for data storage, which is separate from compute. When a new tenant is assigned to a data service:

    • There is no need to stream replica data for the tokens that the service is managing.

    • Data still resides in the object storage and does not need to be moved.

This feature is analogous to creating a new pointer to an existing data set rather than copying the whole data set to create a new one. The whole process is very efficient. Data services do have read caches that get gradually hydrated to speed up read requests.
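To illustrate the repair-during-compaction idea in the simplest possible terms, the sketch below merges hypothetical replica files with a last-write-wins rule on the write timestamp. It is a conceptual toy, not Astra DB’s compaction service:

```python
# Conceptual sketch (not Astra DB internals): because a compaction service
# can read every replica's data files from object storage, it can resolve
# divergent replica values while it merges and rewrites the data, keeping
# the newest write per key (last-write-wins on the cell timestamp).
def compact_and_repair(replica_sstables):
    """Merge rows from all replicas, keeping the newest write per key."""
    merged = {}
    for sstable in replica_sstables:             # one dict per replica file
        for key, (value, write_ts) in sstable.items():
            if key not in merged or write_ts > merged[key][1]:
                merged[key] = (value, write_ts)
    return merged                                # repaired, compacted output

replica_a = {"user:1": ("alice", 100), "user:2": ("bob", 90)}
replica_b = {"user:1": ("alice", 100), "user:2": ("robert", 120)}  # newer write
print(compact_and_repair([replica_a, replica_b]))
# {'user:1': ('alice', 100), 'user:2': ('robert', 120)}
```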
