Manage collections and tables

Collections and tables are containers for data within keyspaces in a database.

Whether you use a collection or table depends on your database type, your data’s schema type, and how strictly you want to enforce the schema:

Collections

Collections use dynamic schemas and store data in documents. With a dynamic schema, each document can have different fields. Collections are best for semi-structured data.

Only Serverless (vector) databases support collections.

Tables

Tables use fixed schemas and store data in rows. With a fixed schema, all rows must have the same columns, and every column must have a value, which can be null. Tables are best for structured data.

Both Serverless (non-vector) and Serverless (vector) databases support tables.
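
To make the difference concrete, the following minimal sketch models collection documents as dictionaries with varying fields, and table rows as records that always carry the same columns. The `COLUMNS` tuple and the `to_row` helper are hypothetical names used only for illustration; they are not part of any Astra DB API.

```python
# Documents in a collection: each one can have different fields.
documents = [
    {"_id": "1", "name": "Ada", "email": "ada@example.com"},
    {"_id": "2", "name": "Lin", "tags": ["admin"], "age": 41},
]

# A table has a fixed set of columns (hypothetical schema for illustration).
COLUMNS = ("name", "email", "age")


def to_row(record: dict) -> dict:
    """Project a record onto the fixed columns; missing values become None (null)."""
    unknown = set(record) - set(COLUMNS)
    if unknown:
        # A fixed schema rejects columns that were never defined.
        raise ValueError(f"columns not in schema: {sorted(unknown)}")
    return {column: record.get(column) for column in COLUMNS}


rows = [
    to_row({"name": "Ada", "email": "ada@example.com"}),
    to_row({"name": "Lin", "age": 41}),
]
```

Every row ends up with the same three keys, with `None` standing in for null, while the documents keep whatever fields they were given.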

Prerequisites

Consider your data model

While an optimal data model isn’t necessary for small tests, it is important for production applications and for the development and testing environments that support them.

Before you create collections and tables for production applications, take time to prepare an effective data model for your application’s needs. Consider the following:

  • Data types and schemas that you want to use.

    For example, if you want to enforce the schema, then you must use tables.

  • Which data needs to be indexed.

    For example, collections index all fields by default. If you want to apply selective indexing, you must create your collection with the Data API and define the indexing clause.

  • How you want to query the data.

    For example, if you want to use Astra DB’s built-in vector search capabilities, then you must store your data in a Serverless (vector) database.
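
As an illustration of the selective indexing consideration above, the following sketch builds the JSON body for a Data API request that creates a collection with an indexing clause. It assumes that the `createCollection` command accepts an `options.indexing` object containing either an `allow` list or a `deny` list; confirm the exact shape in the Data API reference. The helper name, collection name, and field names are hypothetical.

```python
import json


def create_collection_command(name, *, allow=None, deny=None):
    """Build a createCollection command body with an optional indexing clause."""
    if allow is not None and deny is not None:
        raise ValueError("use either allow or deny, not both")
    command = {"createCollection": {"name": name}}
    if allow is not None:
        # Index only the listed fields.
        command["createCollection"]["options"] = {"indexing": {"allow": list(allow)}}
    elif deny is not None:
        # Index every field except the listed ones.
        command["createCollection"]["options"] = {"indexing": {"deny": list(deny)}}
    return command


# Index only the fields that queries filter or sort on (hypothetical field names).
payload = create_collection_command("products", allow=["sku", "category"])
print(json.dumps(payload, indent=2))
```

Calling the helper without `allow` or `deny` omits the indexing clause, which corresponds to the default behavior of indexing all fields.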

Understand vector data settings

Both collections and tables can store vector and non-vector data, and it is a common practice to store vector data alongside non-vector metadata. However, if you want to use Astra DB’s built-in vector search capabilities, then you must store your data in a Serverless (vector) database.

For tables, you can create, modify, and drop vector columns and indexes at any time.

However, for collections, you must configure vector-related settings when you create the collection. This includes the following:

  • Support for vector data, also known as a vector-enabled collection

  • The number of dimensions and the similarity metric for the vectors in your dataset

  • An embedding provider integration, if you plan to use one or might use one in the future

  • Support for hybrid search

  • Indexing

For vector-enabled collections, you decide how to provide embeddings:

  • Generate embeddings outside Astra DB, and then load the embeddings when you insert data.

  • Use an embedding provider integration to automatically generate embeddings.

  • Use both options.
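
The vector settings described above could be gathered into one structure before you create the collection. The following is a minimal sketch, not a definitive schema: the key names (`dimension`, `metric`, `service`, `provider`, `modelName`) and the metric identifiers are assumptions about the Data API, and the provider and model values are placeholders. The `check_embedding` helper is hypothetical and shows one way to catch mismatched dimensions before you insert embeddings that you generated yourself.

```python
from typing import Optional

# Assumed metric identifiers for Cosine, Dot Product, and Euclidean.
SIMILARITY_METRICS = ("cosine", "dot_product", "euclidean")


def vector_options(dimension: int, metric: str = "cosine",
                   provider: Optional[str] = None,
                   model_name: Optional[str] = None) -> dict:
    """Build the vector part of a collection definition (hypothetical helper)."""
    if dimension <= 0:
        raise ValueError("dimension must be a positive integer")
    if metric not in SIMILARITY_METRICS:
        raise ValueError(f"unknown similarity metric: {metric}")
    vector = {"dimension": dimension, "metric": metric}
    if provider is not None:
        # An embedding provider integration must be chosen at creation time.
        vector["service"] = {"provider": provider, "modelName": model_name}
    return {"vector": vector}


def check_embedding(embedding: list, options: dict) -> None:
    """Reject an embedding whose length differs from the collection's dimension."""
    expected = options["vector"]["dimension"]
    if len(embedding) != expected:
        raise ValueError(f"expected {expected} dimensions, got {len(embedding)}")


# Placeholder provider and model names; substitute the ones you actually use.
options = vector_options(1024, "cosine", provider="nvidia", model_name="NV-Embed-QA")
check_embedding([0.0] * 1024, options)  # passes silently
```

Because these settings are fixed when the collection is created, validating them up front is cheaper than recreating the collection later.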

Data manipulation in multi-region databases

For multi-region databases, the Astra Portal’s Data Explorer accesses and manipulates keyspaces, collections, tables, and data from the primary region. If you need to manage your database from a secondary region, you must use the Data API, CQL shell, or a Cassandra driver to connect directly to that region. Generally, you access a secondary region to reduce latency when the primary region is geographically distant from the caller, or to keep working when the primary region is experiencing an outage. Because multi-region databases follow an eventually consistent model, changes to data in any region are eventually replicated to the database’s other regions.

Collections

Serverless (vector) databases support both collections and tables.

Create a collection

Collection settings are permanent. If you need to change the settings after creating a collection, you must delete the collection and create a new one with the desired settings.

You can create a collection in the Astra Portal, with the Data API, or with the Astra CLI:

Use the Astra Portal

  1. In the Astra Portal, click the name of the database that you want to modify.

    Only Serverless (vector) databases support dynamic-schema collections.

  2. Click Data Explorer.

  3. In the Keyspace field, select the keyspace where you want to create the collection.

    If your database has both collections and tables, the Data Explorer lists all collections and tables under the Collections label.

  4. Click Create Collection.

  5. Enter a name for the collection.

    Collection names must follow these rules:

    • Can contain letters, numbers, and underscores

    • Cannot exceed 48 characters

    • Must be unique within the keyspace

  6. Decide whether you want this collection to support vector data:

    • If you want to store vector data in this collection, enable Vector-enabled collection.

    • If you don’t want to store vector data in this collection, disable Vector-enabled collection.

  7. For vector-enabled collections, select an Embedding generation method:

    • Bring my own: Generate embeddings outside Astra DB, and then include the embeddings when you insert data into your collection.

      You must specify the dimensions for the vectors in your dataset and select a similarity metric. You can enter custom dimensions or select from common embedding models and dimensions. The available similarity metrics are Cosine, Dot Product, and Euclidean.

    • Use an embedding provider integration: To automatically generate embeddings when you insert data, attach a vectorize embedding provider integration to your collection, and then configure the model, dimensions, and similarity metric. Available models and dimensions vary by provider.

      For qualifying databases, the Astra-hosted NVIDIA embedding provider integration is selected by default. Other providers require additional setup before you can use them with a collection. For more information, see Generate and store embeddings in Astra DB Serverless databases.

      If your preferred provider or model isn’t supported through an embedding provider integration, use the Bring my own option instead.

      You cannot attach an embedding provider integration to a collection after you create the collection. If you want to use an embedding provider integration, you must select it when you create the collection.

      If you select an embedding provider integration, you can still manually provide embeddings when you insert data. However, you must ensure that the manually provided embeddings are generated by the same model and have the same dimensions as the automatically generated embeddings.

  8. Click Create collection.
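
The naming rules in step 5 can be checked before you submit a name. The following sketch is one possible reading of those rules: it assumes that "letters" means ASCII letters, that no other characters are allowed, and that uniqueness is an exact string match. The `name_problems` helper is hypothetical.

```python
import re

# Assumption: "letters" means ASCII letters, and no other characters are allowed.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_]+")
MAX_NAME_LENGTH = 48


def name_problems(name: str, existing_names=()) -> list:
    """Return a list of rule violations; an empty list means the name looks valid."""
    problems = []
    if not NAME_PATTERN.fullmatch(name):
        problems.append("use only letters, numbers, and underscores")
    if len(name) > MAX_NAME_LENGTH:
        problems.append(f"use at most {MAX_NAME_LENGTH} characters")
    if name in existing_names:
        problems.append("name already exists in this keyspace")
    return problems


print(name_problems("product_reviews"))              # []
print(name_problems("product-reviews", ["orders"]))  # ['use only letters, numbers, and underscores']
```

The same check applies to table names, which follow the same three rules.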

Use the Astra CLI

Use the astra db create-collection command.

Use the Data API

You can use the Data API to programmatically create a collection. For more information and examples, see Create a collection.
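
As a rough sketch of what such a request might look like over HTTP, the following code builds, but does not send, a POST request with Python’s standard library. The URL path (`/api/json/v1/KEYSPACE`), the `Token` header, and the `createCollection` command body are assumptions about the Data API that you should verify against the linked reference. The endpoint, keyspace, token, and collection name are placeholders, and `build_data_api_request` is a hypothetical helper.

```python
import json
import urllib.request


def build_data_api_request(api_endpoint, keyspace, token, command):
    """Prepare a POST request for a Data API command without sending it."""
    url = f"{api_endpoint.rstrip('/')}/api/json/v1/{keyspace}"
    return urllib.request.Request(
        url,
        data=json.dumps(command).encode("utf-8"),
        headers={"Token": token, "Content-Type": "application/json"},
        method="POST",
    )


request = build_data_api_request(
    "https://DATABASE_ID-REGION.apps.astra.datastax.com",  # placeholder endpoint
    "default_keyspace",                                    # placeholder keyspace
    "APPLICATION_TOKEN",                                   # placeholder token
    {"createCollection": {"name": "products"}},
)
# To send it, pass the request to urllib.request.urlopen(request).
# That call is left out so that this sketch has no side effects.
```

In practice, a Data API client library handles these details for you; the sketch only shows which pieces of information a request needs.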

After you create a collection, insert data into the collection.

Troubleshoot collection creation

Collection limit reached or TOO_MANY_INDEXES

If you get a Collection Limit Reached or TOO_MANY_INDEXES message, you must delete a collection before you can create a new one.

Serverless (vector) databases created after June 24, 2024 can have approximately 10 collections. Databases created before this date can have approximately 5 collections. The collection limit is based on the number of indexes.

Embedding provider isn’t available when creating a collection

There are a few reasons why an embedding provider might not be listed when creating a collection:

  • You already enabled the integration: The Add embedding provider integration option only allows you to configure a new embedding provider integration for the first time. If you have already set up an embedding provider integration in your organization, you must manage it through your organization’s Integrations settings. For example, if you want to add another API key, you must do so in the Integration settings, and then create your collection afterwards.

  • Your database isn’t in the integration’s scope: Embedding provider API keys are scoped to specific databases. If you want to use the same integration in multiple databases, you must add all relevant databases to the integration’s API key’s scope in your organization’s Integration settings.

  • The embedding provider isn’t supported for automatic embedding generation: Astra only supports certain embedding providers for automatic embedding generation.

For a full list of supported providers and documentation for each integration, see Generate and store embeddings in Astra DB Serverless databases.

NVIDIA embedding provider isn’t available for a collection

The NVIDIA embedding provider integration is only available in specific regions. For more information, see Integrate NVIDIA as an embedding provider.

Delete a collection

Deleting a collection permanently deletes all data in the collection.

Use the Astra Portal

  1. In the Astra Portal, click the name of the database that you want to modify.

  2. Click Data Explorer.

  3. Click the Keyspace menu, and then select the keyspace that contains the collection you want to delete.

  4. In the Collections section, find the collection you want to delete, click More, and then click Delete collection.

  5. In the Delete collection dialog, enter the collection name, and then click Delete collection.

Use the Astra CLI

Use the astra db delete-collection command.

Use the Data API

You can use the Data API to programmatically delete a collection in a Serverless (vector) database. For more information and examples, see Drop a collection.
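
Because deletion is permanent, a script might require the same kind of confirmation that the Astra Portal dialog asks for. The following sketch assumes that the Data API command is named `deleteCollection` and takes the collection name; the helper and its confirmation argument are hypothetical.

```python
def delete_collection_command(name: str, typed_confirmation: str) -> dict:
    """Build a deleteCollection command only if the caller retyped the name."""
    if typed_confirmation != name:
        raise ValueError("confirmation does not match the collection name")
    return {"deleteCollection": {"name": name}}


command = delete_collection_command("products", typed_confirmation="products")
```

The helper only builds the command body; sending it is a separate, deliberate step.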

Tables

You can create tables in Serverless (vector) and Serverless (non-vector) databases.

Create a table

Use the Astra Portal, CQL shell, or Cassandra drivers

You can use the CQL console (embedded CQL shell) in the Astra Portal, a standalone CQL shell installation, or a Cassandra driver. For more information, see Cassandra Query Language (CQL) for Astra DB.

The following steps explain how to create a table with the CQL console:

  1. In the Astra Portal, click the name of the database that you want to modify.

  2. Note the name of the keyspace where you want to create the table.

  3. Click CQL console, and then wait for the token@cqlsh> prompt to appear.

  4. Select the keyspace that you want to create the table in:

    use KEYSPACE_NAME;

  5. Create a table:

    CREATE TABLE users (
        firstname text,
        lastname text,
        email text,
        "favorite color" text,
        PRIMARY KEY (firstname, lastname)
    ) WITH CLUSTERING ORDER BY (lastname ASC);

    Table names must follow these rules:

    • Can contain letters, numbers, and underscores

    • Cannot exceed 48 characters

    • Must be unique within the keyspace
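
If you generate CQL from application code, for example when using a Cassandra driver, a small helper can assemble the statement and quote identifiers such as "favorite color" that contain spaces or uppercase letters. The following is a minimal sketch with hypothetical helper names. It does not detect reserved CQL keywords, which also need quoting.

```python
import re

# Plain lowercase names can be written without quotes.
_UNQUOTED = re.compile(r"[a-z][a-z0-9_]*")


def quote_identifier(name: str) -> str:
    """Double-quote identifiers that are not plain lowercase names."""
    if _UNQUOTED.fullmatch(name):
        return name
    return '"' + name.replace('"', '""') + '"'


def create_table_cql(table, columns, partition_key, clustering=()):
    """Build a CREATE TABLE statement from a column mapping and key definition."""
    lines = [f"    {quote_identifier(n)} {t}," for n, t in columns.items()]
    partition = "(" + ", ".join(quote_identifier(c) for c in partition_key) + ")"
    key_parts = [partition] + [quote_identifier(c) for c, _ in clustering]
    lines.append(f"    PRIMARY KEY ({', '.join(key_parts)})")
    statement = f"CREATE TABLE {quote_identifier(table)} (\n" + "\n".join(lines) + "\n)"
    if clustering:
        for _, direction in clustering:
            if direction not in ("ASC", "DESC"):
                raise ValueError(f"invalid clustering order: {direction}")
        order = ", ".join(f"{quote_identifier(c)} {d}" for c, d in clustering)
        statement += f" WITH CLUSTERING ORDER BY ({order})"
    return statement + ";"


print(create_table_cql(
    "users",
    {"firstname": "text", "lastname": "text", "email": "text", "favorite color": "text"},
    partition_key=["firstname"],
    clustering=[("lastname", "ASC")],
))
```

The generated statement writes the primary key as `((firstname), lastname)`, which names the same partition key and clustering column as the `PRIMARY KEY (firstname, lastname)` form in step 5.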

Use the Data API

You can use the Data API to programmatically create a table in a Serverless (vector) database. For more information and examples, see Create a table.
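
For comparison with the CQL statement, the following sketch builds a JSON command body for a similar table. The command name `createTable` and the `definition`, `columns`, `primaryKey`, `partitionBy`, and `partitionSort` keys are assumptions about the Data API, as is the convention that `1` means ascending order; verify them in the linked reference. The helper name is hypothetical.

```python
def create_table_command(name, columns, partition_by, partition_sort=None):
    """Build a createTable command body from column types and key columns."""
    sort = dict(partition_sort or {})
    missing = [c for c in list(partition_by) + list(sort) if c not in columns]
    if missing:
        raise ValueError(f"primary key columns not defined: {missing}")
    primary_key = {"partitionBy": list(partition_by)}
    if sort:
        # Assumed convention: 1 sorts ascending, -1 sorts descending.
        primary_key["partitionSort"] = sort
    return {
        "createTable": {
            "name": name,
            "definition": {
                "columns": {column: {"type": kind} for column, kind in columns.items()},
                "primaryKey": primary_key,
            },
        }
    }


command = create_table_command(
    "users",
    {"firstname": "text", "lastname": "text", "email": "text"},
    partition_by=["firstname"],
    partition_sort={"lastname": 1},
)
```

The partition and sort columns play the same roles as the partition key and clustering column in the CQL example.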

After you create a table, insert data into the table.

Delete a table

Deleting a table permanently deletes all data in the table.

Use the Astra Portal, CQL shell, or Cassandra drivers

You can use the CQL console (embedded CQL shell) in the Astra Portal, a standalone CQL shell installation, or a Cassandra driver. For more information, see Cassandra Query Language (CQL) for Astra DB.

The following steps explain how to delete a table with the CQL console:

  1. In the Astra Portal, click the name of the database that you want to modify.

  2. Note the name of the keyspace that contains the table you want to delete.

  3. Click CQL console, and then wait for the token@cqlsh> prompt to appear.

  4. Select the keyspace that contains the table you want to delete:

    use KEYSPACE_NAME;

  5. Get a list of all tables in the keyspace:

    desc tables;

  6. Delete the table and all of its data:

    drop table TABLE_NAME;

Use the Astra CLI

Use the astra db delete-table command.

Use the Data API

You can use the Data API to programmatically delete a table in a Serverless (vector) database. For more information and examples, see Drop a table.

