Manage collections and tables

Collections and tables are containers for data within keyspaces in a database.

Whether you use a collection or table depends on your database type, your data’s schema type, and how strictly you want to enforce the schema:

- Collections: Collections use dynamic schemas and store data in documents. With a dynamic schema, each document can have different fields. Collections are best for semi-structured data.

  You can create collections only in Serverless (Vector) databases.

- Tables: Tables use fixed schemas and store data in rows. With a fixed schema, all rows must have the same columns, and every column must have a value, which can be null. Tables are best for structured data.

  You can create tables in Serverless (Non-Vector) and Serverless (Vector) databases.

Prerequisites

- An Astra DB Serverless database with a keyspace.
- A role with table permissions, such as the Database Administrator role.

  Table permissions apply to both collections and tables.

To create collections and tables with the Data API, you need an application token with an appropriately scoped role.

Consider your data model

An optimal data model isn’t necessary for small tests, but it is important for production applications and for the development and testing environments that support them.

Before you create collections and tables for production applications, take time to prepare an effective data model for your application’s needs. Consider the following:

- Data types and schemas that you want to use.

  For example, if you want to enforce the schema, then you must use tables.

- Which data needs to be indexed.

  For example, collections index all fields by default. If you want to apply selective indexing, you must create your collection with the Data API and define the indexing clause.

- How you want to query the data.

  For example, if you want to use Astra DB’s built-in vector search capabilities, then you must store your data in a Serverless (Vector) database.
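As a rough illustration of the selective indexing point above, the following Python sketch builds the JSON body for a Data API `createCollection` command with an indexing clause. The collection name and field names are hypothetical, and the `indexing` shape with `allow` or `deny` lists reflects the Data API's documented command format as best understood here, so check the Data API reference before relying on it.

```python
import json

def create_collection_command(name, deny=None, allow=None):
    """Build a createCollection command body with an optional indexing clause.

    Assumption: the indexing clause takes either an "allow" list or a
    "deny" list of field names, but not both.
    """
    if allow and deny:
        raise ValueError("Specify either allow or deny, not both")
    options = {}
    if allow:
        options["indexing"] = {"allow": list(allow)}
    elif deny:
        options["indexing"] = {"deny": list(deny)}
    command = {"createCollection": {"name": name}}
    if options:
        command["createCollection"]["options"] = options
    return command

# Hypothetical collection that skips indexing for two bulky fields.
payload = create_collection_command("products", deny=["raw_html", "internal_notes"])
print(json.dumps(payload, indent=2))
```

Because collection settings are fixed at creation time, building and reviewing the command body before you send it is a cheap way to catch mistakes.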

Understand vector data settings

Both collections and tables can store vector and non-vector data, and it is a common practice to store vector data alongside non-vector metadata. However, if you want to use Astra DB’s built-in vector search capabilities, then you must store your data in a Serverless (Vector) database.

For tables, you can create, modify, and drop vector columns and indexes at any time.

However, for collections, you must configure vector-related settings when you create the collection. This includes the following:

- Support for vector data, also known as a vector-enabled collection
- The number of dimensions and the similarity metric for the vectors in your dataset
- An embedding provider integration, if you plan to use one now or might use one in the future
- Support for hybrid search
- Indexing

For vector-enabled collections, you decide how to provide embeddings:

- Generate embeddings outside Astra, and then load the embeddings when you insert data.
- Use an embedding provider integration to automatically generate embeddings.
- Use both options.
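The vector-related settings described above can be sketched as the `vector` portion of a collection's options. This Python example is a minimal sketch: the key names (`dimension`, `metric`, `service`, `provider`, `modelName`) and the metric identifiers are assumptions based on the Data API's JSON format, and the provider and model values are placeholders, so verify them against the Data API reference and your provider's documentation.

```python
from typing import Optional

# Assumed identifiers for the Cosine, Dot Product, and Euclidean metrics.
VALID_METRICS = {"cosine", "dot_product", "euclidean"}

def vector_options(dimension: int, metric: str = "cosine",
                   provider: Optional[str] = None,
                   model_name: Optional[str] = None) -> dict:
    """Return the vector options for a vector-enabled collection."""
    if dimension <= 0:
        raise ValueError("dimension must be a positive integer")
    if metric not in VALID_METRICS:
        raise ValueError(f"metric must be one of {sorted(VALID_METRICS)}")
    vector = {"dimension": dimension, "metric": metric}
    if provider is not None:
        if model_name is None:
            raise ValueError("model_name is required when provider is set")
        # Attach an embedding provider integration at creation time.
        vector["service"] = {"provider": provider, "modelName": model_name}
    return {"vector": vector}

# Bring your own embeddings: only dimension and metric are set.
bring_my_own = vector_options(1536, "dot_product")

# Embedding provider integration: placeholder provider and model names.
integrated = vector_options(1024, "cosine", provider="nvidia", model_name="NV-Embed-QA")

print(bring_my_own)
print(integrated)
```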

Data manipulation in multi-region databases

For multi-region databases, the Astra Portal’s Data Explorer accesses and manipulates keyspaces, collections, tables, and data from the primary region. If you need to manage your database from a secondary region, you must use the Data API or CQL. Generally, you access a secondary region to reduce latency when the primary region is geographically distant from the caller, or to maintain access when the primary region is experiencing an outage. Because multi-region databases follow an eventually consistent model, changes to data in any region are eventually replicated to the database’s other regions.

Collections

Collections are only available in Serverless (Vector) databases. For Serverless (Non-Vector) databases, see Create a table.

Create a collection

Collection settings are permanent. If you need to change the settings after creating a collection, you must delete the collection and create a new one with the desired settings.

You can create a collection in the Astra Portal or with the Data API.

While Serverless (Vector) databases can have both collections and tables, the Astra Portal’s Data Explorer only supports collection creation, and it lists all collections and tables under the Collections label.

To create tables in a Serverless (Vector) database, you must use CQL or the Data API.

Astra Portal

1. In the Astra Portal, click the name of your Serverless (Vector) database.
2. Click Data Explorer.
3. In the Keyspace field, select the keyspace where you want to create the collection, or use default_keyspace.
4. Click Create Collection.
5. In the Create collection dialog, enter a name for the collection.

   Rules for collection names:
   - Can contain letters, numbers, and underscores
   - Cannot exceed 48 characters
   - Must be unique within the keyspace
6. Decide whether you want this collection to support vector data:
   - If you want to store vector data in this collection, enable Vector-enabled collection.
   - If you don’t want to store vector data in this collection, disable Vector-enabled collection.
7. For vector-enabled collections, select an Embedding generation method:
   - Bring my own: Select this option if you only want to generate your own embeddings and import them when you insert data into your collection. Then, enter the number of dimensions for the vectors in your dataset, and select a similarity metric. You can enter custom dimensions or select from common embedding models and dimensions. The available similarity metrics are Cosine, Dot Product, and Euclidean.
   - Use an embedding provider integration: If you want to automatically generate embeddings when you insert data, attach an embedding provider integration to your collection, and then configure the model, dimensions, and similarity metric. Available models and dimensions vary by provider.

     For applicable databases, the built-in NVIDIA embedding provider integration is selected by default. Other providers require additional setup before you can use them with a collection. For more information, see Auto-generate embeddings with vectorize.

     You cannot attach an embedding provider integration to a collection after you create the collection. If you want to use an embedding provider integration, you must enable it when you create the collection.

     You can manually provide embeddings even if the collection has a vectorize integration. However, you must ensure that the manually provided embeddings have the same dimensions and are generated by the same model as the automatically generated embeddings.
8. Click Create collection.
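To make the dimension and similarity metric choices more concrete, the following Python sketch computes the textbook forms of the three metrics and checks that a vector matches an expected dimension. These are the standard mathematical definitions only. Astra DB may scale or normalize the similarity scores it returns, so treat this as an aid to intuition rather than a reproduction of the database's scoring.

```python
import math
from typing import Sequence

def check_dimension(vector: Sequence[float], expected: int) -> None:
    """Raise if a vector does not match the collection's configured dimension."""
    if len(vector) != expected:
        raise ValueError(f"expected {expected} dimensions, got {len(vector)}")

def dot_product(a: Sequence[float], b: Sequence[float]) -> float:
    check_dimension(b, len(a))
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot_product(a, b) / (norm_a * norm_b)

def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    check_dimension(b, len(a))
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical 3-dimensional embeddings.
query = [1.0, 2.0, 2.0]
document = [2.0, 4.0, 4.0]

print("dot product:", dot_product(query, document))
print("cosine similarity:", cosine_similarity(query, document))
print("euclidean distance:", euclidean_distance(query, document))
```

In this example the two vectors point in the same direction but differ in length, so the cosine similarity ignores the difference while the dot product and Euclidean distance do not. This is one reason the metric should match how your embedding model was trained.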

Data API

You can use the Data API to programmatically create a collection.

For more information and examples, see the Data API reference for creating a collection.
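As a hedged sketch of what a programmatic request might look like, the following Python example builds, but does not send, an HTTP request for a `createCollection` command using only the standard library. The endpoint format, the `/api/json/v1/KEYSPACE` path, the `Token` header, and the command body are assumptions based on the Data API's HTTP conventions. The endpoint, token, and collection name are placeholders. Confirm all of these against the Data API reference.

```python
import json
import urllib.request

# Placeholders: replace with your database's values.
API_ENDPOINT = "https://DATABASE_ID-REGION.apps.astra.datastax.com"
KEYSPACE = "default_keyspace"
APPLICATION_TOKEN = "APPLICATION_TOKEN"

def build_request(api_endpoint: str, keyspace: str, token: str,
                  command: dict) -> urllib.request.Request:
    """Build a POST request that carries one Data API command as JSON."""
    url = f"{api_endpoint.rstrip('/')}/api/json/v1/{keyspace}"
    body = json.dumps(command).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Token": token, "Content-Type": "application/json"},
    )

# Hypothetical vector-enabled collection for bring-your-own embeddings.
command = {
    "createCollection": {
        "name": "articles",
        "options": {"vector": {"dimension": 1536, "metric": "cosine"}},
    }
}

request = build_request(API_ENDPOINT, KEYSPACE, APPLICATION_TOKEN, command)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To send the request after replacing the placeholders:
# urllib.request.urlopen(request)
```

Keeping the request construction separate from sending it makes it easier to inspect the URL and body first, which matters because collection settings can't be changed after creation.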

After you create a collection, insert data into the collection.

Troubleshoot collection creation

Collection limit reached or TOO_MANY_INDEXES

If you get a Collection Limit Reached or TOO_MANY_INDEXES message, you must delete a collection before you can create a new one.

Serverless (Vector) databases created after June 24, 2024 can have approximately 10 collections. Databases created before this date can have approximately 5 collections. The collection limit is based on the number of indexes.

Embedding provider isn’t available when creating a collection

There are a few reasons why an embedding provider might not be listed when creating a collection:

- You already enabled the integration: The Add embedding provider integration option only allows you to configure a new embedding provider integration for the first time. If you have already set up an embedding provider integration in your organization, you must manage it through your organization’s Integrations settings. For example, if you want to add another API key, you must do so in the Integrations settings, and then create your collection afterwards.
- Your database isn’t in the integration’s scope: Embedding provider API keys are scoped to specific databases. If you want to use the same integration in multiple databases, you must add all relevant databases to the scope of the integration’s API key in your organization’s Integrations settings.
- The embedding provider isn’t supported for automatic embedding generation: Astra DB only supports certain embedding providers for automatic embedding generation.

For a full list of supported providers and documentation for each integration, see Auto-generate embeddings with vectorize.

NVIDIA embedding provider isn’t available for a collection

The NVIDIA embedding provider integration is only available in specific regions. For more information, see Integrate NVIDIA as an embedding provider.

Delete a collection

Deleting a collection permanently deletes all data in the collection.

Astra Portal

1. In the Astra Portal, click the name of your Serverless (Vector) database.
2. Click Data Explorer.
3. In the Keyspace field, select the keyspace that contains the collection you want to delete.
4. In the Collections section, find the collection you want to delete, click More, and then click Delete collection.
5. In the Delete collection dialog, enter the collection name, and then click Delete collection.

The collection and all of its data are permanently deleted.

Data API

You can use the Data API to programmatically delete a collection in a Serverless (Vector) database.

For more information and examples, see the Data API reference for deleting a collection.
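Because deletion is permanent, a script that deletes collections can benefit from the same type-the-name confirmation that the Astra Portal dialog uses. The following Python sketch builds a `deleteCollection` command body only when the confirmation matches. The command shape is an assumption based on the Data API's JSON format, and the helper and collection name are hypothetical.

```python
def delete_collection_command(name: str, confirm_name: str) -> dict:
    """Build a deleteCollection command body after a name confirmation.

    The confirmation step is a safety convention for this sketch,
    not a requirement of the Data API.
    """
    if name != confirm_name:
        raise ValueError("confirmation does not match the collection name")
    return {"deleteCollection": {"name": name}}

# Hypothetical collection name, confirmed by repeating it.
command = delete_collection_command("articles", confirm_name="articles")
print(command)
```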

Tables

To create tables in Serverless (Vector) databases, you must use CQL or the Data API.

For Serverless (Non-Vector) databases, you must use CQL.

Create a table

You can create tables in Serverless (Non-Vector) and Serverless (Vector) databases.

Astra Portal (cqlsh)

You can use the built-in cqlsh in the Astra Portal, standalone cqlsh, or a driver to manage tables. For information about cqlsh and drivers, see Cassandra Query Language (CQL) for Astra DB.

To use the cqlsh in the Astra Portal to create a table, do the following:

1. In the Astra Portal, click the name of the database where you want to create a table.
2. Note the name of the keyspace where you want to create the table.
3. Click CQL Console, and then wait for the token@cqlsh> prompt to appear.
4. Select the keyspace that you want to create the table in:
```
use KEYSPACE_NAME;
```
5. Create a table:
```
CREATE TABLE users (
    firstname text,
    lastname text,
    email text,
    -- Column names that contain spaces must be double-quoted.
    "favorite color" text,
    -- firstname is the partition key; lastname is a clustering column.
    PRIMARY KEY (firstname, lastname)
) WITH CLUSTERING ORDER BY (lastname ASC);
```

Rules for table names:

- Can contain letters, numbers, and underscores
- Cannot exceed 48 characters
- Must be unique within the keyspace
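The naming rules for tables match the rules listed for collections, so one check can cover both. The following Python sketch is a hypothetical helper that reports which rules a proposed name violates. It assumes that "letters" means ASCII letters and that an empty name is invalid, neither of which is stated explicitly in the rules.

```python
import re
from typing import Iterable, List

MAX_NAME_LENGTH = 48
# Assumption: only ASCII letters, digits, and underscores are allowed.
ALLOWED_CHARACTERS = re.compile(r"[A-Za-z0-9_]+")

def name_problems(name: str, existing_names: Iterable[str] = ()) -> List[str]:
    """Return a list of naming-rule violations; an empty list means the name passes."""
    problems = []
    if not name:
        problems.append("name is empty")
    elif not ALLOWED_CHARACTERS.fullmatch(name):
        problems.append("name contains characters other than letters, numbers, and underscores")
    if len(name) > MAX_NAME_LENGTH:
        problems.append(f"name exceeds {MAX_NAME_LENGTH} characters")
    if name in set(existing_names):
        problems.append("name is already used in this keyspace")
    return problems

# Hypothetical names checked against a hypothetical keyspace listing.
existing = ["users", "orders"]
for candidate in ["user_profiles", "user profiles", "users", "x" * 49]:
    print(repr(candidate[:20]), name_problems(candidate, existing))
```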

Data API

You can use the Data API to programmatically create a table in a Serverless (Vector) database.

For more information and examples, see the Data API reference for creating a table.
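For comparison with the CQL approach, the following Python sketch builds a `createTable` command body for a similar `users` table. The `definition`, `columns`, `primaryKey`, `partitionBy`, and `partitionSort` keys, and the use of `1` for ascending order, are assumptions based on the Data API's JSON format. The column is named `favorite_color` here because it is not confirmed whether the Data API accepts column names with spaces. Verify all of this against the Data API reference.

```python
import json

def create_table_command(name, columns, partition_by, partition_sort=None):
    """Build a createTable command body.

    columns: mapping of column name to type name, such as "text".
    partition_by: list of partition key column names.
    partition_sort: optional mapping of clustering column name to
        1 (ascending) or -1 (descending); assumed convention.
    """
    missing = [c for c in list(partition_by) + list(partition_sort or {}) if c not in columns]
    if missing:
        raise ValueError(f"primary key columns not defined: {missing}")
    primary_key = {"partitionBy": list(partition_by)}
    if partition_sort:
        primary_key["partitionSort"] = dict(partition_sort)
    return {
        "createTable": {
            "name": name,
            "definition": {
                "columns": {col: {"type": col_type} for col, col_type in columns.items()},
                "primaryKey": primary_key,
            },
        }
    }

table_command = create_table_command(
    "users",
    columns={
        "firstname": "text",
        "lastname": "text",
        "email": "text",
        "favorite_color": "text",
    },
    partition_by=["firstname"],
    partition_sort={"lastname": 1},
)
print(json.dumps(table_command, indent=2))
```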

After you create a table, insert data into the table.

Delete a table

Deleting a table permanently deletes all data in the table.

Astra Portal (cqlsh)

You can use the built-in cqlsh in the Astra Portal, standalone cqlsh, or a driver to manage tables. For information about cqlsh and drivers, see Cassandra Query Language (CQL) for Astra DB.

To use the cqlsh in the Astra Portal to delete a table, do the following:

1. In the Astra Portal, click the name of your database.
2. Note the name of the keyspace that contains the table you want to delete.
3. Click CQL Console, and then wait for the token@cqlsh> prompt to appear.
4. Select the keyspace that contains the table you want to delete:
```
use KEYSPACE_NAME;
```
5. Get a list of all tables in the keyspace:
```
desc tables;
```
6. Delete the table and all of its data:
```
drop table TABLE_NAME;
```

The table and its data are deleted.

Data API

You can use the Data API to programmatically delete a table in a Serverless (Vector) database.

For more information and examples, see the Data API reference for deleting a table.
