Manage collections and tables
Collections store semi-structured data, in the form of documents, in Serverless (Vector) databases.
Tables store structured data, in the form of rows, in Serverless (Vector) and Serverless (Non-Vector) databases.
Collections and tables in Serverless (Vector) databases can store both vector and non-vector data, if the data is relevant. Consider the needs of your application, and then decide how to segregate your data into separate collections, tables, keyspaces, and databases.
Collections
To manage collections, you must have the appropriate permissions, such as the Database Administrator role. To programmatically manage collections, you need an application token with sufficient permissions.
Create a collection
When you create a collection, you decide if the collection can store structured vector data. This is known as a vector-enabled collection. For vector-enabled collections, you also decide how to provide embeddings. You can bring your own embeddings, automatically generate embeddings with vectorize, or both. You must decide which options you need when you create the collection. For more information, see Vector and vectorize.
You can create a collection in the Astra Portal or with the Data API.
For multi-region databases, the Astra Portal’s Data Explorer accesses and manipulates keyspaces, collections, tables, and data from the primary region. If you need to manage your database from a secondary region, you must use the Data API or CQL shell. Generally, you access a secondary region to reduce latency when the primary region is geographically distant from the caller, or to keep working when the primary region is experiencing an outage. Because multi-region databases follow an eventually consistent model, changes to data in any region are eventually replicated to the database’s other regions.
-
Astra Portal
-
Data API
-
In the Astra Portal, go to Databases, and then select your Serverless (Vector) database.
-
Click Data Explorer.
-
In the Keyspace field, select the keyspace where you want to create the collection or use default_keyspace.
-
Click Create Collection.
-
In the Create collection dialog, enter a name for the collection. Collection names can be up to 50 characters long and can contain letters, numbers, and underscores.
-
Enable support for vector data, if needed:
-
Enable Vector-enabled collection if you want to store vector data in this collection.
-
Disable Vector-enabled collection if you don’t want to store vector data in this collection.
-
For vector-enabled collections, select an Embedding generation method:
-
Bring my own: Select this option if you want to generate your own embeddings and import them when you load data into your collection. You must also enter the number of Dimensions for the vectors in your dataset, and you must select a Similarity metric that your embedding model will use to compare vectors. You can enter custom dimensions or select from common embedding models and dimensions. The available similarity metrics are Cosine, Dot Product, and Euclidean.
-
Use an embedding provider integration: To automatically generate embeddings when you load data, attach an embedding provider integration to your collection. For applicable databases, the built-in NVIDIA embedding provider integration is selected by default. Other providers require additional setup before you can use them with a collection. Available models and dimensions vary by provider. For more information, a full list of supported providers, and links to instructions for each provider, see Auto-generate embeddings with vectorize.
You can’t attach a vectorize integration to a collection after you create the collection. If you want to use vectorize, you must enable it when you create the collection.
You can manually provide embeddings even if the collection has a vectorize integration. However, you must ensure that the manually provided embeddings have the same dimensions as the automatically generated embeddings and come from the same model.
-
Click Create collection.
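To make the three similarity metrics concrete, the following sketch computes each one for small vectors in plain Python. The function names are illustrative and aren't part of Astra DB, and the database might scale or normalize its similarity scores differently, so treat these as the underlying formulas only. The dimension check illustrates why manually provided embeddings must match the collection's configured dimensions.

```python
import math


def dot_product(a, b):
    """Sum of element-wise products of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Dot product divided by the product of the vector lengths."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    return dot_product(a, b) / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def check_dimension(vector, expected_dimension):
    """Raise if a vector doesn't match the collection's dimension (hypothetical helper)."""
    if len(vector) != expected_dimension:
        raise ValueError(
            f"Expected {expected_dimension} dimensions, got {len(vector)}"
        )


query = [1.0, 0.0]
candidate = [2.0, 0.0]
check_dimension(query, 2)
print(dot_product(query, candidate))
print(cosine_similarity(query, candidate))
print(euclidean_distance(query, candidate))
```

Cosine compares direction only, so the two vectors above count as identical in direction even though their lengths differ, while dot product and Euclidean distance are both sensitive to magnitude.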
You can use the Data API to programmatically create a collection.
For more information and examples, see the Data API reference for creating a collection and the documentation for your embedding provider integration.
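As a rough illustration, the sketch below builds a createCollection request body and shows how it might be sent over HTTP with the Python standard library. The URL path (`/api/json/v1/KEYSPACE`), the `Token` header, and the `options.vector` field names reflect our understanding of the Data API and are assumptions here, as are the helper names and the placeholder collection name. Confirm the exact request shape in the Data API reference before relying on it.

```python
import json
import re
import urllib.request

# Naming rule taken from the Astra Portal dialog; assumed to apply here as well.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_]{1,50}")


def build_create_collection(name, dimension=None, metric="cosine", service=None):
    """Return a createCollection command body as a dict (hypothetical helper)."""
    if not NAME_PATTERN.fullmatch(name):
        raise ValueError(f"Invalid collection name: {name!r}")
    command = {"name": name}
    # Only vector-enabled collections carry vector options.
    if dimension is not None or service is not None:
        vector = {"metric": metric}
        if dimension is not None:
            vector["dimension"] = dimension
        if service is not None:
            # Assumed shape for a vectorize integration, for example
            # {"provider": "...", "modelName": "..."}.
            vector["service"] = service
        command["options"] = {"vector": vector}
    return {"createCollection": command}


def send_command(api_endpoint, keyspace, token, payload):
    """POST a command to the keyspace-level endpoint (assumed URL layout)."""
    url = f"{api_endpoint}/api/json/v1/{keyspace}"
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Token": token, "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Build (but don't send) a body for a bring-your-own-embeddings collection.
payload = build_create_collection("my_collection", dimension=1024, metric="cosine")
print(json.dumps(payload, indent=2))
```

Because the vector options are fixed at creation time, it can help to build and review the request body before sending it.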
Serverless (Vector) databases created after June 24, 2024 can have approximately 10 collections. Databases created before this date can have approximately 5 collections. The collection limit is based on the number of indexes.
After you create a collection, load data into the collection.
Delete a collection
Deleting a collection permanently deletes all data in the collection.
-
Astra Portal
-
Data API
-
In the Astra Portal, go to Databases, and then select your Serverless (Vector) database.
-
Click Data Explorer.
-
In the Keyspace field, select the keyspace that contains the collection you want to delete.
-
In the Collections section, locate the collection you want to delete, click More, and then click Delete collection.
-
In the Delete collection dialog, enter the collection name, and then click Delete collection.
The collection and all of its data are permanently deleted.
You can use the Data API to programmatically delete a collection.
For more information and examples, see the Data API reference for deleting a collection.
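The following sketch shows what a deleteCollection request body might look like. Because the operation is irreversible, the hypothetical `confirm_and_build` helper imitates the Astra Portal's type-the-name confirmation before producing the body. The command name and field layout reflect our understanding of the Data API and should be checked against the reference. The body would be sent as an authenticated POST to the keyspace that contains the collection.

```python
import json


def build_delete_collection(name):
    """Return a deleteCollection command body as a dict (hypothetical helper)."""
    return {"deleteCollection": {"name": name}}


def confirm_and_build(name, typed_confirmation):
    """Build the body only if the caller retyped the collection name exactly."""
    if typed_confirmation != name:
        raise ValueError("Confirmation doesn't match the collection name")
    return build_delete_collection(name)


payload = confirm_and_build("my_collection", "my_collection")
print(json.dumps(payload))
```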
Tables
To manage tables, you must have the appropriate permissions, such as the Database Administrator role. To manage tables programmatically, you need an application token with sufficient permissions.
You can’t use the Data Explorer in the Astra Portal to create or manage tables in Serverless (Vector) databases. For Serverless (Non-Vector) databases, you must use the CQL shell.
Create a table
You can create tables in Serverless (Non-Vector) and Serverless (Vector) databases.
-
Astra Portal (cqlsh)
-
Data API
You can use the built-in CQL shell (cqlsh) in the Astra Portal, the standalone CQL shell, or a driver to manage tables.
For information about CQL shell and drivers, see Cassandra Query Language (CQL) for Astra DB.
To use the CQL shell in the Astra Portal to create a table, do the following:
-
In the Astra Portal navigation menu, select your database.
-
Note the name of the keyspace where you want to create the table.
-
Click CQL Console, and then wait for the token@cqlsh> prompt to appear.
-
Select the keyspace that you want to create the table in:
use KEYSPACE_NAME;
-
Create a table:
CREATE TABLE users (
  firstname text,
  lastname text,
  email text,
  "favorite color" text,
  PRIMARY KEY (firstname, lastname)
) WITH CLUSTERING ORDER BY (lastname ASC);
You can use the Data API to programmatically create a table.
For more information and examples, see the Data API reference for creating a table.
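For comparison with the CQL statement, the sketch below builds a createTable request body for a similar users table. The `definition`, `columns`, `primaryKey`, `partitionBy`, and `partitionSort` field names reflect our understanding of the Data API and are assumptions, as is the helper name. The column `favorite_color` is a hypothetical stand-in for the quoted `"favorite color"` column, because we aren't certain how the Data API treats column names that contain spaces.

```python
import json


def build_create_table(name, columns, partition_by, partition_sort=None):
    """Return a createTable command body as a dict (hypothetical helper).

    columns maps column names to type names; partition_sort maps clustering
    columns to 1 (ascending) or -1 (descending).
    """
    key_columns = list(partition_by) + list(partition_sort or {})
    missing = [column for column in key_columns if column not in columns]
    if missing:
        raise ValueError(f"Primary key columns not defined: {missing}")
    primary_key = {"partitionBy": list(partition_by)}
    if partition_sort:
        primary_key["partitionSort"] = dict(partition_sort)
    return {
        "createTable": {
            "name": name,
            "definition": {
                "columns": {
                    column: {"type": type_name}
                    for column, type_name in columns.items()
                },
                "primaryKey": primary_key,
            },
        }
    }


payload = build_create_table(
    "users",
    {
        "firstname": "text",
        "lastname": "text",
        "email": "text",
        "favorite_color": "text",  # hypothetical rename, see the note above
    },
    partition_by=["firstname"],
    partition_sort={"lastname": 1},  # ascending, like CLUSTERING ORDER BY (lastname ASC)
)
print(json.dumps(payload, indent=2))
```

Here `firstname` plays the role of the partition key and `lastname` the clustering column, matching PRIMARY KEY (firstname, lastname) in the CQL example.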
After you create a table, load data into the table.
Delete a table
Deleting a table permanently deletes all data in the table.
-
Astra Portal (cqlsh)
-
Data API
You can use the built-in CQL shell (cqlsh) in the Astra Portal, the standalone CQL shell, or a driver to manage tables.
For information about CQL shell and drivers, see Cassandra Query Language (CQL) for Astra DB.
To use the CQL shell in the Astra Portal to delete a table, do the following:
-
In the Astra Portal, go to Databases, and then select your database.
-
Note the name of the keyspace that contains the table you want to delete.
-
Click CQL Console, and then wait for the token@cqlsh> prompt to appear.
-
Select the keyspace that contains the table you want to delete:
use KEYSPACE_NAME;
-
Get a list of all tables in the keyspace:
desc tables;
-
Delete the table and all of its data:
drop table TABLE_NAME;
The table and its data are deleted.
You can use the Data API to programmatically delete a table.
For more information and examples, see the Data API reference for deleting a table.
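A dropTable request body might look like the following sketch. The command name and the optional `ifExists` flag reflect our understanding of the Data API and are assumptions, as is the helper name. Verify both in the reference before use. As with the CQL statement, dropping a table removes all of its data, so the body would typically be sent only after you confirm the keyspace and table name.

```python
import json


def build_drop_table(name, if_exists=False):
    """Return a dropTable command body as a dict (hypothetical helper)."""
    command = {"name": name}
    if if_exists:
        # Assumed option: avoid an error when the table is already gone.
        command["options"] = {"ifExists": True}
    return {"dropTable": command}


payload = build_drop_table("users", if_exists=True)
print(json.dumps(payload))
```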