Manage collections and tables

A collection stores structured vector data. A table stores structured non-vector data. Use collections with a Serverless (Vector) database and tables with a Serverless (Non-Vector) database.

Create a collection

Before you can load vector data, you must have an existing collection.

In the Astra Portal, you cannot choose a specific region when you create a collection or load data. You must use the initial region you selected when you created the database.

Here’s how to create an empty collection:

  • Astra Portal

  • Python

  • TypeScript

  • Java

Use the Astra Portal to create a collection.

  1. In the Astra Portal, go to Databases, and then select your Serverless (Vector) database.

  2. Click Data Explorer.

    To create a sample collection with a pre-loaded vector dataset, see Load a sample vector dataset.

  3. Optional: Use the Namespace dropdown to select the namespace where you want to create the collection. Otherwise, leave default_keyspace selected to create the collection in the default namespace.

  4. Click Create Collection.

  5. In the Create collection dialog, enter a name for the new collection in the Collection name field.

  6. Optional: Turn on Vector-enabled collection.

    Next, select an Embedding generation method. The remaining settings depend on the method you choose:

    • Bring my own

    • Use an external provider

    • Use an Astra-hosted provider

    Bring my own:

    1. Select the Bring my own method if you plan to generate your own embeddings and import them when you add data to your collection.

    2. Enter the number of Dimensions of the vectors in your dataset. Clicking this field reveals a list of common embedding models and their dimensions. You can also enter a custom dimension.

    3. Select the Similarity metric to use when comparing vectors from the metrics available in the dialog.

    Use an external provider:

    1. Select an external embedding provider that is enabled and scoped to your database. For example, the OpenAI integration appears as a selectable option in the Embedding generation method dropdown.

      Astra DB will use the embedding provider to automatically generate embeddings as needed.

    2. Select the API key that you want to use for your collection. This dropdown menu is only active if you’ve scoped your database to multiple API keys within the same integration.

    3. Select the Embedding model that you want to use to generate embeddings.

    4. If your embedding model supports a range of dimensions, enter the number of Dimensions that you want the generated vectors to have.

    5. Select the Similarity metric to use when comparing vectors from the metrics available in the dialog.

    Use an Astra-hosted provider:

    1. Select an Astra-hosted embedding provider. For example, the NVIDIA NeMo integration appears as a selectable option in the Embedding generation method dropdown.

      Astra DB will use the embedding provider to automatically generate embeddings whenever you insert or update data in your collection.

    2. Select the Embedding model that you want to use to generate embeddings. This dropdown menu is only active if more than one embedding model is available for the selected provider.

    3. If your embedding model supports a range of dimensions, enter the number of Dimensions that you want the generated vectors to have.

    4. Select the Similarity metric to use when comparing vectors from the metrics available in the dialog.

    If you turn off Vector-enabled collection, the resulting collection is not vector-enabled. You cannot add vector data to a non-vector collection.

  7. Click Create collection.

    If you get a Collection Limit Reached message, you’ll need to delete a collection before you can create a new one.

An empty collection appears in the list of collections. You can now load data into this collection.
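To make the Dimensions and Similarity metric settings more concrete, here is a minimal Python sketch that is independent of Astra DB. It compares 5-dimensional vectors with three common similarity measures. Cosine is the default named on this page. Dot product and Euclidean distance are included as assumptions for illustration, not as the definitive list of metrics the dialog offers. All function names are hypothetical.

```python
import math


def dot_product(a, b):
    """Sum of element-wise products; both vectors need the same dimension."""
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} != {len(b)}")
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine of the angle between a and b, ignoring their lengths."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot_product(a, b) / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between a and b."""
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} != {len(b)}")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Illustrative 5-dimensional vectors.
query = [1.0, 0.0, 0.0, 0.0, 0.0]
same_direction = [2.0, 0.0, 0.0, 0.0, 0.0]
orthogonal = [0.0, 1.0, 0.0, 0.0, 0.0]

for label, vector in [("same direction", same_direction), ("orthogonal", orthogonal)]:
    print(
        label,
        cosine_similarity(query, vector),
        dot_product(query, vector),
        euclidean_distance(query, vector),
    )
```

The two vectors that point the same way score 1.0 under cosine even though they are 1.0 apart by Euclidean distance, which is why the metric should match how your embedding model expects vectors to be compared. The dimension check also shows why every vector in a collection must have the dimension the collection was created with.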

Use the Python client to create a collection. The syntax depends on whether you’re bringing your own embeddings or using an external embeddings provider.

  • Bring my own

  • Use an external provider

Bring my own:

from astrapy.constants import VectorMetric

# Create a collection. The default similarity metric is cosine. If you're not
# sure what dimension to set, use whatever dimension vector your embeddings
# model produces. `database` is an existing astrapy Database object.
collection = database.create_collection(
    "vector_test",
    dimension=5,
    metric=VectorMetric.COSINE,  # or simply "cosine"
    check_exists=False,
)
print(f"* Collection: {collection.full_name}\n")

Use an external provider:

from astrapy.constants import VectorMetric
from astrapy.info import CollectionVectorServiceOptions

# Create a collection that generates embeddings with an external provider.
# The default similarity metric is cosine. This example passes no dimension;
# the service options name the provider and model that produce the vectors.
collection = database.create_collection(
    "vectorize_test",
    metric=VectorMetric.COSINE,  # or simply "cosine"
    service=CollectionVectorServiceOptions(
        provider="openai",
        model_name="text-embedding-ada-002",
        authentication={
            "providerKey": "API_KEY_NAME",
        },
    ),
    check_exists=False,
)
print(f"* Collection: {collection.full_name}\n")

Use the TypeScript client to create a collection. The syntax depends on whether you’re bringing your own embeddings or using an external embeddings provider.

  • Bring my own

  • Use an external provider

Bring my own:

// Schema for the collection (VectorizeDoc adds the $vector field)
interface Idea extends VectorizeDoc {
  idea: string,
}

(async function () {
  // Create a typed, vector-enabled collection. The default metric is cosine.
  // If you're not sure what dimension to set, use whatever dimension vector
  // your embeddings model produces.
  const collection = await db.createCollection<Idea>('vector_test', {
    vector: {
      dimension: 5,
      metric: 'cosine',
    },
    checkExists: false,
  });
  console.log(`* Created collection ${collection.namespace}.${collection.collectionName}`);
})();

Use an external provider:

(async function () {
  // Create a vector-enabled collection that generates embeddings with an
  // external provider. The default metric is cosine. This example passes no
  // dimension; the service options name the provider and model.
  const collection = await db.createCollection('vector_test', {
    vector: {
      metric: 'cosine',
      service: {
        provider: "openai",
        modelName: "text-embedding-ada-002",
        authentication: {
          providerKey: "API_KEY_NAME",
        },
      },
    },
    checkExists: false,
  });
  console.log(`* Created collection ${collection.namespace}.${collection.collectionName}`);
})();

Use the Java client to create a collection. The syntax depends on whether you’re bringing your own embeddings or using an external embeddings provider.

  • Bring my own

  • Use an external provider

Bring my own:

    // Create a collection. The default similarity metric is cosine. If you're
    // not sure what dimension to set, use whatever dimension vector your
    // embeddings model produces.
    Collection<Document> collection = db
            .createCollection("vector_test", 5, SimilarityMetric.COSINE);
    System.out.println("Created a collection");

Use an external provider:

    // Create a collection that generates embeddings with an external provider.
    // The default similarity metric is cosine. 1536 is the dimension of the
    // vectors that text-embedding-ada-002 produces.
    Collection<Document> collection = db.createCollection("vector_test",
            CollectionOptions.builder()
                    .vectorSimilarity(SimilarityMetric.COSINE)
                    .vectorDimension(1536)
                    .vectorize("openai", "text-embedding-ada-002", "test")
                    .build());
    System.out.println("Created a collection");
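All three client examples pass the same settings: a collection name, a similarity metric, and either a dimension or an embedding service. As a rough sketch, the Python below assembles those settings into a plain dictionary and prints it as JSON. The `createCollection` envelope and the option key names are assumptions modeled on the option names in the TypeScript example, and `build_create_collection_payload` is a hypothetical helper, so treat the output as illustrative rather than as the exact request any client sends.

```python
import json


def build_create_collection_payload(name, metric="cosine", dimension=None, service=None):
    """Assemble collection settings into a JSON-serializable dict.

    The key names are an assumption modeled on the TypeScript options above.
    """
    if not name:
        raise ValueError("a collection name is required")
    if dimension is None and service is None:
        raise ValueError("pass a dimension, a service, or both")
    if dimension is not None and dimension <= 0:
        raise ValueError("dimension must be a positive integer")

    vector = {"metric": metric}
    if dimension is not None:
        vector["dimension"] = dimension
    if service is not None:
        vector["service"] = dict(service)  # copy so the caller's dict is untouched
    return {"createCollection": {"name": name, "options": {"vector": vector}}}


# Bring my own: the caller supplies vectors, so the dimension is required.
own = build_create_collection_payload("vector_test", dimension=5)

# External provider: the service block names the provider, model, and key name.
external = build_create_collection_payload(
    "vectorize_test",
    service={
        "provider": "openai",
        "modelName": "text-embedding-ada-002",
        "authentication": {"providerKey": "API_KEY_NAME"},
    },
)

print(json.dumps(own, indent=2))
print(json.dumps(external, indent=2))
```

Seeing both payloads side by side makes the difference between the two modes explicit: only the first carries a dimension, and only the second carries a service block.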

Delete a collection

You can delete a collection that you’re not using. All of the data in the collection is permanently deleted.

  • Astra Portal

  • Python

  • TypeScript

  • Java

Use the Astra Portal to delete a collection.

  1. In the Astra Portal, go to Databases, and then select your Serverless (Vector) database.

  2. Click Data Explorer.

  3. Use the Namespace dropdown to select the namespace that contains the collection you want to delete.

  4. In the Collections section, click More next to the collection you want to delete. Select Delete collection.

  5. In the Delete collection dialog, enter the name of the collection to confirm that you want to delete it.

  6. Click Delete collection.

The collection and all of its data are deleted permanently.

Use the Python client to delete a collection.

# Cleanup (if desired)
drop_result = collection.drop()
print(f"\nCleanup: {drop_result}\n")

Use the TypeScript client to delete a collection.

  // Cleanup (if desired)
  await db.dropCollection('vector_test');
  console.log('* Collection dropped.');

  // Close the client
  await client.close();

Use the Java client to delete a collection.

    // Delete the collection
    collection.drop();
    System.out.println("Deleted the collection");
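Because dropping a collection is permanent, a script may want the same safeguard as the Astra Portal dialog, which asks you to retype the collection name. The following is a minimal Python sketch of that idea. `drop_with_confirmation` and `drop_fn` are hypothetical names; `drop_fn` stands in for whichever client call you use, such as `collection.drop` in the Python example.

```python
def drop_with_confirmation(collection_name, typed_name, drop_fn):
    """Call drop_fn only if typed_name exactly matches collection_name.

    Returns True if the drop was attempted, False if it was skipped.
    """
    if typed_name != collection_name:
        return False
    drop_fn()
    return True


# Stand-in for a real client call, used here so the sketch runs offline.
dropped = []


def fake_drop():
    dropped.append("vector_test")


print(drop_with_confirmation("vector_test", "vector_tset", fake_drop))  # typo: skipped
print(drop_with_confirmation("vector_test", "vector_test", fake_drop))  # match: dropped
print(dropped)
```

The comparison is exact and case-sensitive on purpose, so a near miss never triggers the permanent deletion.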

Create a table

Here’s how to create an empty table using the Astra Portal.

  1. In the Astra Portal, go to Databases, and then select your Serverless (Non-Vector) database.

  2. In the Overview tab, note the list of available keyspaces in the Keyspaces section. You will create your table in one of these keyspaces.

  3. Click CQL Console. Wait a few seconds for the token@cqlsh> prompt to appear.

  4. Select the keyspace you want to create your table in.

    use KEYSPACE_NAME;

  5. Create your table.

    CREATE TABLE users (
        firstname text,
        lastname text,
        email text,
        "favorite color" text,
        PRIMARY KEY (firstname, lastname)
    ) WITH CLUSTERING ORDER BY (lastname ASC);

    For more examples of CQL usage, see Developing with CQL API.

You can now load data into this table.
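In the `users` table above, `PRIMARY KEY (firstname, lastname)` makes `firstname` the partition key and `lastname` a clustering column, so rows that share a first name are stored together and ordered by last name. The Python sketch below models that behavior in memory to show what the key implies. `UsersTableModel` is a hypothetical class for illustration only, and plain Python string ordering is used as an approximation of how CQL orders `text` values.

```python
from collections import defaultdict


class UsersTableModel:
    """In-memory model of PRIMARY KEY (firstname, lastname)."""

    def __init__(self):
        # partition key (firstname) -> clustering key (lastname) -> other columns
        self._partitions = defaultdict(dict)

    def upsert(self, firstname, lastname, email=None, favorite_color=None):
        """Write a row; a row with the same primary key is overwritten."""
        self._partitions[firstname][lastname] = {
            "email": email,
            "favorite color": favorite_color,
        }

    def rows_for(self, firstname):
        """Return one partition's rows in ascending lastname order."""
        partition = self._partitions.get(firstname, {})
        return [
            (firstname, lastname, partition[lastname]["email"])
            for lastname in sorted(partition)
        ]


table = UsersTableModel()
table.upsert("Ada", "Lovelace", email="ada@example.com")
table.upsert("Ada", "Byron", email="byron@example.com")
table.upsert("Ada", "Lovelace", email="ada.new@example.com")  # same key: overwrite

for row in table.rows_for("Ada"):
    print(row)
```

Writing the same `(firstname, lastname)` pair twice leaves a single row holding the latest values, which matches how a second write with an existing primary key replaces the earlier one rather than adding a duplicate.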

Delete a table

You can delete a table that you’re not using. All of the data in the table is deleted permanently.

  1. In the Astra Portal, go to Databases, and then select your Serverless (Non-Vector) database.

  2. In the Overview tab, note the list of available keyspaces in the Keyspaces section. You will delete a table from one of these keyspaces.

  3. Click CQL Console. Wait a few seconds for the token@cqlsh> prompt to appear.

  4. Select the keyspace containing the table you want to delete.

    use KEYSPACE_NAME;

  5. Get a list of all tables in this keyspace.

    desc tables;

  6. Delete the table and all of its data.

    drop table users;

The table and all of its data are deleted permanently.
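The `drop table users;` command works as written because `users` is a simple lowercase name. Names that contain spaces or uppercase letters, like the `"favorite color"` column in the create example, must be wrapped in double quotes in CQL. The Python sketch below builds a `DROP TABLE` statement with that quoting applied. Both helper names are hypothetical, and the sketch does not check for reserved CQL keywords, which also require quoting.

```python
import re

# Names that are safe to leave unquoted: lowercase letters, digits, underscores.
_SIMPLE_NAME = re.compile(r"[a-z][a-z0-9_]*")


def quote_identifier(name):
    """Return name unchanged if it is simple, otherwise double-quote it."""
    if not name:
        raise ValueError("identifier must not be empty")
    if _SIMPLE_NAME.fullmatch(name):
        return name
    # Inside a quoted identifier, a double quote is escaped by doubling it.
    return '"' + name.replace('"', '""') + '"'


def drop_table_statement(keyspace, table, if_exists=True):
    """Build a DROP TABLE statement with a keyspace-qualified table name."""
    clause = "IF EXISTS " if if_exists else ""
    return f"DROP TABLE {clause}{quote_identifier(keyspace)}.{quote_identifier(table)};"


print(drop_table_statement("default_keyspace", "users"))
print(drop_table_statement("default_keyspace", "User Archive", if_exists=False))
```

Qualifying the table with its keyspace avoids depending on an earlier `use` command, and `IF EXISTS` lets the statement succeed when the table is already gone.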


© 2024 DataStax | Privacy policy | Terms of use
