Create a collection

Creates a new collection in a Serverless (Vector) database.

Ready to write code? See the examples for this method to get started. If you are new to the Data API, check out the quickstart.

Result

  • Python

  • TypeScript

  • Java

  • curl

Creates a collection with the specified parameters.

Returns a Collection object. You can use this object to work with documents in the collection.

Unless you specify the document_type parameter, the collection is typed as Collection[dict]. For more information, see Typing support.

Example response:

Collection(name="COLLECTION_NAME", keyspace="default_keyspace", database.api_endpoint="ASTRA_DB_API_ENDPOINT", api_options=FullAPIOptions(token=StaticTokenProvider("APPLICATION_TOKEN"...), ...))

Creates a collection with the specified parameters.

Returns a promise that resolves to a Collection<Schema> object. You can use this object to work with documents in the collection.

A Collection is typed as Collection<Schema>, where Schema defaults to SomeDoc (Record<string, any>). Providing the specific Schema type enables stronger typing for collection operations. For more information, see Typing Collections and Tables.

Creates a collection with the specified parameters.

Returns a Collection object.

You can use this object to work with documents in the collection.

Creates a collection with the specified parameters.

If the command succeeds, the response reports a status.ok value of 1.

Example response:

{
  "status": {
    "ok": 1
  }
}
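As a rough illustration of checking this response in your own code, the following Python sketch parses the JSON body and treats a status.ok value of 1 as success. The collection_created helper is hypothetical and is not part of any Data API client. It only assumes the response shape shown in the example response.

```python
import json


def collection_created(response_body: str) -> bool:
    """Return True only if the body reports status.ok == 1."""
    payload = json.loads(response_body)
    # A missing or empty "status" object is treated as a failure.
    status = payload.get("status") or {}
    return status.get("ok") == 1


example_body = '{"status": {"ok": 1}}'
print(collection_created(example_body))
```

Anything other than a status.ok value of 1 is treated as a failure in this sketch, so inspect the full response body before you retry.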

Parameters

You can’t edit a collection’s definition after you create the collection.

  • Python

  • TypeScript

  • Java

  • curl

The signature of this method changed in Python client version 2.0.

If you are using an earlier version, DataStax recommends upgrading to the latest version. For more information, see Data API client upgrade guide.

Use the create_collection method, which belongs to the astrapy.Database class.

Method signature
create_collection(
  name: str,
  *,
  definition: CollectionDefinition | dict[str, Any] | None,
  document_type: type[Any],
  keyspace: str,
  collection_admin_timeout_ms: int,
  embedding_api_key: str | EmbeddingHeadersProvider,
  spawn_api_options: APIOptions,
) -> Collection

Most astrapy objects have an asynchronous counterpart, for use within the asyncio framework. To get an AsyncCollection, use the create_collection method of instances of AsyncDatabase, or alternatively the to_async method of the synchronous Collection class. See the AsyncCollection client reference for details about the async API.

Name Type Summary

name

str

The name of the new collection.

definition

CollectionDefinition

The full configuration for the collection. See the CollectionDefinition table for more details.

You can define definition in a CollectionDefinition object, or you can use the fluent interface of CollectionDefinition.

Plain Python dictionaries can be passed for definition as well, provided they mirror the structure of CollectionDefinition objects.

document_type

type

Optional. A formal specifier for the type checker. If provided, document_type must match the type hint specified in the assignment. For more information, see Typing support.

Default: Collection[dict]

keyspace

str

The keyspace in which to create the collection.

Default: The working keyspace for the database.

collection_admin_timeout_ms

int

A timeout, in milliseconds, to impose on the underlying API request. If not provided, the corresponding Database defaults apply.

embedding_api_key

str | EmbeddingHeadersProvider

Optional. This only applies to collections with a vectorize embedding provider integration.

Use this option to provide the API key directly with headers instead of using an API key in the Astra DB KMS.

The API key is sent to the Data API for every operation on the collection. It is useful when a vectorize service is configured but no credentials are stored, or when you want to override the stored credentials. For more information, see Auto-generate embeddings with vectorize.

spawn_api_options

APIOptions

A complete or partial specification of the APIOptions to override the defaults inherited from the Database. Use this to customize the interaction of the Python client with the collection. For example, you can change the serdes options or default timeouts. If APIOptions is passed together with a named parameter such as a timeout, the latter takes precedence over the corresponding spawn_api_options setting.

Properties of CollectionDefinition
Name Type Summary

vector

CollectionVectorOptions

Optional. The vector configuration for the collection. This includes things like the vector dimension and similarity metric. This also includes settings for server-side embedding generation if you want your collection to have vectorize enabled.

Required for vector search and hybrid search.

lexical

CollectionLexicalOptions

Optional. The lexical search configuration for the collection.

Only collections in databases in the AWS us-east-2 region support this parameter.

The CollectionLexicalOptions object has the following properties:

  • enabled (boolean): Whether to enable lexical search for the collection. Set this to False to disable lexical search. Lexical search must be enabled to support hybrid search.

  • analyzer: A string describing a built-in analyzer, or a JSON object describing an analyzer configuration.

    Strings must be one of the supported built-in analyzers.

    JSON objects must follow the specifications in Find data with CQL analyzers.

See the example for usage.

Default: A CollectionLexicalOptions object with an enabled value of True and an analyzer value of "standard", which corresponds to the standard Apache Lucene™ analyzer.

rerank

CollectionRerankOptions

Optional. The reranker configuration for the collection.

Only collections in databases in the AWS us-east-2 region support this parameter.

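The CollectionRerankOptions object has the following properties:

  • enabled (boolean): Whether to enable reranking for the collection. Reranking must be enabled to support hybrid search.

  • service: The provider and model name for the reranker model.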

See the example for usage.

Default: A CollectionRerankOptions object with an enabled value of True and a service value corresponding to the NVIDIA llama-3.2-nv-rerankqa-1b-v2 reranking model. This means that reranking is enabled by default.

indexing

dict

Optional. The selective indexing configuration for the collection.

Default: All fields of all documents.

default_id

CollectionDefaultIDOptions

Optional. Specifies the default ID type for documents in the collection. This is used when you insert a document without an _id field.

Can be one of:

  • CollectionDefaultIDOptions(DefaultIdType.OBJECTID): Each autogenerated _id value is an objectId as provided by the bson library.

  • CollectionDefaultIDOptions(DefaultIdType.UUIDV7): Each autogenerated _id value is a version 7 UUID. This is designed as a replacement for version 1 time UUID, and it is recommended for use in new systems.

  • CollectionDefaultIDOptions(DefaultIdType.UUIDV6): Each autogenerated _id value is a version 6 UUID. This is field-compatible with version 1 time UUIDs, and it supports lexicographical sorting.

  • CollectionDefaultIDOptions(DefaultIdType.UUID): Each autogenerated _id value is a version 4 UUID. This type is analogous to the uuid type and functions in Apache Cassandra®.

  • CollectionDefaultIDOptions(DefaultIdType.DEFAULT): Each autogenerated _id value is a string form of a version 4 UUID.

See the example for usage.

For more information, see Document IDs.

Default: CollectionDefaultIDOptions(DefaultIdType.DEFAULT)

Use the createCollection method, which belongs to the Db class.

Method signature
async createCollection<Schema extends SomeDoc = SomeDoc>(
  name: string,
  options?: {
    vector?: CollectionVectorOptions,
    indexing?: CollectionIndexingOptions<Schema>,
    defaultId?: CollectionDefaultIdOptions,
    lexical?: CollectionLexicalOptions,
    rerank?: CollectionRerankOptions,
    logging?: DataAPILoggingConfig,
    keyspace?: string,
    embeddingApiKey?: string | EmbeddingHeadersProvider,
    serdes?: CollectionSerDesConfig,
    timeoutDefaults?: TimeoutDescriptor,
    timeout?: number | TimeoutDescriptor,
  }
): Promise<Collection<Schema>>

Name Type Summary

name

string

The name of the new collection.

options

CreateCollectionOptions

Optional. The options for this operation. See the options table for more details.

Properties of options:
Name Type Summary

vector

CollectionVectorOptions

Optional. The vector configuration for the collection. This includes things like the vector dimension and similarity metric. This also includes settings for server-side embedding generation if you want your collection to have vectorize enabled.

Required for vector search and hybrid search.

lexical

CollectionLexicalOptions

Optional. The lexical search configuration for the collection.

Only collections in databases in the AWS us-east-2 region support this parameter.

The CollectionLexicalOptions object has the following properties:

  • enabled (boolean): Whether to enable lexical search for the collection. Set this to false to disable lexical search. Lexical search must be enabled to support hybrid search.

  • analyzer: A string describing a built-in analyzer, or a JSON object describing an analyzer configuration.

    Strings must be one of the supported built-in analyzers.

    JSON objects must follow the specifications in Find data with CQL analyzers.

See the example for usage.

Default: A CollectionLexicalOptions object with enabled: true and analyzer: "standard", which corresponds to the standard Apache Lucene™ analyzer.

rerank

CollectionRerankOptions

Optional. The reranker configuration for the collection.

Only collections in databases in the AWS us-east-2 region support this parameter.

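The CollectionRerankOptions object has the following properties:

  • enabled (boolean): Whether to enable reranking for the collection. Reranking must be enabled to support hybrid search.

  • service: The provider and model name for the reranker model.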

See the example for usage.

Default: A CollectionRerankOptions object with enabled: true and a service value corresponding to the NVIDIA llama-3.2-nv-rerankqa-1b-v2 reranking model. This means that reranking is enabled by default.

indexing

CollectionIndexingOptions<Schema>

Optional. The selective indexing configuration for the collection.

For usage, see the example that specifies which fields to index and the example that specifies which fields to not index.

Default: All fields of all documents.

defaultId

CollectionDefaultIdOptions

Optional. Specifies the default ID type for documents in the collection. This is used when you insert a document without an _id field.

Can be one of:

  • {type: "objectId"}: Each autogenerated _id value is an objectId as provided by the bson library.

  • {type: "uuidv7"}: Each autogenerated _id value is a version 7 UUID. This is designed as a replacement for version 1 time UUID, and it is recommended for use in new systems.

  • {type: "uuidv6"}: Each autogenerated _id value is a version 6 UUID. This is field-compatible with version 1 time UUIDs, and it supports lexicographical sorting.

  • {type: "uuid"}: Each autogenerated _id value is a version 4 UUID. This type is analogous to the uuid type and functions in Apache Cassandra®.

See the example for usage.

For more information, see Document IDs.

Default: Each autogenerated _id value is a string form of a version 4 UUID.

embeddingApiKey

string | EmbeddingHeadersProvider

Optional. This only applies to collections with a vectorize embedding provider integration.

Use this option to provide the API key directly with headers instead of using an API key in the Astra DB KMS.

The API key is sent to the Data API for every operation on the collection. It is useful when a vectorize service is configured but no credentials are stored, or when you want to override the stored credentials. For more information, see Auto-generate embeddings with vectorize.

keyspace

string

The keyspace in which to create the collection.

Default: The working keyspace for the database.

logging

DataAPILoggingConfig

Optional. The configuration for logging events emitted by the DataAPIClient.

serdes

CollectionSerDesConfig

Optional. The configuration for serialization/deserialization by the DataAPIClient.

For more information, see Custom Ser/Des.

timeoutDefaults

TimeoutDescriptor

Optional.

The default timeout(s) to apply to operations performed on this Collection instance. You can specify requestTimeoutMs, generalMethodTimeoutMs, and collectionAdminTimeoutMs.

Details about the timeoutDefaults parameter

The default timeout options for any operation performed on this Collection instance.

The TimeoutDescriptor object can contain these properties:

  • requestTimeoutMs (number): The maximum time, in milliseconds, that the client should wait for each underlying HTTP request. Default: 10 seconds.

  • generalMethodTimeoutMs (number): The maximum time, in milliseconds, that the whole operation, which may involve multiple HTTP requests, can take. Default: 30 seconds.

  • collectionAdminTimeoutMs (number): The maximum time, in milliseconds, for collection admin operations like creating, dropping, and listing collections. Default: 60 seconds.

timeout

number | TimeoutDescriptor

Optional.

The timeout to apply to this method.

Only collectionAdminTimeoutMs applies to this method. This is the maximum time, in milliseconds, for collection admin operations like creating, dropping, and listing collections.

Default: 60 seconds, unless you specified a different default along the Options Hierarchy.
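The way a per-call timeout, the collection's timeoutDefaults, and the built-in default combine can be sketched in a language-neutral way. The following Python sketch is not the TypeScript client's implementation. It assumes that a bare number passed as timeout is applied as the collection admin timeout, and it uses a timeoutDefaults dictionary to stand in for any default set along the options hierarchy. The function and constant names are hypothetical.

```python
# The documented built-in default for collection admin operations (60 seconds).
DEFAULT_COLLECTION_ADMIN_TIMEOUT_MS = 60_000


def resolve_collection_admin_timeout_ms(timeout=None, timeout_defaults=None) -> int:
    """Pick the effective timeout: per-call value, then defaults, then built-in."""
    key = "collectionAdminTimeoutMs"
    # Assumption: a bare number is used directly as the admin timeout.
    if isinstance(timeout, (int, float)):
        return int(timeout)
    # A descriptor only matters here if it sets collectionAdminTimeoutMs.
    if isinstance(timeout, dict) and key in timeout:
        return int(timeout[key])
    # Fall back to the defaults configured higher in the hierarchy.
    if timeout_defaults and key in timeout_defaults:
        return int(timeout_defaults[key])
    return DEFAULT_COLLECTION_ADMIN_TIMEOUT_MS


print(resolve_collection_admin_timeout_ms())  # 60000
print(resolve_collection_admin_timeout_ms({"requestTimeoutMs": 1000}))  # 60000
```

In this sketch, a descriptor that sets only requestTimeoutMs leaves the collection admin timeout at its default, which mirrors the statement that only collectionAdminTimeoutMs applies to this method.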

Use the createCollection method, which belongs to the com.datastax.astra.client.Database class.

Method signature
Collection<Document> createCollection(String collectionName)
Collection<Document> createCollection(
  String collectionName,
  CollectionDefinition collectionDefinition
)
Collection<Document> createCollection(
  String collectionName,
  CollectionDefinition collectionDefinition,
  CreateCollectionOptions options
)
<T> Collection<T> createCollection(
  String collectionName,
  Class<T> documentClass
)
<T> Collection<T> createCollection(
  String collectionName,
  CollectionDefinition collectionDefinition,
  Class<T> documentClass
)
<T> Collection<T> createCollection(
  String collectionName,
  CollectionDefinition collectionDefinition,
  Class<T> documentClass,
  CreateCollectionOptions options
)

Name Type Summary

collectionName

String

The name of the new collection.

collectionDefinition

CollectionDefinition

Settings for the collection, including vector options, the default ID format, and indexing options.

options

CreateCollectionOptions

Options for the operation, including the keyspace.

documentClass

Class<T>

The class of the documents in the collection. Use this to work with specialized beans instead of the default Document type.

Use the createCollection command.

Command signature
curl -sS -L -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_KEYSPACE" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
  "createCollection": {
    "name": "COLLECTION_NAME",
    "options": OPTIONS
  }
}'

Name Type Summary

name

string

The name of the new collection.

options

object

Optional. The options for this operation. See the options table for more details.

Properties of options:
Name Type Summary

defaultId

object

Optional. Specifies the default ID type for documents in the collection. This is used when you insert a document without an _id field.

Can be one of:

  • {"type": "objectId"}: Each autogenerated _id value is an objectId as provided by the bson library.

  • {"type": "uuidv7"}: Each autogenerated _id value is a version 7 UUID. This is designed as a replacement for version 1 time UUID, and it is recommended for use in new systems.

  • {"type": "uuidv6"}: Each autogenerated _id value is a version 6 UUID. This is field-compatible with version 1 time UUIDs, and it supports lexicographical sorting.

  • {"type": "uuid"}: Each autogenerated _id value is a version 4 UUID. This type is analogous to the uuid type and functions in Apache Cassandra®.

See the example for usage.

For more information, see Document IDs.

Default: Each autogenerated _id value is a string form of a version 4 UUID.

vector

object

Optional. The vector configuration for the collection. This includes things like the vector dimension and similarity metric. This also includes settings for server-side embedding generation if you want your collection to have vectorize enabled.

Required for vector search and hybrid search.

The vector object contains the following properties:

  • dimension (int): The dimension for vector embeddings in the collection. If you’re not sure what dimension to set, use the dimension of the vectors that your embedding model produces. Optional if you specify a vector.service.modelName value that has a default dimension value.

  • metric (string): The similarity metric to use for vector search. Can be one of: cosine (default), dot_product, euclidean.

  • service (object): Optional. The configuration for a vectorize embedding provider integration. This lets your collection use vectorize to automatically generate embeddings. Use findEmbeddingProviders or see the documentation for your embedding provider integration to determine what values to specify.

    The service object contains the following properties:

    • provider (string): The name of the vectorize embedding provider.

    • modelName (string): A valid model name for the specified vectorize embedding provider.

    • authentication (object): Optional depending on your provider. Use credentials stored in Astra DB KMS to authenticate with your vectorize embedding provider. In options.vector.service.authentication.providerKey, provide the credential’s API Key name as given in Astra DB KMS. Alternatively, you can omit the authentication object, and then provide the authentication key in an x-embedding-api-key header instead. If you use header authentication, you must provide the x-embedding-api-key header with every command that requires vectorize for this collection, including inserting data and vector search with vectorize.

    • parameters (object): Optional depending on your provider. Additional parameters required for your embedding provider.

lexical

object

Optional. The lexical search configuration for the collection.

Only collections in databases in the AWS us-east-2 region support this parameter.

The lexical object has the following properties:

  • enabled (boolean): Whether to enable lexical search for the collection. Set this to false to disable lexical search. Lexical search must be enabled to support hybrid search.

  • analyzer: A string describing a built-in analyzer, or a JSON object describing an analyzer configuration.

    Strings must be one of the supported built-in analyzers.

    JSON objects must follow the specifications in Find data with CQL analyzers.

See the example for usage.

Default: An object with an enabled value of true and an analyzer value of "standard", which corresponds to the standard Apache Lucene™ analyzer.

rerank

object

Optional. The reranker configuration for the collection.

Only collections in databases in the AWS us-east-2 region support this parameter.

The rerank object has the following properties:

  • enabled (boolean): Whether to enable reranking for the collection. Set this to false to disable reranking, which also disables hybrid search. Reranking must be enabled to support hybrid search.

  • service (object): Describes the provider and model name for a reranker model. The service object contains the following properties:

    • provider (string): The name of the reranking provider. Only Nvidia is supported.

    • modelName (string): The name of a reranking model supported by the reranking provider. Only nvidia/llama-3.2-nv-rerankqa-1b-v2 is supported.

See the example for usage.

Default: An object with an enabled value of true and a service value corresponding to the NVIDIA llama-3.2-nv-rerankqa-1b-v2 reranking model. This means that reranking is enabled by default.

indexing

object

Optional. Configures selective indexing for data inserted to the collection.

The indexing object must contain one of:

  • allow (array): The properties to index. Must contain at least one property. "allow": ["*"] indexes all properties, which is the same as the default behavior.

  • deny (array): The properties to not index. Must contain at least one property. "deny": ["*"] means that no properties are indexed.

For usage, see the example that specifies which fields to index and the example that specifies which fields to not index.

Default: All fields of all documents.
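The rules for the indexing object can be checked before you send a request. The following Python sketch is only an illustration: the validate_indexing helper is hypothetical, and it assumes that "one of" means exactly one of allow or deny, each with at least one property.

```python
def validate_indexing(indexing: dict) -> dict:
    """Return a cleaned indexing object or raise ValueError."""
    has_allow = "allow" in indexing
    has_deny = "deny" in indexing
    # Assumption: exactly one of the two keys may be present.
    if has_allow == has_deny:
        raise ValueError("indexing must contain exactly one of 'allow' or 'deny'")
    key = "allow" if has_allow else "deny"
    fields = indexing[key]
    # Each list must name at least one property.
    if not isinstance(fields, list) or len(fields) == 0:
        raise ValueError(f"'{key}' must be a list with at least one property")
    return {key: list(fields)}


print(validate_indexing({"deny": ["bio", "notes"]}))
```

A check like this only catches structural mistakes early. The Data API remains the authority on which indexing configurations are accepted.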

Examples

The following examples demonstrate how to create a collection.

Create a collection that is not vector-enabled

  • Python

  • TypeScript

  • Java

  • curl

from astrapy import DataAPIClient

# Get a database
client = DataAPIClient()
database = client.get_database(
    "ASTRA_DB_API_ENDPOINT",
    token="ASTRA_DB_APPLICATION_TOKEN",
)

# Create a collection
collection = database.create_collection("COLLECTION_NAME")

  • Typed collections

  • Untyped collections

You can manually define a client-side type for your collection to help statically catch errors.

import { DataAPIClient } from "@datastax/astra-db-ts";

// Get a database
const client = new DataAPIClient("ASTRA_DB_APPLICATION_TOKEN");
const database = client.db("ASTRA_DB_API_ENDPOINT");

// Define the type for the collection
interface User {
  name: string,
  age?: number,
}

// Create a collection
(async function () {
  const collection = await database.createCollection<User>("COLLECTION_NAME");
})();

If you don’t pass a type parameter, the collection remains untyped. This is a more flexible but less type-safe option.

import { DataAPIClient } from "@datastax/astra-db-ts";

// Get a database
const client = new DataAPIClient("ASTRA_DB_APPLICATION_TOKEN");
const database = client.db("ASTRA_DB_API_ENDPOINT");

// Create a collection
(async function () {
  const collection = await database.createCollection("COLLECTION_NAME");
})();

package com.examples;

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.databases.Database;
import com.datastax.astra.client.collections.Collection;
import com.datastax.astra.client.collections.definition.documents.Document;

public class CreateCollection {

    public static void main(String[] args) {
        // Get a database
        Database database = new DataAPIClient("ASTRA_DB_APPLICATION_TOKEN")
            .getDatabase("ASTRA_DB_API_ENDPOINT");

        // Create a collection
        Collection<Document> collection = database.createCollection("COLLECTION_NAME");
    }
}

curl -sS -L -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_KEYSPACE" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
  "createCollection": {
    "name": "COLLECTION_NAME",
    "options": {}
  }
}'
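If you want to send the same request from Python without a client library, the following sketch builds, but does not send, an equivalent request with the standard library. The build_create_collection_request helper and the example endpoint are placeholders for illustration and are not part of the Data API or its clients.

```python
import json
import urllib.request


def build_create_collection_request(api_endpoint, keyspace, token, name, options=None):
    """Build a POST request that mirrors the curl command's URL, headers, and body."""
    url = f"{api_endpoint.rstrip('/')}/api/json/v1/{keyspace}"
    body = {"createCollection": {"name": name, "options": options or {}}}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Token": token, "Content-Type": "application/json"},
        method="POST",
    )


# Placeholder values; replace them with your own endpoint, keyspace, and token.
request = build_create_collection_request(
    "https://example-endpoint.invalid",
    "default_keyspace",
    "ASTRA_DB_APPLICATION_TOKEN",
    "COLLECTION_NAME",
)
print(request.get_method(), request.full_url)
# To send the request, pass it to urllib.request.urlopen(request).
```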

Create a collection that can store vector embeddings

Collections that are vector-enabled can store vector embeddings and work with vector search.

  • Python

  • TypeScript

  • Java

  • curl

The Python client supports multiple ways to create a collection:

  • You can define the collection parameters in a CollectionDefinition object and then create the collection from the CollectionDefinition object.

  • You can use a fluent interface to build the collection definition and then create the collection from the definition.

  • CollectionDefinition object

  • Fluent interface

from astrapy import DataAPIClient
from astrapy.constants import VectorMetric
from astrapy.info import CollectionDefinition, CollectionVectorOptions

# Get an existing database
client = DataAPIClient()
database = client.get_database(
    "ASTRA_DB_API_ENDPOINT",
    token="ASTRA_DB_APPLICATION_TOKEN",
)

# Create a collection
collection_definition = CollectionDefinition(
    vector=CollectionVectorOptions(
        dimension=1024,
        metric=VectorMetric.COSINE,
    ),
)
collection = database.create_collection(
    "COLLECTION_NAME",
    definition=collection_definition,
)

from astrapy import DataAPIClient
from astrapy.info import CollectionDefinition
from astrapy.constants import VectorMetric

# Get an existing database
client = DataAPIClient()
database = client.get_database(
    "ASTRA_DB_API_ENDPOINT",
    token="ASTRA_DB_APPLICATION_TOKEN",
)

collection_definition = (
    CollectionDefinition.builder()
    .set_vector_dimension(1024)
    .set_vector_metric(VectorMetric.COSINE)
    .build()
)

collection = database.create_collection(
    "COLLECTION_NAME",
    definition=collection_definition,
)

  • Typed collections

  • Untyped collections

You can manually define a client-side type for your collection to help statically catch errors.

You can define $vector as an inline field in your interfaces, or you can extend the utility VectorDoc type provided by the client.

import { DataAPIClient, VectorDoc } from "@datastax/astra-db-ts";

// Get a database
const client = new DataAPIClient("ASTRA_DB_APPLICATION_TOKEN");
const database = client.db("ASTRA_DB_API_ENDPOINT");

// Define the type for the collection
interface User extends VectorDoc {
  name: string,
  age?: number,
}

(async function () {
  const collection = await database.createCollection<User>("COLLECTION_NAME", {
    vector: {
      dimension: 1024,
      metric: "cosine",
    },
  });
})();

If you don’t pass a type parameter, the collection remains untyped. This is a more flexible but less type-safe option.

The $vector field must still be number[] or DataAPIVector, or type-related issues will occur.

Consider using a type like VectorDoc & SomeDoc which allows the documents to remain untyped, but still statically requires the $vector field to have the correct type.

import { DataAPIClient } from "@datastax/astra-db-ts";

// Get a database
const client = new DataAPIClient("ASTRA_DB_APPLICATION_TOKEN");
const database = client.db("ASTRA_DB_API_ENDPOINT");

(async function () {
  const collection = await database.createCollection("COLLECTION_NAME", {
    vector: {
      dimension: 1024,
      metric: "cosine",
    },
  });
})();

package com.examples;

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.databases.Database;
import com.datastax.astra.client.collections.Collection;
import com.datastax.astra.client.collections.definition.CollectionDefinition;
import com.datastax.astra.client.collections.definition.documents.Document;
import com.datastax.astra.client.core.vector.SimilarityMetric;

public class CreateCollection {

    public static void main(String[] args) {
        // Get a database
        Database database = new DataAPIClient("ASTRA_DB_APPLICATION_TOKEN")
            .getDatabase("ASTRA_DB_API_ENDPOINT");

        // Create a collection
        CollectionDefinition collectionDefinition = new CollectionDefinition()
            .vectorDimension(1024)
            .vectorSimilarity(SimilarityMetric.COSINE);

        Collection<Document> collection = database.createCollection("COLLECTION_NAME", collectionDefinition);
    }
}

curl -sS -L -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_KEYSPACE" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
  "createCollection": {
    "name": "COLLECTION_NAME",
    "options": {
      "vector": {
        "dimension": 1024,
        "metric": "cosine"
      }
    }
  }
}'
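When you assemble the options object in your own code, it can help to validate the vector settings before you send the command. The following Python sketch builds the same payload as the curl example and rejects metrics other than cosine, dot_product, and euclidean. The vector_options helper is hypothetical and only reflects the property descriptions in the options table.

```python
import json

# Similarity metrics listed for the vector object.
ALLOWED_METRICS = ("cosine", "dot_product", "euclidean")


def vector_options(dimension: int, metric: str = "cosine") -> dict:
    """Return an options object for a vector-enabled collection."""
    if isinstance(dimension, bool) or not isinstance(dimension, int) or dimension <= 0:
        raise ValueError("dimension must be a positive integer")
    if metric not in ALLOWED_METRICS:
        raise ValueError(f"metric must be one of {ALLOWED_METRICS}")
    return {"vector": {"dimension": dimension, "metric": metric}}


payload = {
    "createCollection": {
        "name": "COLLECTION_NAME",
        "options": vector_options(1024, "cosine"),
    }
}
print(json.dumps(payload, indent=2))
```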

Create a collection that can automatically generate vector embeddings

If you want to automatically generate vector embeddings, create a vector-enabled collection and configure an embedding provider integration for the collection.

The configuration depends on the embedding provider. For the configuration and an example for each provider, see Supported embedding providers.

You can also store pre-generated vector embeddings in the collection. If you store pre-generated and automatically generated embeddings in the same collection, make sure all embeddings have the same provider, model, and dimensions. Mismatched embeddings can cause inaccurate vector searches.

  • Python

  • TypeScript

  • Java

  • curl

The Python client supports multiple ways to create a collection:

  • You can define the collection parameters in a CollectionDefinition object and then create the collection from the CollectionDefinition object.

  • You can use a fluent interface to build the collection definition and then create the collection from the definition.

  • CollectionDefinition object

  • Fluent interface

from astrapy import DataAPIClient
from astrapy.constants import VectorMetric
from astrapy.info import (
    CollectionDefinition,
    CollectionVectorOptions,
    VectorServiceOptions,
)

# Get an existing database
client = DataAPIClient()
database = client.get_database(
    "ASTRA_DB_API_ENDPOINT",
    token="ASTRA_DB_APPLICATION_TOKEN",
)

# Create a collection
collection_definition = CollectionDefinition(
    vector=CollectionVectorOptions(
        metric=VectorMetric.SIMILARITY_METRIC,
        dimension=MODEL_DIMENSIONS,
        service=VectorServiceOptions(
            provider="PROVIDER",
            model_name="MODEL_NAME",
            authentication={
                "providerKey": "API_KEY_NAME",
            },
            parameters=PARAMETERS,
        )
    )
)
collection = database.create_collection(
    "COLLECTION_NAME",
    definition=collection_definition,
)

from astrapy import DataAPIClient
from astrapy.info import CollectionDefinition
from astrapy.constants import VectorMetric

# Get an existing database
client = DataAPIClient()
database = client.get_database(
    "ASTRA_DB_API_ENDPOINT",
    token="ASTRA_DB_APPLICATION_TOKEN",
)

# Create a collection
collection_definition = (
    CollectionDefinition.builder()
    .set_vector_dimension(MODEL_DIMENSIONS)
    .set_vector_metric(VectorMetric.SIMILARITY_METRIC)
    .set_vector_service(
        provider="PROVIDER",
        model_name="MODEL_NAME",
        authentication={
            "providerKey": "API_KEY_NAME",
        },
        parameters=PARAMETERS,
    )
    .build()
)
collection = database.create_collection(
    "COLLECTION_NAME",
    definition=collection_definition,
)

  • Typed collections

  • Untyped collections

You can manually define a client-side type for your collection to help statically catch errors.

You can define $vector and $vectorize as inline fields in your interfaces, or you can extend the utility VectorizeDoc type provided by the client.

import { DataAPIClient, VectorizeDoc } from "@datastax/astra-db-ts";

// Get a database
const client = new DataAPIClient("ASTRA_DB_APPLICATION_TOKEN");
const database = client.db("ASTRA_DB_API_ENDPOINT");

// Define the type for the collection
interface User extends VectorizeDoc {
  name: string,
  age?: number,
}

(async function () {
  const collection = await database.createCollection<User>("COLLECTION_NAME", {
    vector: {
      dimension: MODEL_DIMENSIONS,
      metric: "SIMILARITY_METRIC",
      service: {
        provider: "PROVIDER",
        modelName: "MODEL_NAME",
        authentication: {
          providerKey: "API_KEY_NAME",
        },
        parameters: PARAMETERS,
      },
    },
  });
})();

If you don’t pass a type parameter, the collection remains untyped. This is a more flexible but less type-safe option.

The $vector field must still be number[] or DataAPIVector, and the $vectorize field must still be a string, or type-related issues will occur.

Consider using a type like VectorizeDoc & SomeDoc which allows the documents to remain untyped, but still statically requires the $vector and $vectorize fields to have the correct type.

import { DataAPIClient } from "@datastax/astra-db-ts";

// Get a database
const client = new DataAPIClient("ASTRA_DB_APPLICATION_TOKEN");
const database = client.db("ASTRA_DB_API_ENDPOINT");

(async function () {
  const collection = await database.createCollection("COLLECTION_NAME", {
    vector: {
      dimension: MODEL_DIMENSIONS,
      metric: "SIMILARITY_METRIC",
      service: {
        provider: "PROVIDER",
        modelName: "MODEL_NAME",
        authentication: {
          providerKey: "API_KEY_NAME",
        },
        parameters: PARAMETERS,
      },
    },
  });
})();
package com.examples;

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.databases.Database;
import com.datastax.astra.client.collections.Collection;
import com.datastax.astra.client.collections.definition.CollectionDefinition;
import com.datastax.astra.client.collections.definition.documents.Document;
import com.datastax.astra.client.core.vector.SimilarityMetric;

public class CreateCollection {

    public static void main(String[] args) {
        // Get a database
        Database database = new DataAPIClient("ASTRA_DB_APPLICATION_TOKEN")
            .getDatabase("ASTRA_DB_API_ENDPOINT");

        // Create a collection
        CollectionDefinition collectionDefinition = new CollectionDefinition()
            .vectorDimension(MODEL_DIMENSIONS)
            .vectorSimilarity(SimilarityMetric.SIMILARITY_METRIC)
            .vectorize(
                "PROVIDER",
                "MODEL_NAME",
                "API_KEY_NAME",
                PARAMETERS
            );

        Collection<Document> collection = database.createCollection("COLLECTION_NAME", collectionDefinition);
    }
}
curl -sS -L -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_KEYSPACE" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
  "createCollection": {
    "name": "COLLECTION_NAME",
    "options": {
      "vector": {
        "dimension": MODEL_DIMENSIONS,
        "metric": "SIMILARITY_METRIC",
        "service": {
          "provider": "PROVIDER",
          "modelName": "MODEL_NAME",
          "authentication": {
            "providerKey": "API_KEY_NAME"
          },
          "parameters": PARAMETERS
        }
      }
    }
  }
}'
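
The curl request above can also be assembled with the Python standard library alone, which can help when you want to inspect the exact URL, headers, and JSON body before sending anything. This is a minimal sketch, not part of any client library: the helper name build_create_collection_request and the placeholder endpoint are assumptions made for illustration, and the option values are borrowed from the hybrid search example on this page.

```python
import json
import urllib.request


def build_create_collection_request(
    api_endpoint: str,
    keyspace: str,
    token: str,
    name: str,
    options: dict,
) -> urllib.request.Request:
    """Build (but do not send) the POST request for createCollection."""
    # Same URL shape as the curl example: ENDPOINT/api/json/v1/KEYSPACE
    url = f"{api_endpoint.rstrip('/')}/api/json/v1/{keyspace}"
    body = {"createCollection": {"name": name, "options": options}}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Token": token, "Content-Type": "application/json"},
        method="POST",
    )


request = build_create_collection_request(
    "https://example.invalid",  # hypothetical stand-in for ASTRA_DB_API_ENDPOINT
    "default_keyspace",
    "APPLICATION_TOKEN",
    "COLLECTION_NAME",
    {
        "vector": {
            "dimension": 1024,
            "metric": "cosine",
            "service": {"provider": "nvidia", "modelName": "NV-Embed-QA"},
        }
    },
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

To actually send the request, pass it to urllib.request.urlopen with a real endpoint and token.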

Create a collection that supports hybrid search

If you want to perform hybrid search on your collection, you must create a collection that has vector, lexical, and rerank enabled. Your collection must also be in a database in the AWS us-east-2 region.

Lexical and rerank are enabled by default when you create a collection in a database in the AWS us-east-2 region, but you can optionally configure the lexical analyzer and the reranker model.

For configuration details about the lexical analyzer, see Find data with CQL analyzers. The following example uses a configuration suitable for English text.

For configuration details about the reranker model, inspect the available reranker models. Only the NVIDIA llama-3.2-nv-rerankqa-1b-v2 reranker model is supported.
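
As a quick illustration of the requirement above, the following sketch checks an options dictionary (shaped like the "options" object in the curl examples on this page) for the three features that hybrid search needs. The function name missing_hybrid_features is hypothetical. The check is deliberately strict: it only passes when lexical and rerank are stated explicitly, so it does not account for the server-side defaults described above.

```python
def missing_hybrid_features(options: dict) -> list:
    """Return the hybrid search features that are absent or disabled."""
    missing = []
    # A vector configuration must be present.
    if "vector" not in options:
        missing.append("vector")
    # Lexical and rerank must each be explicitly enabled.
    for feature in ("lexical", "rerank"):
        if not options.get(feature, {}).get("enabled", False):
            missing.append(feature)
    return missing


hybrid_options = {
    "vector": {
        "dimension": 1024,
        "metric": "cosine",
        "service": {"provider": "nvidia", "modelName": "NV-Embed-QA"},
    },
    "lexical": {
        "enabled": True,
        "analyzer": {
            "tokenizer": {"name": "standard", "args": {}},
            "filters": [{"name": "lowercase"}],
            "charFilters": [],
        },
    },
    "rerank": {
        "enabled": True,
        "service": {
            "provider": "nvidia",
            "modelName": "nvidia/llama-3.2-nv-rerankqa-1b-v2",
        },
    },
}

print(missing_hybrid_features(hybrid_options))
print(missing_hybrid_features({"vector": {"dimension": 1024}}))
```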

  • Python

  • TypeScript

  • Java

  • curl

The Python client supports multiple ways to create a collection:

  • You can define the collection parameters in a CollectionDefinition object and then create the collection from the CollectionDefinition object.

  • You can use a fluent interface to build the collection definition and then create the collection from the definition.

  • CollectionDefinition object

  • Fluent interface

from astrapy import DataAPIClient
from astrapy.constants import VectorMetric
from astrapy.info import (
    CollectionDefinition,
    CollectionLexicalOptions,
    CollectionRerankOptions,
    CollectionVectorOptions,
    RerankServiceOptions,
    VectorServiceOptions,
)

# Get an existing database
client = DataAPIClient()
database = client.get_database(
    "ASTRA_DB_API_ENDPOINT",
    token="ASTRA_DB_APPLICATION_TOKEN",
)

# Create a collection
collection_definition = CollectionDefinition(
    vector=CollectionVectorOptions(
        metric=VectorMetric.COSINE,
        dimension=1024,
        service=VectorServiceOptions(
            provider="nvidia",
            model_name="NV-Embed-QA",
        )
    ),
    lexical=CollectionLexicalOptions(
        analyzer={
            "tokenizer": {
                "name": "standard",
                "args": {}
            },
            "filters": [
                {
                   "name": "lowercase"
                },
                {
                   "name": "stop"
                },
                {
                   "name": "porterstem"
                },
                {
                   "name": "asciifolding"
                }
            ],
            "charFilters": []
        },
        enabled=True,
    ),
    rerank=CollectionRerankOptions(
        enabled=True,
        service=RerankServiceOptions(
            provider="nvidia",
            model_name="nvidia/llama-3.2-nv-rerankqa-1b-v2",
        ),
    ),
)
collection = database.create_collection(
    "COLLECTION_NAME",
    definition=collection_definition,
)
from astrapy import DataAPIClient
from astrapy.info import CollectionDefinition
from astrapy.constants import VectorMetric

# Get an existing database
client = DataAPIClient()
database = client.get_database(
    "ASTRA_DB_API_ENDPOINT",
    token="ASTRA_DB_APPLICATION_TOKEN",
)

# Create a collection
collection_definition = (
    CollectionDefinition.builder()
    .set_vector_dimension(1024)
    .set_vector_metric(VectorMetric.COSINE)
    .set_vector_service(
        provider="nvidia",
        model_name="NV-Embed-QA",
    )
    .set_lexical(
        {
            "tokenizer": {
                "name": "standard",
                "args": {}
            },
            "filters": [
                {
                   "name": "lowercase"
                },
                {
                   "name": "stop"
                },
                {
                   "name": "porterstem"
                },
                {
                   "name": "asciifolding"
                }
            ],
            "charFilters": []
        }
    )
    .set_rerank("nvidia", "nvidia/llama-3.2-nv-rerankqa-1b-v2")
    .build()
)
collection = database.create_collection(
    "COLLECTION_NAME",
    definition=collection_definition,
)
  • Typed collections

  • Untyped collections

You can manually define a client-side type for your collection to help statically catch errors.

You can define $vector, $vectorize, and $lexical as inline fields in your interfaces, or you can extend the utility VectorDoc, VectorizeDoc, and LexicalDoc types provided by the client.

import { DataAPIClient, LexicalDoc, VectorizeDoc } from "@datastax/astra-db-ts";

// Get a database
const client = new DataAPIClient("ASTRA_DB_APPLICATION_TOKEN");
const database = client.db("ASTRA_DB_API_ENDPOINT");

// Define the type for the collection
interface User extends VectorizeDoc, LexicalDoc {
  name: string,
  age?: number,
}

(async function () {
  const collection = await database.createCollection<User>("COLLECTION_NAME", {
    vector: {
      dimension: 1024,
      metric: "cosine",
      service: {
        provider: "nvidia",
        modelName: "NV-Embed-QA",
      },
    },
    lexical: {
      enabled: true,
      analyzer: {
        tokenizer: {
          name: "standard",
          args: {}
        },
        filters: [
          {
            name: "lowercase"
          },
          {
            name: "stop"
          },
          {
            name: "porterstem"
          },
          {
            name: "asciifolding"
          }
        ],
        charFilters: []
      },
    },
    rerank: {
      enabled: true,
      service: {
        provider: "nvidia",
        modelName: "nvidia/llama-3.2-nv-rerankqa-1b-v2",
      },
    },
  });
})();

If you don’t pass a type parameter, the collection remains untyped. This is a more flexible but less type-safe option.

The $vector field must still be number[] or DataAPIVector, and the $vectorize and $lexical fields must still be a string, or type-related issues will occur.

Consider using a type like VectorDoc & LexicalDoc & SomeDoc or VectorizeDoc & LexicalDoc & SomeDoc, which allows the documents to remain untyped but still statically requires the $vector, $vectorize, and $lexical fields to have the correct type.

import { DataAPIClient } from "@datastax/astra-db-ts";

// Get a database
const client = new DataAPIClient("ASTRA_DB_APPLICATION_TOKEN");
const database = client.db("ASTRA_DB_API_ENDPOINT");

(async function () {
  const collection = await database.createCollection("COLLECTION_NAME", {
    vector: {
      dimension: 1024,
      metric: "cosine",
      service: {
        provider: "nvidia",
        modelName: "NV-Embed-QA",
      },
    },
    lexical: {
      enabled: true,
      analyzer: {
        tokenizer: {
          name: "standard",
          args: {}
        },
        filters: [
          {
            name: "lowercase"
          },
          {
            name: "stop"
          },
          {
            name: "porterstem"
          },
          {
            name: "asciifolding"
          }
        ],
        charFilters: []
      },
    },
    rerank: {
      enabled: true,
      service: {
        provider: "nvidia",
        modelName: "nvidia/llama-3.2-nv-rerankqa-1b-v2",
      },
    },
  });
})();

The Java client supports multiple ways to create a collection:

  • You can define the collection parameters in a CollectionDefinition object and then create the collection from the CollectionDefinition object.

  • You can use a fluent interface to build the collection definition and then create the collection from the definition.

  • CollectionDefinition object

  • Fluent interface

package com.examples;

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.collections.definition.CollectionDefinition;
import com.datastax.astra.client.core.lexical.Analyzer;
import com.datastax.astra.client.core.lexical.LexicalOptions;
import com.datastax.astra.client.core.rerank.CollectionRerankOptions;
import com.datastax.astra.client.core.rerank.RerankServiceOptions;
import com.datastax.astra.client.core.vector.SimilarityMetric;
import com.datastax.astra.client.core.vector.VectorOptions;
import com.datastax.astra.client.core.vectorize.VectorServiceOptions;
import com.datastax.astra.client.databases.Database;

import static com.datastax.astra.client.core.lexical.AnalyzerTypes.STANDARD;

public class CreateCollection {

  public static void main(String[] args) {
    // Get a database
    Database database = new DataAPIClient("ASTRA_DB_APPLICATION_TOKEN")
      .getDatabase("ASTRA_DB_API_ENDPOINT");

    // Create a collection
    CollectionDefinition collectionDefinition = new CollectionDefinition();

    // Vector Options
    VectorServiceOptions vectorService = new VectorServiceOptions()
      .provider("nvidia")
      .modelName("NV-Embed-QA");
    VectorOptions vectorOptions = new VectorOptions()
      .dimension(1024)
      .metric(SimilarityMetric.COSINE.getValue())
      .service(vectorService);
    collectionDefinition.vector(vectorOptions);

    // Lexical Options
    Analyzer analyzer = new Analyzer()
      .tokenizer(STANDARD.getValue())
      .addFilter("lowercase")
      .addFilter("stop")
      .addFilter("porterstem")
      .addFilter("asciifolding");
    LexicalOptions lexicalOptions = new LexicalOptions()
      .enabled(true)
      .analyzer(analyzer);
    collectionDefinition.lexical(lexicalOptions);

    // Rerank Options
    RerankServiceOptions rerankService = new RerankServiceOptions()
      .modelName("nvidia/llama-3.2-nv-rerankqa-1b-v2")
      .provider("nvidia");
    CollectionRerankOptions rerankOptions = new CollectionRerankOptions()
      .enabled(true)
      .service(rerankService);
    collectionDefinition.rerank(rerankOptions);

    database.createCollection("COLLECTION_NAME", collectionDefinition);
  }
}
package com.examples;

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.collections.definition.CollectionDefinition;
import com.datastax.astra.client.core.lexical.Analyzer;
import com.datastax.astra.client.core.lexical.LexicalOptions;
import com.datastax.astra.client.core.vector.SimilarityMetric;
import com.datastax.astra.client.databases.Database;

import static com.datastax.astra.client.core.lexical.AnalyzerTypes.STANDARD;

public class CreateCollection {

  public static void main(String[] args) {
    // Get a database
    Database database = new DataAPIClient("ASTRA_DB_APPLICATION_TOKEN")
      .getDatabase("ASTRA_DB_API_ENDPOINT");

    database.createCollection(
        "COLLECTION_NAME",
        new CollectionDefinition()
            .vector(1024, SimilarityMetric.COSINE)
            .vectorize("nvidia", "NV-Embed-QA")
            .lexical(
                new LexicalOptions()
                    .enabled(true)
                    .analyzer(
                        new Analyzer()
                            .tokenizer(STANDARD.getValue())
                            .addFilter("lowercase")
                            .addFilter("stop")
                            .addFilter("porterstem")
                            .addFilter("asciifolding")))
            .rerank("nvidia", "nvidia/llama-3.2-nv-rerankqa-1b-v2"));
  }
}
curl -sS -L -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_KEYSPACE" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
  "createCollection": {
    "name": "COLLECTION_NAME",
    "options": {
      "lexical": {
        "analyzer": {
            "tokenizer": {
                "name": "standard",
                "args": {}
            },
            "filters": [
                {
                   "name": "lowercase"
                },
                {
                   "name": "stop"
                },
                {
                   "name": "porterstem"
                },
                {
                   "name": "asciifolding"
                }
            ],
            "charFilters": []
        },
        "enabled": true
      },
      "rerank": {
        "enabled": true,
        "service": {
          "modelName": "nvidia/llama-3.2-nv-rerankqa-1b-v2",
          "provider": "nvidia"
        }
      },
      "vector": {
        "dimension": 1024,
        "metric": "cosine",
        "service": {
          "provider": "nvidia",
          "modelName": "NV-Embed-QA"
        }
      }
    }
  }
}'

Create a collection and specify the default ID format

For more information about the default ID format, see Document IDs. For allowed values, see Parameters.
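
The sketch below shows one way to validate a default ID type on the client side before building the "defaultId" portion of the request body used in the curl example. It is illustrative only: the helper name default_id_options is hypothetical, and the tuple of allowed values is an assumption about what the Parameters section lists, so confirm it there before relying on it.

```python
# Assumed set of allowed values; verify against the Parameters section.
ALLOWED_DEFAULT_ID_TYPES = ("objectId", "uuid", "uuidv6", "uuidv7")


def default_id_options(id_type: str) -> dict:
    """Return the options fragment that sets the default ID type."""
    if id_type not in ALLOWED_DEFAULT_ID_TYPES:
        raise ValueError(
            f"Unsupported default ID type {id_type!r}; "
            f"expected one of {ALLOWED_DEFAULT_ID_TYPES}"
        )
    return {"defaultId": {"type": id_type}}


print(default_id_options("objectId"))

try:
    default_id_options("integer")
except ValueError as error:
    print(error)
```

Because a collection's definition can't be edited after creation, catching a misspelled type before the request is sent avoids having to drop and recreate the collection.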

  • Python

  • TypeScript

  • Java

  • curl

The Python client supports multiple ways to create a collection:

  • You can define the collection parameters in a CollectionDefinition object and then create the collection from the CollectionDefinition object.

  • You can use a fluent interface to build the collection definition and then create the collection from the definition.

  • CollectionDefinition object

  • Fluent interface

from astrapy import DataAPIClient
from astrapy.info import (
    CollectionDefinition,
    CollectionDefaultIDOptions,
)
from astrapy.constants import DefaultIdType

# Get an existing database
client = DataAPIClient()
database = client.get_database(
    "ASTRA_DB_API_ENDPOINT",
    token="ASTRA_DB_APPLICATION_TOKEN",
)

# Create a collection
collection_definition = CollectionDefinition(
    default_id=CollectionDefaultIDOptions(DefaultIdType.OBJECTID),
)
collection = database.create_collection(
    "COLLECTION_NAME",
    definition=collection_definition,
)
from astrapy import DataAPIClient
from astrapy.info import CollectionDefinition
from astrapy.constants import DefaultIdType

# Get an existing database
client = DataAPIClient()
database = client.get_database(
    "ASTRA_DB_API_ENDPOINT",
    token="ASTRA_DB_APPLICATION_TOKEN",
)

# Create a collection
collection_definition = (
    CollectionDefinition.builder()
    .set_default_id(DefaultIdType.OBJECTID)
    .build()
)
collection = database.create_collection(
    "COLLECTION_NAME",
    definition=collection_definition,
)
  • Typed collections

  • Untyped collections

You can manually define a client-side type for your collection to help statically catch errors.

The _id field type should match the defaultId type.

import { DataAPIClient, ObjectId } from "@datastax/astra-db-ts";

// Get a database
const client = new DataAPIClient("ASTRA_DB_APPLICATION_TOKEN");
const database = client.db("ASTRA_DB_API_ENDPOINT");

// Define the type for the collection
interface User {
  _id: ObjectId,
  name: string,
  age?: number,
}

(async function () {
  const collection = await database.createCollection<User>("COLLECTION_NAME", {
    defaultId: {
      type: "objectId",
    },
  });
})();

If you don’t pass a type parameter, the collection remains untyped. This is a more flexible but less type-safe option.

However, if you later specify _id when you insert a document, DataStax recommends that it has the same type as the defaultId.

Consider using a type like { _id: ObjectId } & SomeDoc, which allows the documents to remain untyped but still statically requires the _id field to have the correct type.

import { DataAPIClient } from "@datastax/astra-db-ts";

// Get a database
const client = new DataAPIClient("ASTRA_DB_APPLICATION_TOKEN");
const database = client.db("ASTRA_DB_API_ENDPOINT");

(async function () {
  const collection = await database.createCollection("COLLECTION_NAME", {
    defaultId: {
      type: "objectId",
    },
  });
})();
package com.examples;

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.databases.Database;
import com.datastax.astra.client.collections.Collection;
import com.datastax.astra.client.collections.definition.CollectionDefaultIdTypes;
import com.datastax.astra.client.collections.definition.CollectionDefinition;
import com.datastax.astra.client.collections.definition.documents.Document;

public class CreateCollection {

    public static void main(String[] args) {
        // Get a database
        Database database = new DataAPIClient("ASTRA_DB_APPLICATION_TOKEN")
            .getDatabase("ASTRA_DB_API_ENDPOINT");

        // Create a collection
        CollectionDefinition collectionDefinition = new CollectionDefinition()
            .defaultId(CollectionDefaultIdTypes.OBJECT_ID);

        Collection<Document> collection = database.createCollection("COLLECTION_NAME", collectionDefinition);
    }
}
curl -sS -L -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_KEYSPACE" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
  "createCollection": {
    "name": "COLLECTION_NAME",
    "options": {
      "defaultId": {
        "type": "uuidv7"
      }
    }
  }
}'

Create a collection and specify which fields to index

For more information about selective indexing, see Indexes in collections.
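
To make the effect of the indexing option concrete, here is a small local simulation of how an allow or deny list decides whether a field is indexed. It is a sketch under stated assumptions and does not call the Data API: the function name is_indexed is hypothetical, the treatment of nested paths (a listed field also covers its dotted subfields) is an assumption, and so is the default that every field is indexed when no indexing option is given.

```python
def is_indexed(field_path: str, indexing: dict) -> bool:
    """Simulate whether a field would be indexed under allow/deny rules."""

    def covered(paths) -> bool:
        # Assumption: a listed path covers itself and its dotted subfields.
        return any(
            field_path == path or field_path.startswith(path + ".")
            for path in paths
        )

    if "allow" in indexing:
        # Only the listed fields are indexed.
        return covered(indexing["allow"])
    if "deny" in indexing:
        # Everything except the listed fields is indexed.
        return not covered(indexing["deny"])
    # Assumption: with no indexing option, all fields are indexed.
    return True


allow_rules = {"allow": ["city", "country"]}
for field in ("city", "country", "name"):
    print(field, is_indexed(field, allow_rules))
```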

  • Python

  • TypeScript

  • Java

  • curl

The Python client supports multiple ways to create a collection:

  • You can define the collection parameters in a CollectionDefinition object and then create the collection from the CollectionDefinition object.

  • You can use a fluent interface to build the collection definition and then create the collection from the definition.

  • CollectionDefinition object

  • Fluent interface

from astrapy import DataAPIClient
from astrapy.info import CollectionDefinition

# Get an existing database
client = DataAPIClient()
database = client.get_database(
    "ASTRA_DB_API_ENDPOINT",
    token="ASTRA_DB_APPLICATION_TOKEN",
)

# Create a collection
collection_definition = CollectionDefinition(
    indexing={"allow": ["city", "country"]},
)
collection = database.create_collection(
    "COLLECTION_NAME",
    definition=collection_definition,
)
from astrapy import DataAPIClient
from astrapy.info import CollectionDefinition

# Get an existing database
client = DataAPIClient()
database = client.get_database(
    "ASTRA_DB_API_ENDPOINT",
    token="ASTRA_DB_APPLICATION_TOKEN",
)

# Create a collection
collection_definition = (
    CollectionDefinition.builder()
    .set_indexing("allow", ["city", "country"])
    .build()
)
collection = database.create_collection(
    "COLLECTION_NAME",
    definition=collection_definition,
)
import { DataAPIClient } from "@datastax/astra-db-ts";

// Get a database
const client = new DataAPIClient("ASTRA_DB_APPLICATION_TOKEN");
const database = client.db("ASTRA_DB_API_ENDPOINT");

(async function () {
  const collection = await database.createCollection("COLLECTION_NAME", {
    indexing: {
      allow: ["city", "country"],
    },
  });
})();
package com.examples;

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.databases.Database;
import com.datastax.astra.client.collections.Collection;
import com.datastax.astra.client.collections.definition.CollectionDefinition;
import com.datastax.astra.client.collections.definition.documents.Document;

public class CreateCollection {

    public static void main(String[] args) {
        // Get a database
        Database database = new DataAPIClient("ASTRA_DB_APPLICATION_TOKEN")
            .getDatabase("ASTRA_DB_API_ENDPOINT");

        // Create a collection
        CollectionDefinition collectionDefinition = new CollectionDefinition()
            .indexingAllow("city", "country");

        Collection<Document> collection = database.createCollection("COLLECTION_NAME", collectionDefinition);
    }
}
curl -sS -L -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_KEYSPACE" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
  "createCollection": {
    "name": "COLLECTION_NAME",
    "options": {
      "indexing": {
        "allow": ["city", "country"]
      }
    }
  }
}'

Create a collection and specify which fields shouldn’t be indexed

For more information about selective indexing, see Indexes in collections.
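
The following sketch builds the "indexing" fragment of the request body and rejects definitions that set both lists or neither. The helper name indexing_options is hypothetical, and the rule that allow and deny can't be combined reflects my understanding of the Data API rather than anything stated in this section, so treat it as an assumption.

```python
from typing import Optional, Sequence


def indexing_options(
    allow: Optional[Sequence[str]] = None,
    deny: Optional[Sequence[str]] = None,
) -> dict:
    """Return the options fragment for selective indexing."""
    # Assumption: exactly one of allow or deny may be specified.
    if (allow is None) == (deny is None):
        raise ValueError("Specify exactly one of allow or deny")
    if allow is not None:
        return {"indexing": {"allow": list(allow)}}
    return {"indexing": {"deny": list(deny)}}


print(indexing_options(deny=["city", "country"]))

try:
    indexing_options(allow=["city"], deny=["country"])
except ValueError as error:
    print(error)
```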

  • Python

  • TypeScript

  • Java

  • curl

The Python client supports multiple ways to create a collection:

  • You can define the collection parameters in a CollectionDefinition object and then create the collection from the CollectionDefinition object.

  • You can use a fluent interface to build the collection definition and then create the collection from the definition.

  • CollectionDefinition object

  • Fluent interface

from astrapy import DataAPIClient
from astrapy.info import CollectionDefinition

# Get an existing database
client = DataAPIClient()
database = client.get_database(
    "ASTRA_DB_API_ENDPOINT",
    token="ASTRA_DB_APPLICATION_TOKEN",
)

# Create a collection
collection_definition = CollectionDefinition(
    indexing={"deny": ["city", "country"]},
)
collection = database.create_collection(
    "COLLECTION_NAME",
    definition=collection_definition,
)
from astrapy import DataAPIClient
from astrapy.info import CollectionDefinition

# Get an existing database
client = DataAPIClient()
database = client.get_database(
    "ASTRA_DB_API_ENDPOINT",
    token="ASTRA_DB_APPLICATION_TOKEN",
)

# Create a collection
collection_definition = (
    CollectionDefinition.builder()
    .set_indexing("deny", ["city", "country"])
    .build()
)
collection = database.create_collection(
    "COLLECTION_NAME",
    definition=collection_definition,
)
import { DataAPIClient } from "@datastax/astra-db-ts";

// Get a database
const client = new DataAPIClient("ASTRA_DB_APPLICATION_TOKEN");
const database = client.db("ASTRA_DB_API_ENDPOINT");

(async function () {
  const collection = await database.createCollection("COLLECTION_NAME", {
    indexing: {
      deny: ["city", "country"],
    },
  });
})();
package com.examples;

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.databases.Database;
import com.datastax.astra.client.collections.Collection;
import com.datastax.astra.client.collections.definition.CollectionDefinition;
import com.datastax.astra.client.collections.definition.documents.Document;

public class CreateCollection {

    public static void main(String[] args) {
        // Get a database
        Database database = new DataAPIClient("ASTRA_DB_APPLICATION_TOKEN")
            .getDatabase("ASTRA_DB_API_ENDPOINT");

        // Create a collection
        CollectionDefinition collectionDefinition = new CollectionDefinition()
            .indexingDeny("city", "country");

        Collection<Document> collection = database.createCollection("COLLECTION_NAME", collectionDefinition);
    }
}
curl -sS -L -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_KEYSPACE" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
  "createCollection": {
    "name": "COLLECTION_NAME",
    "options": {
      "indexing": {
        "deny": ["city", "country"]
      }
    }
  }
}'

Client reference

  • Python

  • TypeScript

  • Java

  • curl

For more information, see the client reference.

Client reference documentation is not applicable for HTTP.
