Create a table

Creates a new table in a keyspace in a database.

After you create a table, index the columns that you want to sort or filter on. Indexing optimizes your queries and avoids resource-intensive, long-running allow filtering operations. All indexed column names must use snake case, not camel case.
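Because indexed column names must use snake case, it can help to check your planned column names before you create the table. The following is a minimal sketch, not part of any client library: `non_snake_case_columns` is a hypothetical helper, and it assumes that "snake case" means lowercase ASCII letters and digits separated by single underscores.

```python
import re

# Assumption: snake case = lowercase letters and digits, with words
# separated by single underscores, starting with a letter.
_SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


def non_snake_case_columns(column_names):
    """Return the column names that do not look like snake case."""
    return [name for name in column_names if not _SNAKE_CASE.fullmatch(name)]


# Hypothetical column names: two follow snake case, two use camel case.
flagged = non_snake_case_columns(
    ["title", "number_of_pages", "isCheckedOut", "dueDate"]
)
print(flagged)
```

Any names that the helper returns would need to be renamed before you index those columns.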

You can also modify the table columns later. To add data to your table, insert rows.

Ready to write code? See the examples for this method to get started. If you are new to the Data API, check out the quickstart.

Result

Python

Creates a table with the specified parameters.

Returns a Table object. You can use this object to work with rows in the table.

Unless you specify the row_type parameter, the table is typed as Table[dict]. For more information, see Typing support.

TypeScript

Creates a table with the specified parameters.

Returns a promise that resolves to a Table<Schema, PKey> object. You can use this object to work with rows in the table.

Unless you specify the Schema, the table is typed as Table<Record<string, any>>.

Java

Creates a table with the specified parameters.

Returns a Table<T> object. You can use this object to work with rows in the table.

Unless you specify the rowClass parameter, the table is typed as Table<Row>.

curl

Creates a table with the specified parameters.

If the command succeeds, the response indicates success.

Example successful response:

{
  "status": {
    "ok": 1
  }
}
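If you call the HTTP API from a script, you might want to check the response body programmatically. The following is a minimal sketch that only looks for the `status.ok` value shown in the example response above. The `create_table_succeeded` function is a hypothetical helper, and the sketch treats any other body as a failure without assuming a particular error format.

```python
import json


def create_table_succeeded(response_body: str) -> bool:
    """Return True if the JSON body reports status.ok == 1."""
    payload = json.loads(response_body)
    status = payload.get("status") if isinstance(payload, dict) else None
    return isinstance(status, dict) and status.get("ok") == 1


# The body from the example successful response
print(create_table_succeeded('{"status": {"ok": 1}}'))
```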

Parameters

Python

Use the create_table method, which belongs to the astrapy.Database class.

Method signature

create_table(
  name: str,
  *,
  definition: CreateTableDefinition | dict[str, Any],
  row_type: type[Any],
  keyspace: str,
  if_not_exists: bool,
  table_admin_timeout_ms: int,
  request_timeout_ms: int,
  timeout_ms: int,
  embedding_api_key: str | EmbeddingHeadersProvider,
  spawn_api_options: APIOptions,
) -> Table[ROW]

Name	Type	Summary
`name`	`str`	The name of the table. Table names must follow these rules: they can contain letters, numbers, and underscores; they cannot exceed 48 characters; and they must be unique within the keyspace.
`definition`	`CreateTableDefinition` \| `dict`	The full schema for the table, including column names, column data types, and the primary key. See the examples for usage. All column names used in the schema must be unique within the table. Any columns that will be indexed must use snake case, not camel case, for their names.
`row_type`	`type`	Optional. A formal specifier for the type checker. If provided, `row_type` must match the type hint specified in the assignment. For more information, see Typing support. Default: `Table[dict]`
`keyspace`	`str`	Optional. The keyspace in which to create the table. For an example, see Create a table and specify the keyspace. Default: The working keyspace for the database.
`if_not_exists`	`bool`	Optional. Whether the command should silently succeed even if a table with the given name already exists in the keyspace and no new table was created. This option only checks table names. It does not check table schemas. Default: false
`embedding_api_key`	`str` \| `EmbeddingHeadersProvider`	Optional. This only applies to tables that have a vector column with a vectorize embedding provider integration. Use this option to provide the embedding provider API key directly with headers instead of using the API key in the Astra DB KMS. The API key is sent to the Data API for every operation on the table. It is useful when a vectorize integration is configured but no credentials are stored, or when you want to override the stored credentials. For more information, see Auto-generate embeddings with vectorize. Most vectorize integrations accept a plain string for header authentication. However, some vectorize integrations and models require specialized subclasses of `EmbeddingHeadersProvider`, such as `AWSEmbeddingHeadersProvider`, for header authentication. You can use this authentication method only if all affected columns use the same embedding provider.
`spawn_api_options`	`APIOptions`	Optional. A complete or partial specification of the APIOptions to override the defaults inherited from the `Database`. Use this to customize the interaction of the Python client with the table. For example, you can change the serialization/deserialization options or default timeouts. If `APIOptions` is passed together with a named parameter such as a timeout, the latter takes precedence over the corresponding `spawn_api_options` setting.
`table_admin_timeout_ms`	`int`	Optional. A timeout, in milliseconds, for the underlying HTTP request. If not provided, the `Database` setting is used. This parameter is aliased as `request_timeout_ms` and `timeout_ms` for convenience.
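The table name rules can be checked on the client side before you call create_table. The following is a minimal sketch, assuming that "letters" means ASCII letters. The `is_valid_table_name` function and its `existing_names` argument are hypothetical and are not part of astrapy.

```python
import re

# Letters, numbers, and underscores only, and no more than 48 characters.
# Assumption: "letters" means ASCII letters.
_TABLE_NAME = re.compile(r"^[A-Za-z0-9_]{1,48}$")


def is_valid_table_name(name: str, existing_names=()) -> bool:
    """Check a candidate table name against the documented naming rules."""
    follows_pattern = bool(_TABLE_NAME.fullmatch(name))
    # Names must be unique within the keyspace. Pass the names that are
    # already in use, if you have them.
    return follows_pattern and name not in existing_names


print(is_valid_table_name("example_table"))
print(is_valid_table_name("example-table"))
```

The server remains the source of truth for these rules. A check like this only helps catch obvious mistakes earlier.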

TypeScript

Use the createTable method, which belongs to the Db class.

Method signature

async createTable<const Def extends CreateTableDefinition>(
  name: string,
  options: {
    definition: CreateTableDefinition,
    ifNotExists?: boolean,
    embeddingApiKey?: string | EmbeddingHeadersProvider,
    logging?: DataAPILoggingConfig,
    serdes?: TableSerDesConfig,
    timeoutDefaults?: Partial<TimeoutDescriptor>,
    keyspace?: string,
  }
): Promise<Table<InferTableSchema<Def>, InferTablePrimaryKey<Def>>>

Parameters:

Name	Type	Summary
`name`	`string`	The name of the table. Table names must follow these rules: they can contain letters, numbers, and underscores; they cannot exceed 48 characters; and they must be unique within the keyspace.
`options`	`CreateTableOptions`	The options for this operation. See Properties of `options` for more details.

Properties of `options`
Name	Type	Summary
`definition`	`CreateTableDefinition`	The full schema for the table, including column names, column data types, and the primary key. See the examples for usage. All column names used in the schema must be unique within the table. Any columns that will be indexed must use snake case, not camel case, for their names.
`ifNotExists`	`boolean`	Optional. Whether the command should silently succeed even if a table with the given name already exists in the keyspace and no new table was created. This option only checks table names. It does not check table schemas. Default: false
`keyspace`	`string`	Optional. The keyspace in which to create the table. For an example, see Create a table and specify the keyspace. Default: The working keyspace for the database.
`embeddingApiKey`	`string` \| `EmbeddingHeadersProvider`	Optional. This only applies to tables that have a vector column with a vectorize embedding provider integration. Use this option to provide the embedding provider API key directly with headers instead of using the API key in the Astra DB KMS. The API key is sent to the Data API for every operation on the table. It is useful when a vectorize integration is configured but no credentials are stored, or when you want to override the stored credentials. For more information, see Auto-generate embeddings with vectorize. You can use this authentication method only if all affected columns use the same embedding provider.
`logging`	`DataAPILoggingConfig`	Optional. The configuration for logging events emitted by the `DataAPIClient`. For more information, see Logging.
`timeoutDefaults`	`Partial<TimeoutDescriptor>`	Optional. The default timeout options for any operation performed on this `Table` instance. For more information, see TimeoutDescriptor.
`serdes`	`TableSerDesConfig`	Optional. Lower-level serialization/deserialization configuration for this table. For more information, see Custom Ser/Des.

Java

Use the createTable method, which belongs to the com.datastax.astra.client.databases.Database class.

Method signature

<T> Table<T> createTable(
  String tableName,
  TableDefinition tableDefinition,
  Class<T> rowClass,
  CreateTableOptions createTableOptions
)

<T> Table<T> createTable(
  String tableName,
  TableDefinition tableDefinition,
  Class<T> rowClass
)

<T> Table<T> createTable(Class<T> rowClass)

<T> Table<T> createTable(
  Class<T> rowClass,
  CreateTableOptions createTableOptions
)

<T> Table<T> createTable(
  String tableName,
  Class<T> rowClass,
  CreateTableOptions createTableOptions
)

Table<Row> createTable(
  String tableName,
  TableDefinition tableDefinition,
  CreateTableOptions options
)

Table<Row> createTable(
  String tableName,
  TableDefinition tableDefinition
)

Name	Type	Summary
`name`	`String`	The name of the table. Table names must follow these rules: they can contain letters, numbers, and underscores; they cannot exceed 48 characters; and they must be unique within the keyspace.
`definition`	`TableDefinition`	The full schema for the table, including column names, column data types, and the primary key. See the examples for usage. All column names used in the schema must be unique within the table. Any columns that will be indexed must use snake case, not camel case, for their names.
`rowClass`	`Class<?>`	Optional. A specification of the class of the table’s row object. Default: `Row`, which is close to a `Map` object
`createTableOptions`	`CreateTableOptions`	Optional. The options for this operation. See Methods of `CreateTableOptions` for more details.

Methods of `CreateTableOptions`
Method	Parameters	Summary
`ifNotExists()`	`boolean`	Optional. Whether the command should silently succeed even if a table with the given name already exists in the keyspace and no new table was created. This option only checks table names. It does not check table schemas. Default: false
`embeddingAuthProvider()`	`EmbeddingHeadersProvider`	Optional. This only applies to tables that have a vector column with a vectorize embedding provider integration. Use this option to provide the embedding provider API key directly with headers instead of using the API key in the Astra DB KMS. The API key is sent to the Data API for every operation on the table. It is useful when a vectorize integration is configured but no credentials are stored, or when you want to override the stored credentials. For more information, see Auto-generate embeddings with vectorize. Most vectorize integrations accept a plain string for header authentication. However, some vectorize integrations and models require specialized subclasses of `EmbeddingHeadersProvider` for header authentication. You can use this authentication method only if all affected columns use the same embedding provider.
`timeout()`	`long` \| `Duration`	Optional. A timeout, in milliseconds, for the underlying HTTP request. If not provided, the `Database` setting is used.

curl

Use the createTable command.

Command signature

curl -sS -L -X POST "API_ENDPOINT/api/json/v1/KEYSPACE_NAME" \
--header "Token: APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
  "createTable": {
    "name": "TABLE_NAME",
    "definition": {
      "columns": {
        "COLUMN_NAME": "DATA_TYPE",
        "COLUMN_NAME": "DATA_TYPE"
      },
      "primaryKey": "PRIMARY_KEY_DEFINITION"
    }
  }
}'

Name	Type	Summary
`name`	`string`	The name of the table. Table names must follow these rules: they can contain letters, numbers, and underscores; they cannot exceed 48 characters; and they must be unique within the keyspace.
`definition`	`object`	The full schema for the table, including column names, column data types, and the primary key. See the examples for usage. See Properties of `definition` for more details.

Properties of `definition`
Name	Type	Summary
`columns`	`object`	The column names and data types. All column names must be unique within the table. Any columns that will be indexed must use snake case, not camel case, for their names. See the examples for usage.
`primaryKey`	`string` \| `object`	The primary key for the table. See the examples for usage.
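If you prefer to assemble the command body in code rather than by hand, the structure above maps directly onto nested dictionaries. The following is a minimal sketch in Python that uses only the standard library. The `build_create_table_command` function is a hypothetical helper, and the column names are illustrative.

```python
import json


def build_create_table_command(name, columns, primary_key):
    """Assemble the createTable command body as a plain dict."""
    return {
        "createTable": {
            "name": name,
            "definition": {
                "columns": columns,
                # A column name (string) or a primary key definition (object)
                "primaryKey": primary_key,
            },
        }
    }


command = build_create_table_command(
    "example_table",
    {"title": {"type": "text"}, "rating": {"type": "float"}},
    "title",
)

# Serialize the command to use it as the request body
body = json.dumps(command, indent=2)
print(body)
```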

Examples

The following examples demonstrate how to create a table.

Create a table with a single-column primary key

A single-column primary key is a primary key consisting of one column. For more information, see Primary keys in tables.

Python

The Python client supports multiple ways to create a table. In all cases, you must define the table schema, and then pass the definition to the create_table method.

The following examples use untyped rows, but you can define a client-side type for your table to help statically catch errors. For examples, see Typing support.

CreateTableDefinition object

You can define the table as a CreateTableDefinition and then create the table from the CreateTableDefinition object.

from astrapy import DataAPIClient
from astrapy.info import (
    CreateTableDefinition,
    ColumnType,
    TableKeyValuedColumnType,
    TableKeyValuedColumnTypeDescriptor,
    TableScalarColumnTypeDescriptor,
    TableValuedColumnTypeDescriptor,
    TableValuedColumnType,
    TablePrimaryKeyDescriptor,
)

# Get an existing database
client = DataAPIClient()
database = client.get_database("API_ENDPOINT", token="APPLICATION_TOKEN")

table_definition = CreateTableDefinition(
    # Define all of the columns in the table
    columns={
        "title": TableScalarColumnTypeDescriptor(column_type=ColumnType.TEXT),
        "number_of_pages": TableScalarColumnTypeDescriptor(column_type=ColumnType.INT),
        "rating": TableScalarColumnTypeDescriptor(column_type=ColumnType.FLOAT),
        "genres": TableValuedColumnTypeDescriptor(
            column_type=TableValuedColumnType.SET,
            value_type=ColumnType.TEXT,
        ),
        "metadata": TableKeyValuedColumnTypeDescriptor(
            column_type=TableKeyValuedColumnType.MAP,
            key_type=ColumnType.TEXT,
            value_type=ColumnType.TEXT,
        ),
        "is_checked_out": TableScalarColumnTypeDescriptor(
            column_type=ColumnType.BOOLEAN
        ),
        "due_date": TableScalarColumnTypeDescriptor(column_type=ColumnType.DATE),
    },
    # Define the primary key for the table.
    # In this case, the table uses a single-column primary key.
    primary_key=TablePrimaryKeyDescriptor(partition_by=["title"], partition_sort={}),
)

table = database.create_table(
    "example_table",
    definition=table_definition,
)

Fluent interface

You can use a fluent interface to build the table definition and then create the table from the definition.

from astrapy import DataAPIClient
from astrapy.info import CreateTableDefinition, ColumnType

# Get an existing database
client = DataAPIClient()
database = client.get_database("API_ENDPOINT", token="APPLICATION_TOKEN")

table_definition = (
    CreateTableDefinition.builder()
    # Define all of the columns in the table
    .add_column("title", ColumnType.TEXT)
    .add_column("number_of_pages", ColumnType.INT)
    .add_column("rating", ColumnType.FLOAT)
    .add_set_column(
        "genres",
        ColumnType.TEXT,
    )
    .add_map_column(
        "metadata",
        # This is the key type for the map column
        ColumnType.TEXT,
        # This is the value type for the map column
        ColumnType.TEXT,
    )
    .add_column("is_checked_out", ColumnType.BOOLEAN)
    .add_column("due_date", ColumnType.DATE)
    # Define the primary key for the table.
    # In this case, the table uses a single-column primary key.
    .add_partition_by(["title"])
    # Finally, build the table definition.
    .build()
)

table = database.create_table(
    "example_table",
    definition=table_definition,
)

Dictionary

You can define the table as a dictionary and then create the table from the dictionary.

from astrapy import DataAPIClient

# Get an existing database
client = DataAPIClient()
database = client.get_database("API_ENDPOINT", token="APPLICATION_TOKEN")

# Define the columns and primary key for the table
table_definition = {
    "columns": {
        "title": {"type": "text"},
        "number_of_pages": {"type": "int"},
        "rating": {"type": "float"},
        "genres": {"type": "set", "valueType": "text"},
        "metadata": {"type": "map", "keyType": "text", "valueType": "text"},
        "is_checked_out": {"type": "boolean"},
        "due_date": {"type": "date"},
    },
    "primaryKey": {
        "partitionBy": ["title"],
        "partitionSort": {},
    },
}

table = database.create_table(
    "example_table",
    definition=table_definition,
)
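When you write the definition as a plain dictionary, no client-side class checks it for you, so a typo in a primary key column name is easy to miss. The following is a minimal sketch of a sanity check that you could run before calling create_table. The `undeclared_key_columns` function is a hypothetical helper and is not part of astrapy. It assumes that every primary key column must also appear under "columns".

```python
def undeclared_key_columns(table_definition: dict) -> list:
    """Return primary key columns that are missing from "columns"."""
    columns = table_definition.get("columns", {})
    primary_key = table_definition.get("primaryKey", {})

    if isinstance(primary_key, str):
        # Shorthand form: the primary key is a single column name
        key_columns = [primary_key]
    else:
        # Object form: partitionBy is a list, partitionSort is a mapping
        key_columns = list(primary_key.get("partitionBy", []))
        key_columns += list(primary_key.get("partitionSort", {}))

    return [name for name in key_columns if name not in columns]


definition = {
    "columns": {
        "title": {"type": "text"},
        "rating": {"type": "float"},
    },
    "primaryKey": {
        "partitionBy": ["title", "ratings"],  # "ratings" is a deliberate typo
        "partitionSort": {},
    },
}

print(undeclared_key_columns(definition))
```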

TypeScript

The TypeScript client supports multiple ways to create a table. The method you choose depends on your typing preferences and whether you modified the ser/des configuration.

For more information, see Collection and table typing.

Automatic type inference

The TypeScript client can automatically infer the TypeScript-equivalent type of the table’s schema and primary key.

To do this, first create the table definition. Then, use InferTableSchema and InferTablePrimaryKey to infer the type of the table and of the primary key. To create the table, provide the table definition and the inferred types to the createTable method.

import {
  DataAPIClient,
  InferTablePrimaryKey,
  InferTableSchema,
  Table,
} from "@datastax/astra-db-ts";

// Get an existing database
const client = new DataAPIClient();
const database = client.db("API_ENDPOINT", {
  token: "APPLICATION_TOKEN",
});

const tableDefinition = Table.schema({
  // Define all of the columns in the table
  columns: {
    title: "text",
    number_of_pages: "int",
    rating: "float",
    genres: { type: "set", valueType: "text" },
    metadata: {
      type: "map",
      keyType: "text",
      valueType: "text",
    },
    is_checked_out: "boolean",
    due_date: "date",
  },
  // Define the primary key for the table.
  // In this case, the table uses a single-column primary key.
  primaryKey: {
    partitionBy: ["title"],
  },
});

// Infer the TypeScript-equivalent type of the table's schema and primary key
type TableSchema = InferTableSchema<typeof tableDefinition>;
type TablePrimaryKey = InferTablePrimaryKey<typeof tableDefinition>;

(async function () {
  // Provide the types and the definition
  const table = await database.createTable<TableSchema, TablePrimaryKey>(
    "example_table",
    { definition: tableDefinition },
  );
})();

You can use the TableSchema type as you would any other type. For example, this gives a type error since the TableSchema type from the previous example does not include bad_field:

  const row: TableSchema = {
    title: "Wind with No Name",
    number_of_pages: 193,
    bad_field: "I will error",
  };

Manually typed tables

You can manually define the type for your table’s schema and primary key. To create the table, provide the table definition and the types to the createTable method.

This may be necessary if you modify the table’s default ser/des configuration.

import { DataAPIClient, DataAPIDate, Table } from "@datastax/astra-db-ts";

// Get an existing database
const client = new DataAPIClient();
const database = client.db("API_ENDPOINT", {
  token: "APPLICATION_TOKEN",
});

const tableDefinition = Table.schema({
  // Define all of the columns in the table
  columns: {
    title: "text",
    number_of_pages: "int",
    rating: "float",
    genres: { type: "set", valueType: "text" },
    metadata: {
      type: "map",
      keyType: "text",
      valueType: "text",
    },
    is_checked_out: "boolean",
    due_date: "date",
  },
  // Define the primary key for the table.
  // In this case, the table uses a single-column primary key.
  primaryKey: {
    partitionBy: ["title"],
  },
});

// Manually define the type of the table's schema and primary key
type TableSchema = {
  title: string;
  number_of_pages?: number | null | undefined;
  rating?: number | null | undefined;
  genres?: Set<string> | undefined;
  metadata?: Map<string, string> | undefined;
  is_checked_out?: boolean | null | undefined;
  due_date?: DataAPIDate | null | undefined;
};

type TablePrimaryKey = Pick<TableSchema, "title">;

(async function () {
  // Provide the types and the definition to create the table
  const table = await database.createTable<TableSchema, TablePrimaryKey>(
    "example_table",
    { definition: tableDefinition },
  );
})();

You can use the TableSchema type as you would any other type. For example, this gives a type error since the TableSchema type from the previous example does not include bad_field:

  const row: TableSchema = {
    title: "Wind with No Name",
    number_of_pages: 193,
    bad_field: "I will error",
  };

Untyped tables

To create a table without any typing, pass SomeRow as the single generic type parameter to the createTable method. This types the table’s rows as Record<string, any>.

This is the most flexible but least type-safe option.

import { DataAPIClient, SomeRow, Table } from "@datastax/astra-db-ts";

// Get an existing database
const client = new DataAPIClient();
const database = client.db("API_ENDPOINT", {
  token: "APPLICATION_TOKEN",
});

const tableDefinition = Table.schema({
  // Define all of the columns in the table
  columns: {
    title: "text",
    number_of_pages: "int",
    rating: "float",
    genres: { type: "set", valueType: "text" },
    metadata: {
      type: "map",
      keyType: "text",
      valueType: "text",
    },
    is_checked_out: "boolean",
    due_date: "date",
  },
  // Define the primary key for the table.
  // In this case, the table uses a single-column primary key.
  primaryKey: {
    partitionBy: ["title"],
  },
});

(async function () {
  // Provide SomeRow as the type and the definition to create the table
  const table = await database.createTable<SomeRow>("example_table", {
    definition: tableDefinition,
  });
})();

Java

The Java client supports multiple ways to create a table. In all cases, you must define the table schema.

Use a generic type

If you don’t specify the Class parameter when you create an instance of the generic class Table, the client defaults to Table<Row> as the type. In this case, the working object type T is Row.

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.databases.Database;
import com.datastax.astra.client.tables.Table;
import com.datastax.astra.client.tables.definition.TableDefinition;
import com.datastax.astra.client.tables.definition.columns.TableColumnTypes;
import com.datastax.astra.client.tables.definition.rows.Row;

public class Example {
  public static void main(String[] args) {
    // Get an existing database
    Database database = new DataAPIClient("APPLICATION_TOKEN").getDatabase("API_ENDPOINT");

    TableDefinition tableDefinition =
        new TableDefinition()
            // Define all of the columns in the table
            .addColumnText("title")
            .addColumnInt("number_of_pages")
            .addColumn("rating", TableColumnTypes.FLOAT)
            .addColumnSet("genres", TableColumnTypes.TEXT)
            .addColumnMap("metadata", TableColumnTypes.TEXT, TableColumnTypes.TEXT)
            .addColumnBoolean("is_checked_out")
            .addColumn("due_date", TableColumnTypes.DATE)
            // Define the primary key for the table.
            // In this case, the table uses a single-column primary key.
            .addPartitionBy("title");

    Table<Row> table = database.createTable("example_table", tableDefinition);
  }
}

Define the row type

Instead of using the default Row type, you can define your own working object, which is serialized as a Row.

This working object can be annotated when the field names do not exactly match the column names or when you want to fully describe your table to enable its creation solely from the entity definition.

The following example defines a Book class and then uses it to create the table.

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.databases.Database;
import com.datastax.astra.client.tables.Table;
import com.datastax.astra.client.tables.definition.columns.TableColumnTypes;
import com.datastax.astra.client.tables.mapping.Column;
import com.datastax.astra.client.tables.mapping.EntityTable;
import com.datastax.astra.client.tables.mapping.PartitionBy;
import java.util.Date;
import java.util.Map;
import java.util.Set;
import lombok.Data;

public class Example {
  @EntityTable("example_table")
  @Data
  public static class Book {
    @PartitionBy(0)
    @Column(name = "title", type = TableColumnTypes.TEXT)
    private String title;

    @Column(name = "number_of_pages", type = TableColumnTypes.INT)
    private Integer number_of_pages;

    @Column(name = "rating", type = TableColumnTypes.FLOAT)
    private Float rating;

    @Column(name = "genres", type = TableColumnTypes.SET, valueType = TableColumnTypes.TEXT)
    private Set<String> genres;

    @Column(
        name = "metadata",
        type = TableColumnTypes.MAP,
        keyType = TableColumnTypes.TEXT,
        valueType = TableColumnTypes.TEXT)
    private Map<String, String> metadata;

    @Column(name = "is_checked_out", type = TableColumnTypes.BOOLEAN)
    private Boolean is_checked_out;

    @Column(name = "due_date", type = TableColumnTypes.DATE)
    private Date due_date;
  }

  public static void main(String[] args) {
    // Get an existing database
    Database database = new DataAPIClient("APPLICATION_TOKEN").getDatabase("API_ENDPOINT");

    Table<Book> table = database.createTable(Book.class);
  }
}

curl

curl -sS -L -X POST "API_ENDPOINT/api/json/v1/KEYSPACE_NAME" \
  --header "Token: APPLICATION_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "createTable": {
    "name": "example_table",
    "definition": {
      "columns": {
        "title": {
          "type": "text"
        },
        "number_of_pages": {
          "type": "int"
        },
        "rating": {
          "type": "float"
        },
        "metadata": {
          "type": "map",
          "keyType": "text",
          "valueType": "text"
        },
        "genres": {
          "type": "set",
          "valueType": "text"
        },
        "is_checked_out": {
          "type": "boolean"
        },
        "due_date": {
          "type": "date"
        }
      },
      "primaryKey": "title"
    }
  }
}'
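The same request can be prepared from Python with only the standard library, which may be convenient if you want to script the HTTP call without installing a client. The following is a minimal sketch that builds the request but does not send it. The `build_create_table_request` function is a hypothetical helper, and the endpoint, keyspace, and token values are placeholders.

```python
import json
import urllib.request


def build_create_table_request(api_endpoint, keyspace, token, command):
    """Build (but do not send) the POST request for a createTable command."""
    return urllib.request.Request(
        f"{api_endpoint}/api/json/v1/{keyspace}",
        data=json.dumps(command).encode("utf-8"),
        headers={"Token": token, "Content-Type": "application/json"},
        method="POST",
    )


request = build_create_table_request(
    "https://example.invalid",  # placeholder for API_ENDPOINT
    "default_keyspace",  # placeholder for KEYSPACE_NAME
    "APPLICATION_TOKEN",
    {
        "createTable": {
            "name": "example_table",
            "definition": {
                "columns": {"title": {"type": "text"}},
                "primaryKey": "title",
            },
        }
    },
)

print(request.get_method(), request.full_url)
```

To send the request, you could pass it to `urllib.request.urlopen` and read the JSON response body.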

Create a table with a composite primary key

A composite primary key is a primary key consisting of multiple columns. For more information, see Primary keys in tables.

Python

The Python client supports multiple ways to create a table. In all cases, you must define the table schema, and then pass the definition to the create_table method.

The following examples use untyped rows, but you can define a client-side type for your table to help statically catch errors. For examples, see Typing support.

CreateTableDefinition object

You can define the table as a CreateTableDefinition and then create the table from the CreateTableDefinition object.

from astrapy import DataAPIClient
from astrapy.info import (
    CreateTableDefinition,
    ColumnType,
    TableKeyValuedColumnType,
    TableKeyValuedColumnTypeDescriptor,
    TableScalarColumnTypeDescriptor,
    TableValuedColumnTypeDescriptor,
    TableValuedColumnType,
    TablePrimaryKeyDescriptor,
)

# Get an existing database
client = DataAPIClient()
database = client.get_database("API_ENDPOINT", token="APPLICATION_TOKEN")

table_definition = CreateTableDefinition(
    # Define all of the columns in the table
    columns={
        "title": TableScalarColumnTypeDescriptor(column_type=ColumnType.TEXT),
        "number_of_pages": TableScalarColumnTypeDescriptor(column_type=ColumnType.INT),
        "rating": TableScalarColumnTypeDescriptor(column_type=ColumnType.FLOAT),
        "genres": TableValuedColumnTypeDescriptor(
            column_type=TableValuedColumnType.SET,
            value_type=ColumnType.TEXT,
        ),
        "metadata": TableKeyValuedColumnTypeDescriptor(
            column_type=TableKeyValuedColumnType.MAP,
            key_type=ColumnType.TEXT,
            value_type=ColumnType.TEXT,
        ),
        "is_checked_out": TableScalarColumnTypeDescriptor(
            column_type=ColumnType.BOOLEAN
        ),
        "due_date": TableScalarColumnTypeDescriptor(column_type=ColumnType.DATE),
    },
    # Define the primary key for the table.
    # In this case, the table uses a composite primary key.
    primary_key=TablePrimaryKeyDescriptor(
        partition_by=["title", "rating"], partition_sort={}
    ),
)

table = database.create_table(
    "example_table",
    definition=table_definition,
)

You can use a fluent interface to build the table definition and then create the table from the definition.

from astrapy import DataAPIClient
from astrapy.info import CreateTableDefinition, ColumnType

# Get an existing database
client = DataAPIClient()
database = client.get_database("API_ENDPOINT", token="APPLICATION_TOKEN")

table_definition = (
    CreateTableDefinition.builder()
    # Define all of the columns in the table
    .add_column("title", ColumnType.TEXT)
    .add_column("number_of_pages", ColumnType.INT)
    .add_column("rating", ColumnType.FLOAT)
    .add_set_column(
        "genres",
        ColumnType.TEXT,
    )
    .add_map_column(
        "metadata",
        # This is the key type for the map column
        ColumnType.TEXT,
        # This is the value type for the map column
        ColumnType.TEXT,
    )
    .add_column("is_checked_out", ColumnType.BOOLEAN)
    .add_column("due_date", ColumnType.DATE)
    # Define the primary key for the table.
    # In this case, the table uses a composite primary key.
    .add_partition_by(["title", "rating"])
    # Finally, build the table definition.
    .build()
)

table = database.create_table(
    "example_table",
    definition=table_definition,
)

You can define the table as a dictionary and then build the table from the dictionary.

from astrapy import DataAPIClient

# Get an existing database
client = DataAPIClient()
database = client.get_database("API_ENDPOINT", token="APPLICATION_TOKEN")

# Define the columns and primary key for the table
table_definition = {
    "columns": {
        "title": {"type": "text"},
        "number_of_pages": {"type": "int"},
        "rating": {"type": "float"},
        "genres": {"type": "set", "valueType": "text"},
        "metadata": {"type": "map", "keyType": "text", "valueType": "text"},
        "is_checked_out": {"type": "boolean"},
        "due_date": {"type": "date"},
    },
    "primaryKey": {
        "partitionBy": ["title", "rating"],
        "partitionSort": {},
    },
}

table = database.create_table(
    "example_table",
    definition=table_definition,
)

The TypeScript client supports multiple ways to create a table. The method you choose depends on your typing preferences and whether you modified the serialization/deserialization (ser/des) configuration.

For more information, see Collection and table typing.

Automatic type inference
Manually typed tables
Untyped tables

The TypeScript client can automatically infer the TypeScript-equivalent type of the table’s schema and primary key.

import {
  DataAPIClient,
  InferTablePrimaryKey,
  InferTableSchema,
  Table,
} from "@datastax/astra-db-ts";

// Get an existing database
const client = new DataAPIClient();
const database = client.db("API_ENDPOINT", {
  token: "APPLICATION_TOKEN",
});

const tableDefinition = Table.schema({
  // Define all of the columns in the table
  columns: {
    title: "text",
    number_of_pages: "int",
    rating: "float",
    genres: { type: "set", valueType: "text" },
    metadata: {
      type: "map",
      keyType: "text",
      valueType: "text",
    },
    is_checked_out: "boolean",
    due_date: "date",
  },
  // Define the primary key for the table.
  // In this case, the table uses a composite primary key.
  primaryKey: {
    partitionBy: ["title", "rating"],
  },
});

// Infer the TypeScript-equivalent type of the table's schema and primary key
type TableSchema = InferTableSchema<typeof tableDefinition>;
type TablePrimaryKey = InferTablePrimaryKey<typeof tableDefinition>;

(async function () {
  // Provide the types and the definition
  const table = await database.createTable<TableSchema, TablePrimaryKey>(
    "example_table",
    { definition: tableDefinition },
  );
})();

You can use the TableSchema type as you would any other type. For example, this gives a type error since the TableSchema type from the previous example does not include bad_field:

  const row: TableSchema = {
    title: "Wind with No Name",
    number_of_pages: 193,
    bad_field: "I will error",
  };

You can manually define the type for your table’s schema and primary key. To create the table, provide the table definition and the types to the createTable method.

This may be necessary if you modify the table’s default ser/des configuration.

import { DataAPIClient, DataAPIDate, Table } from "@datastax/astra-db-ts";

// Get an existing database
const client = new DataAPIClient();
const database = client.db("API_ENDPOINT", {
  token: "APPLICATION_TOKEN",
});

const tableDefinition = Table.schema({
  // Define all of the columns in the table
  columns: {
    title: "text",
    number_of_pages: "int",
    rating: "float",
    genres: { type: "set", valueType: "text" },
    metadata: {
      type: "map",
      keyType: "text",
      valueType: "text",
    },
    is_checked_out: "boolean",
    due_date: "date",
  },
  // Define the primary key for the table.
  // In this case, the table uses a composite primary key.
  primaryKey: {
    partitionBy: ["title", "rating"],
  },
});

// Manually define the type of the table's schema and primary key
type TableSchema = {
  title: string;
  number_of_pages?: number | null | undefined;
  rating: number; // Part of the primary key, so it is required
  genres?: Set<string> | undefined;
  metadata?: Map<string, string> | undefined;
  is_checked_out?: boolean | null | undefined;
  due_date?: DataAPIDate | null | undefined;
};

type TablePrimaryKey = Pick<TableSchema, "title" | "rating">;

(async function () {
  // Provide the types and the definition to create the table
  const table = await database.createTable<TableSchema, TablePrimaryKey>(
    "example_table",
    { definition: tableDefinition },
  );
})();

You can use the TableSchema type as you would any other type. For example, this gives a type error since the TableSchema type from the previous example does not include bad_field:

  const row: TableSchema = {
    title: "Wind with No Name",
    number_of_pages: 193,
    bad_field: "I will error",
  };

To create a table without any typing, pass SomeRow as the single generic type parameter to the createTable method. This types the table’s rows as Record<string, any>.

This is the most flexible but least type-safe option.

import { DataAPIClient, SomeRow, Table } from "@datastax/astra-db-ts";

// Get an existing database
const client = new DataAPIClient();
const database = client.db("API_ENDPOINT", {
  token: "APPLICATION_TOKEN",
});

const tableDefinition = Table.schema({
  // Define all of the columns in the table
  columns: {
    title: "text",
    number_of_pages: "int",
    rating: "float",
    genres: { type: "set", valueType: "text" },
    metadata: {
      type: "map",
      keyType: "text",
      valueType: "text",
    },
    is_checked_out: "boolean",
    due_date: "date",
  },
  // Define the primary key for the table.
  // In this case, the table uses a composite primary key.
  primaryKey: {
    partitionBy: ["title", "rating"],
  },
});

(async function () {
  // Pass SomeRow as the row type, and provide the definition to create the table
  const table = await database.createTable<SomeRow>("example_table", {
    definition: tableDefinition,
  });
})();

The Java client supports multiple ways to create a table. In all cases, you must define the table schema.

Use a generic type
Define the row type

If you don’t specify a row class when you create the table, the client types the table as Table<Row>. In this case, the working object type T is Row.

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.databases.Database;
import com.datastax.astra.client.tables.Table;
import com.datastax.astra.client.tables.definition.TableDefinition;
import com.datastax.astra.client.tables.definition.columns.TableColumnTypes;
import com.datastax.astra.client.tables.definition.rows.Row;

public class Example {
  public static void main(String[] args) {
    // Get an existing database
    Database database = new DataAPIClient("APPLICATION_TOKEN").getDatabase("API_ENDPOINT");

    TableDefinition tableDefinition =
        new TableDefinition()
            // Define all of the columns in the table
            .addColumnText("title")
            .addColumnInt("number_of_pages")
            .addColumn("rating", TableColumnTypes.FLOAT)
            .addColumnSet("genres", TableColumnTypes.TEXT)
            .addColumnMap("metadata", TableColumnTypes.TEXT, TableColumnTypes.TEXT)
            .addColumnBoolean("is_checked_out")
            .addColumn("due_date", TableColumnTypes.DATE)
            // Define the primary key for the table.
            // In this case, the table uses a composite primary key.
            .addPartitionBy("title")
            .addPartitionBy("rating");

    Table<Row> table = database.createTable("example_table", tableDefinition);
  }
}

Instead of using the default Row type, you can define your own working object class, which the client serializes as a Row.

This working object can be annotated when the field names do not exactly match the column names or when you want to fully describe your table to enable its creation solely from the entity definition.

The following example defines a Book class and then uses it to create the table.

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.databases.Database;
import com.datastax.astra.client.tables.Table;
import com.datastax.astra.client.tables.definition.columns.TableColumnTypes;
import com.datastax.astra.client.tables.mapping.Column;
import com.datastax.astra.client.tables.mapping.EntityTable;
import com.datastax.astra.client.tables.mapping.PartitionBy;
import java.util.Date;
import java.util.Map;
import java.util.Set;
import lombok.Data;

public class Example {
  @EntityTable("example_table")
  @Data
  // Declared static so that it can be instantiated without an enclosing Example instance
  public static class Book {
    @PartitionBy(0)
    @Column(name = "title", type = TableColumnTypes.TEXT)
    private String title;

    @Column(name = "number_of_pages", type = TableColumnTypes.INT)
    private Integer number_of_pages;

    @PartitionBy(1)
    @Column(name = "rating", type = TableColumnTypes.FLOAT)
    private Float rating;

    @Column(name = "genres", type = TableColumnTypes.SET, valueType = TableColumnTypes.TEXT)
    private Set<String> genres;

    @Column(
        name = "metadata",
        type = TableColumnTypes.MAP,
        keyType = TableColumnTypes.TEXT,
        valueType = TableColumnTypes.TEXT)
    private Map<String, String> metadata;

    @Column(name = "is_checked_out", type = TableColumnTypes.BOOLEAN)
    private Boolean is_checked_out;

    @Column(name = "due_date", type = TableColumnTypes.DATE)
    private Date due_date;
  }

  public static void main(String[] args) {
    // Get an existing database
    Database database = new DataAPIClient("APPLICATION_TOKEN").getDatabase("API_ENDPOINT");

    Table<Book> table = database.createTable(Book.class);
  }
}

curl -sS -L -X POST "API_ENDPOINT/api/json/v1/KEYSPACE_NAME" \
  --header "Token: APPLICATION_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "createTable": {
    "name": "example_table",
    "definition": {
      "columns": {
        "title": {
          "type": "text"
        },
        "number_of_pages": {
          "type": "int"
        },
        "rating": {
          "type": "float"
        },
        "metadata": {
          "type": "map",
          "keyType": "text",
          "valueType": "text"
        },
        "genres": {
          "type": "set",
          "valueType": "text"
        },
        "is_checked_out": {
          "type": "boolean"
        },
        "due_date": {
          "type": "date"
        }
      },
      "primaryKey": {
        "partitionBy": [
          "title",
          "rating"
        ]
      }
    }
  }
}'

Create a table with a compound primary key

A compound primary key is a primary key consisting of partition (grouping) columns and clustering (sorting) columns. For more information, see Primary keys in tables.
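
As a rough illustration of "grouping" and "sorting", the following plain-Python sketch groups rows by the partition columns and then orders each group by the clustering columns. It is a conceptual model only, not how the database stores data and not part of any client library. The function `group_and_sort` and the sample rows are hypothetical. The sort directions follow the convention used in the examples in this section, where 1 means ascending and -1 means descending.

```python
from collections import defaultdict

# Partition (grouping) columns and clustering (sorting) columns,
# mirroring partitionBy and partitionSort in the examples in this section.
PARTITION_BY = ["title", "rating"]
PARTITION_SORT = {"number_of_pages": 1, "is_checked_out": -1}  # 1 = ascending, -1 = descending


def group_and_sort(rows):
    """Group rows by partition key, then order each group by the clustering columns."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[tuple(row[column] for column in PARTITION_BY)].append(row)
    for group in partitions.values():
        # Apply the sort columns from last to first. Python's sort is stable,
        # so the first clustering column ends up as the most significant one.
        for column, direction in reversed(list(PARTITION_SORT.items())):
            group.sort(key=lambda r: r[column], reverse=(direction == -1))
    return dict(partitions)


rows = [
    {"title": "A", "rating": 4.0, "number_of_pages": 300, "is_checked_out": False},
    {"title": "A", "rating": 4.0, "number_of_pages": 100, "is_checked_out": False},
    {"title": "A", "rating": 4.0, "number_of_pages": 100, "is_checked_out": True},
    {"title": "B", "rating": 2.5, "number_of_pages": 50, "is_checked_out": False},
]

partitions = group_and_sort(rows)
for key, group in partitions.items():
    print(key, [(r["number_of_pages"], r["is_checked_out"]) for r in group])
```

In this model, rows that share the same `title` and `rating` land in the same group, and within that group they are ordered first by `number_of_pages` (ascending) and then by `is_checked_out` (descending).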

Python
TypeScript
Java
curl

The Python client supports multiple ways to create a table. In all cases, you must define the table schema, and then pass the definition to the create_table method.

The following example uses untyped rows, but you can define a client-side type for your table to help statically catch errors. For examples, see Typing support.

CreateTableDefinition object
Fluent interface
Dictionary

You can define the table as a CreateTableDefinition and then build the table from the CreateTableDefinition object.

from astrapy import DataAPIClient
from astrapy.constants import SortMode
from astrapy.info import (
    CreateTableDefinition,
    ColumnType,
    TableKeyValuedColumnType,
    TableKeyValuedColumnTypeDescriptor,
    TableScalarColumnTypeDescriptor,
    TableValuedColumnTypeDescriptor,
    TableValuedColumnType,
    TablePrimaryKeyDescriptor,
)

# Get an existing database
client = DataAPIClient()
database = client.get_database("API_ENDPOINT", token="APPLICATION_TOKEN")

table_definition = CreateTableDefinition(
    # Define all of the columns in the table
    columns={
        "title": TableScalarColumnTypeDescriptor(column_type=ColumnType.TEXT),
        "number_of_pages": TableScalarColumnTypeDescriptor(column_type=ColumnType.INT),
        "rating": TableScalarColumnTypeDescriptor(column_type=ColumnType.FLOAT),
        "genres": TableValuedColumnTypeDescriptor(
            column_type=TableValuedColumnType.SET,
            value_type=ColumnType.TEXT,
        ),
        "metadata": TableKeyValuedColumnTypeDescriptor(
            column_type=TableKeyValuedColumnType.MAP,
            key_type=ColumnType.TEXT,
            value_type=ColumnType.TEXT,
        ),
        "is_checked_out": TableScalarColumnTypeDescriptor(
            column_type=ColumnType.BOOLEAN
        ),
        "due_date": TableScalarColumnTypeDescriptor(column_type=ColumnType.DATE),
    },
    # Define the primary key for the table.
    # In this case, the table uses a compound primary key.
    primary_key=TablePrimaryKeyDescriptor(
        partition_by=["title", "rating"],
        partition_sort={
            "number_of_pages": SortMode.ASCENDING,
            "is_checked_out": SortMode.DESCENDING,
        },
    ),
)

table = database.create_table(
    "example_table",
    definition=table_definition,
)

You can use a fluent interface to build the table definition and then create the table from the definition.

from astrapy import DataAPIClient
from astrapy.constants import SortMode
from astrapy.info import CreateTableDefinition, ColumnType

# Get an existing database
client = DataAPIClient()
database = client.get_database("API_ENDPOINT", token="APPLICATION_TOKEN")

table_definition = (
    CreateTableDefinition.builder()
    # Define all of the columns in the table
    .add_column("title", ColumnType.TEXT)
    .add_column("number_of_pages", ColumnType.INT)
    .add_column("rating", ColumnType.FLOAT)
    .add_set_column(
        "genres",
        ColumnType.TEXT,
    )
    .add_map_column(
        "metadata",
        # This is the key type for the map column
        ColumnType.TEXT,
        # This is the value type for the map column
        ColumnType.TEXT,
    )
    .add_column("is_checked_out", ColumnType.BOOLEAN)
    .add_column("due_date", ColumnType.DATE)
    # Define the primary key for the table.
    # In this case, the table uses a compound primary key.
    .add_partition_by(["title", "rating"])
    .add_partition_sort(
        {
            "number_of_pages": SortMode.ASCENDING,
            "is_checked_out": SortMode.DESCENDING,
        }
    )
    # Finally, build the table definition.
    .build()
)

table = database.create_table(
    "example_table",
    definition=table_definition,
)

You can define the table as a dictionary and then build the table from the dictionary.

from astrapy import DataAPIClient

# Get an existing database
client = DataAPIClient()
database = client.get_database("API_ENDPOINT", token="APPLICATION_TOKEN")

# Define the columns and primary key for the table
table_definition = {
    "columns": {
        "title": {"type": "text"},
        "number_of_pages": {"type": "int"},
        "rating": {"type": "float"},
        "genres": {"type": "set", "valueType": "text"},
        "metadata": {"type": "map", "keyType": "text", "valueType": "text"},
        "is_checked_out": {"type": "boolean"},
        "due_date": {"type": "date"},
    },
    "primaryKey": {
        "partitionBy": ["title", "rating"],
        # 1 sorts in ascending order, and -1 sorts in descending order
        "partitionSort": {"number_of_pages": 1, "is_checked_out": -1},
    },
}

table = database.create_table(
    "example_table",
    definition=table_definition,
)

The TypeScript client supports multiple ways to create a table. The method you choose depends on your typing preferences and whether you modified the serialization/deserialization (ser/des) configuration.

For more information, see Collection and table typing.

Automatic type inference
Manually typed tables
Untyped tables

The TypeScript client can automatically infer the TypeScript-equivalent type of the table’s schema and primary key.

import {
  DataAPIClient,
  InferTablePrimaryKey,
  InferTableSchema,
  Table,
} from "@datastax/astra-db-ts";

// Get an existing database
const client = new DataAPIClient();
const database = client.db("API_ENDPOINT", {
  token: "APPLICATION_TOKEN",
});

const tableDefinition = Table.schema({
  // Define all of the columns in the table
  columns: {
    title: "text",
    number_of_pages: "int",
    rating: "float",
    genres: { type: "set", valueType: "text" },
    metadata: {
      type: "map",
      keyType: "text",
      valueType: "text",
    },
    is_checked_out: "boolean",
    due_date: "date",
  },
  // Define the primary key for the table.
  // In this case, the table uses a compound primary key.
  primaryKey: {
    partitionBy: ["title", "rating"],
    // 1 sorts in ascending order, and -1 sorts in descending order
    partitionSort: { number_of_pages: 1, is_checked_out: -1 },
  },
});

// Infer the TypeScript-equivalent type of the table's schema and primary key
type TableSchema = InferTableSchema<typeof tableDefinition>;
type TablePrimaryKey = InferTablePrimaryKey<typeof tableDefinition>;

(async function () {
  // Provide the types and the definition
  const table = await database.createTable<TableSchema, TablePrimaryKey>(
    "example_table",
    { definition: tableDefinition },
  );
})();

You can use the TableSchema type as you would any other type. For example, this gives a type error since the TableSchema type from the previous example does not include bad_field:

  const row: TableSchema = {
    title: "Wind with No Name",
    number_of_pages: 193,
    bad_field: "I will error",
  };

You can manually define the type for your table’s schema and primary key. To create the table, provide the table definition and the types to the createTable method.

This may be necessary if you modify the table’s default ser/des configuration.

import { DataAPIClient, DataAPIDate, Table } from "@datastax/astra-db-ts";

// Get an existing database
const client = new DataAPIClient();
const database = client.db("API_ENDPOINT", {
  token: "APPLICATION_TOKEN",
});

const tableDefinition = Table.schema({
  // Define all of the columns in the table
  columns: {
    title: "text",
    number_of_pages: "int",
    rating: "float",
    genres: { type: "set", valueType: "text" },
    metadata: {
      type: "map",
      keyType: "text",
      valueType: "text",
    },
    is_checked_out: "boolean",
    due_date: "date",
  },
  // Define the primary key for the table.
  // In this case, the table uses a compound primary key.
  primaryKey: {
    partitionBy: ["title", "rating"],
    // 1 sorts in ascending order, and -1 sorts in descending order
    partitionSort: { number_of_pages: 1, is_checked_out: -1 },
  },
});

// Manually define the type of the table's schema and primary key
type TableSchema = {
  // Partition and clustering columns are part of the primary key, so they are required
  title: string;
  number_of_pages: number;
  rating: number;
  genres?: Set<string> | undefined;
  metadata?: Map<string, string> | undefined;
  is_checked_out: boolean;
  due_date?: DataAPIDate | null | undefined;
};

type TablePrimaryKey = Pick<
  TableSchema,
  "title" | "rating" | "number_of_pages" | "is_checked_out"
>;

(async function () {
  // Provide the types and the definition to create the table
  const table = await database.createTable<TableSchema, TablePrimaryKey>(
    "example_table",
    { definition: tableDefinition },
  );
})();

You can use the TableSchema type as you would any other type. For example, this gives a type error since the TableSchema type from the previous example does not include bad_field:

  const row: TableSchema = {
    title: "Wind with No Name",
    number_of_pages: 193,
    bad_field: "I will error",
  };

To create a table without any typing, pass SomeRow as the single generic type parameter to the createTable method. This types the table’s rows as Record<string, any>.

This is the most flexible but least type-safe option.

import { DataAPIClient, SomeRow, Table } from "@datastax/astra-db-ts";

// Get an existing database
const client = new DataAPIClient();
const database = client.db("API_ENDPOINT", {
  token: "APPLICATION_TOKEN",
});

const tableDefinition = Table.schema({
  // Define all of the columns in the table
  columns: {
    title: "text",
    number_of_pages: "int",
    rating: "float",
    genres: { type: "set", valueType: "text" },
    metadata: {
      type: "map",
      keyType: "text",
      valueType: "text",
    },
    is_checked_out: "boolean",
    due_date: "date",
  },
  // Define the primary key for the table.
  // In this case, the table uses a compound primary key.
  primaryKey: {
    partitionBy: ["title", "rating"],
    // 1 sorts in ascending order, and -1 sorts in descending order
    partitionSort: { number_of_pages: 1, is_checked_out: -1 },
  },
});

(async function () {
  // Pass SomeRow as the row type, and provide the definition to create the table
  const table = await database.createTable<SomeRow>("example_table", {
    definition: tableDefinition,
  });
})();

The Java client supports multiple ways to create a table. In all cases, you must define the table schema.

Use a generic type
Define the row type

If you don’t specify a row class when you create the table, the client types the table as Table<Row>. In this case, the working object type T is Row.

import static com.datastax.astra.client.core.query.Sort.ascending;
import static com.datastax.astra.client.core.query.Sort.descending;

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.databases.Database;
import com.datastax.astra.client.tables.Table;
import com.datastax.astra.client.tables.definition.TableDefinition;
import com.datastax.astra.client.tables.definition.columns.TableColumnTypes;
import com.datastax.astra.client.tables.definition.rows.Row;

public class Example {
  public static void main(String[] args) {
    // Get an existing database
    Database database = new DataAPIClient("APPLICATION_TOKEN").getDatabase("API_ENDPOINT");

    TableDefinition tableDefinition =
        new TableDefinition()
            // Define all of the columns in the table
            .addColumnText("title")
            .addColumnInt("number_of_pages")
            .addColumn("rating", TableColumnTypes.FLOAT)
            .addColumnSet("genres", TableColumnTypes.TEXT)
            .addColumnMap("metadata", TableColumnTypes.TEXT, TableColumnTypes.TEXT)
            .addColumnBoolean("is_checked_out")
            .addColumn("due_date", TableColumnTypes.DATE)
            // Define the primary key for the table.
            // In this case, the table uses a compound primary key.
            .addPartitionBy("title")
            .addPartitionBy("rating")
            .addPartitionSort(ascending("number_of_pages"))
            .addPartitionSort(descending("is_checked_out"));

    Table<Row> table = database.createTable("example_table", tableDefinition);
  }
}

Instead of using the default Row type, you can define your own working object class, which the client serializes as a Row.

This working object can be annotated when the field names do not exactly match the column names or when you want to fully describe your table to enable its creation solely from the entity definition.

The following example defines a Book class and then uses it to create the table.

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.core.query.SortOrder;
import com.datastax.astra.client.databases.Database;
import com.datastax.astra.client.tables.Table;
import com.datastax.astra.client.tables.definition.columns.TableColumnTypes;
import com.datastax.astra.client.tables.mapping.Column;
import com.datastax.astra.client.tables.mapping.EntityTable;
import com.datastax.astra.client.tables.mapping.PartitionBy;
import com.datastax.astra.client.tables.mapping.PartitionSort;
import java.util.Date;
import java.util.Map;
import java.util.Set;
import lombok.Data;

public class Example {
  @EntityTable("example_table")
  @Data
  // Declared static so that it can be instantiated without an enclosing Example instance
  public static class Book {
    @PartitionBy(0)
    @Column(name = "title", type = TableColumnTypes.TEXT)
    private String title;

    @PartitionSort(position = 0, order = SortOrder.ASCENDING)
    @Column(name = "number_of_pages", type = TableColumnTypes.INT)
    private Integer number_of_pages;

    @PartitionBy(1)
    @Column(name = "rating", type = TableColumnTypes.FLOAT)
    private Float rating;

    @Column(name = "genres", type = TableColumnTypes.SET, valueType = TableColumnTypes.TEXT)
    private Set<String> genres;

    @Column(
        name = "metadata",
        type = TableColumnTypes.MAP,
        keyType = TableColumnTypes.TEXT,
        valueType = TableColumnTypes.TEXT)
    private Map<String, String> metadata;

    @PartitionSort(position = 1, order = SortOrder.DESCENDING)
    @Column(name = "is_checked_out", type = TableColumnTypes.BOOLEAN)
    private Boolean is_checked_out;

    @Column(name = "due_date", type = TableColumnTypes.DATE)
    private Date due_date;
  }

  public static void main(String[] args) {
    // Get an existing database
    Database database = new DataAPIClient("APPLICATION_TOKEN").getDatabase("API_ENDPOINT");

    Table<Book> table = database.createTable(Book.class);
  }
}

curl -sS -L -X POST "API_ENDPOINT/api/json/v1/KEYSPACE_NAME" \
  --header "Token: APPLICATION_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "createTable": {
    "name": "example_table",
    "definition": {
      "columns": {
        "title": {
          "type": "text"
        },
        "number_of_pages": {
          "type": "int"
        },
        "rating": {
          "type": "float"
        },
        "metadata": {
          "type": "map",
          "keyType": "text",
          "valueType": "text"
        },
        "genres": {
          "type": "set",
          "valueType": "text"
        },
        "is_checked_out": {
          "type": "boolean"
        },
        "due_date": {
          "type": "date"
        }
      },
      "primaryKey": {
        "partitionBy": [
          "title",
          "rating"
        ],
        "partitionSort": {
          "number_of_pages": 1,
          "is_checked_out": -1
        }
      }
    }
  }
}'

Create a table with a column to store vector embeddings

If you want to store pre-generated vector embeddings in a table, create a table with a vector column. A table can include more than one vector column.
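
Because a vector column has a fixed dimension, it can help to check pre-generated embeddings on the client before you insert them. The following plain-Python sketch shows one way to do that. The helper `validate_embedding` is hypothetical and not part of the client library, and the dimension of 1024 is taken from the examples in this section.

```python
# Assumption: the vector column was created with dimension 1024,
# as in the examples in this section.
VECTOR_DIMENSION = 1024


def validate_embedding(embedding, dimension=VECTOR_DIMENSION):
    """Return the embedding as a list of floats, or raise if its length is wrong."""
    values = [float(value) for value in embedding]
    if len(values) != dimension:
        raise ValueError(f"Expected {dimension} values, got {len(values)}")
    return values


# Build a row for the example table only after the embedding passes the check.
row = {
    "example_non_vector": "sample text",
    "example_vector": validate_embedding([0.1] * 1024),
}

print(len(row["example_vector"]))
```

A check like this surfaces a dimension mismatch before the request is sent, which can make the failure easier to trace back to the code that generated the embedding.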

Python
TypeScript
Java
curl

The Python client supports multiple ways to create a table. In all cases, you must define the table schema, and then pass the definition to the create_table method.

The following example uses untyped rows, but you can define a client-side type for your table to help statically catch errors. For examples, see Typing support.

CreateTableDefinition object
Fluent interface
Dictionary

You can define the table as a CreateTableDefinition and then build the table from the CreateTableDefinition object.

from astrapy import DataAPIClient
from astrapy.info import (
    CreateTableDefinition,
    ColumnType,
    TableScalarColumnTypeDescriptor,
    TablePrimaryKeyDescriptor,
    TableVectorColumnTypeDescriptor,
)

# Get an existing database
client = DataAPIClient()
database = client.get_database("API_ENDPOINT", token="APPLICATION_TOKEN")

table_definition = CreateTableDefinition(
    # Define all of the columns in the table
    columns={
        "example_vector": TableVectorColumnTypeDescriptor(
            dimension=1024,
        ),
        "example_non_vector": TableScalarColumnTypeDescriptor(
            column_type=ColumnType.TEXT
        ),
    },
    # Define the primary key for the table.
    # In this case, the table uses a single-column primary key.
    primary_key=TablePrimaryKeyDescriptor(
        partition_by=["example_non_vector"], partition_sort={}
    ),
)

table = database.create_table(
    "example_table",
    definition=table_definition,
)

You can use a fluent interface to build the table definition and then create the table from the definition.

from astrapy import DataAPIClient
from astrapy.info import CreateTableDefinition, ColumnType

# Get an existing database
client = DataAPIClient()
database = client.get_database("API_ENDPOINT", token="APPLICATION_TOKEN")

table_definition = (
    CreateTableDefinition.builder()
    # Define all of the columns in the table
    .add_vector_column("example_vector", dimension=1024)
    .add_column("example_non_vector", ColumnType.TEXT)
    # Define the primary key for the table.
    # In this case, the table uses a single-column primary key.
    .add_partition_by(["example_non_vector"])
    # Finally, build the table definition.
    .build()
)

table = database.create_table(
    "example_table",
    definition=table_definition,
)

You can define the table as a dictionary and then build the table from the dictionary.

from astrapy import DataAPIClient

# Get an existing database
client = DataAPIClient()
database = client.get_database("API_ENDPOINT", token="APPLICATION_TOKEN")

# Define the columns and primary key for the table
table_definition = {
    "columns": {
        "example_vector": {"type": "vector", "dimension": 1024},
        "example_non_vector": {"type": "text"},
    },
    "primaryKey": {
        "partitionBy": ["example_non_vector"],
        "partitionSort": {},
    },
}

table = database.create_table(
    "example_table",
    definition=table_definition,
)

The TypeScript client supports multiple ways to create a table. The method you choose depends on your typing preferences and whether you modified the serialization/deserialization (ser/des) configuration.

For more information, see Collection and table typing.

Automatic type inference
Manually typed tables
Untyped tables

The TypeScript client can automatically infer the TypeScript-equivalent type of the table’s schema and primary key.

import {
  DataAPIClient,
  InferTablePrimaryKey,
  InferTableSchema,
  Table,
} from "@datastax/astra-db-ts";

// Get an existing database
const client = new DataAPIClient();
const database = client.db("API_ENDPOINT", {
  token: "APPLICATION_TOKEN",
});

const tableDefinition = Table.schema({
  // Define all of the columns in the table
  columns: {
    example_vector: { type: "vector", dimension: 1024 },
    example_non_vector: "text",
  },
  // Define the primary key for the table.
  // In this case, the table uses a single-column primary key.
  primaryKey: {
    partitionBy: ["example_non_vector"],
  },
});

// Infer the TypeScript-equivalent type of the table's schema and primary key
type TableSchema = InferTableSchema<typeof tableDefinition>;
type TablePrimaryKey = InferTablePrimaryKey<typeof tableDefinition>;

(async function () {
  // Provide the types and the definition
  const table = await database.createTable<TableSchema, TablePrimaryKey>(
    "example_table",
    { definition: tableDefinition },
  );
})();

You can manually define the type for your table’s schema and primary key. To create the table, provide the table definition and the types to the createTable method.

This may be necessary if you modify the table’s default ser/des configuration.

import { DataAPIClient, DataAPIVector, Table } from "@datastax/astra-db-ts";

// Get an existing database
const client = new DataAPIClient();
const database = client.db("API_ENDPOINT", {
  token: "APPLICATION_TOKEN",
});

const tableDefinition = Table.schema({
  // Define all of the columns in the table
  columns: {
    example_vector: { type: "vector", dimension: 1024 },
    example_non_vector: "text",
  },
  // Define the primary key for the table.
  // In this case, the table uses a single-column primary key.
  primaryKey: {
    partitionBy: ["example_non_vector"],
  },
});

// Manually define the type of the table's schema and primary key
type TableSchema = {
  example_vector: DataAPIVector;
  example_non_vector: string;
};

type TablePrimaryKey = Pick<TableSchema, "example_non_vector">;

(async function () {
  // Provide the types and the definition to create the table
  const table = await database.createTable<TableSchema, TablePrimaryKey>(
    "example_table",
    { definition: tableDefinition },
  );
})();

To create a table without any typing, pass SomeRow as the single generic type parameter to the createTable method. This types the table’s rows as Record<string, any>.

This is the most flexible but least type-safe option.

import { DataAPIClient, SomeRow, Table } from "@datastax/astra-db-ts";

// Get an existing database
const client = new DataAPIClient();
const database = client.db("API_ENDPOINT", {
  token: "APPLICATION_TOKEN",
});

const tableDefinition = Table.schema({
  // Define all of the columns in the table
  columns: {
    example_vector: { type: "vector", dimension: 1024 },
    example_non_vector: "text",
  },
  // Define the primary key for the table.
  // In this case, the table uses a single-column primary key.
  primaryKey: {
    partitionBy: ["example_non_vector"],
  },
});

(async function () {
  // Pass SomeRow and the definition to create an untyped table
  const table = await database.createTable<SomeRow>("example_table", {
    definition: tableDefinition,
  });
})();

The Java client supports multiple ways to create a table. In all cases, you must define the table schema.

Use a generic type
Define the row type

If you don’t specify the Class parameter when creating an instance of the generic class Table, the client defaults to Table<Row> as the type. In this case, the working object type T is Row.

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.core.vector.SimilarityMetric;
import com.datastax.astra.client.databases.Database;
import com.datastax.astra.client.tables.Table;
import com.datastax.astra.client.tables.definition.TableDefinition;
import com.datastax.astra.client.tables.definition.columns.TableColumnDefinitionVector;
import com.datastax.astra.client.tables.definition.rows.Row;

public class Example {
  public static void main(String[] args) {
    // Get an existing database
    Database database = new DataAPIClient("APPLICATION_TOKEN").getDatabase("API_ENDPOINT");

    TableDefinition tableDefinition =
        new TableDefinition()
            // Define all of the columns in the table
            .addColumnVector(
                "example_vector",
                new TableColumnDefinitionVector().dimension(1024).metric(SimilarityMetric.COSINE))
            .addColumnText("example_non_vector")
            // Define the primary key for the table.
            // In this case, the table uses a single-column primary key.
            .addPartitionBy("example_non_vector");

    Table<Row> table = database.createTable("example_table", tableDefinition);
  }
}

Instead of using the default type Row.class, you can define your own working object, which will be serialized as a Row.

This working object can be annotated when the field names do not exactly match the column names or when you want to fully describe your table to enable its creation solely from the entity definition.

The following example defines a Book class and then uses it to create the table.

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.core.vector.DataAPIVector;
import com.datastax.astra.client.core.vector.SimilarityMetric;
import com.datastax.astra.client.databases.Database;
import com.datastax.astra.client.tables.Table;
import com.datastax.astra.client.tables.definition.columns.TableColumnTypes;
import com.datastax.astra.client.tables.mapping.Column;
import com.datastax.astra.client.tables.mapping.ColumnVector;
import com.datastax.astra.client.tables.mapping.EntityTable;
import com.datastax.astra.client.tables.mapping.PartitionBy;
import lombok.Data;

public class Example {
  @EntityTable("example_table")
  @Data
  public static class Book {
    @ColumnVector(name = "example_vector", dimension = 1024, metric = SimilarityMetric.COSINE)
    private DataAPIVector vector;

    @PartitionBy(0)
    @Column(name = "example_non_vector", type = TableColumnTypes.TEXT)
    private String exampleNonVector;
  }

  public static void main(String[] args) {
    // Get an existing database
    Database database = new DataAPIClient("APPLICATION_TOKEN").getDatabase("API_ENDPOINT");

    Table<Book> table = database.createTable(Book.class);
  }
}

curl -sS -L -X POST "API_ENDPOINT/api/json/v1/KEYSPACE_NAME" \
  --header "Token: APPLICATION_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "createTable": {
    "name": "example_table",
    "definition": {
      "columns": {
        "example_vector": {
          "type": "vector",
          "dimension": 1024
        },
        "example_non_vector": {
          "type": "text"
        }
      },
      "primaryKey": "example_non_vector"
    }
  }
}'
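If you script the HTTP request instead of using curl, the following sketch assembles the same createTable request with Python's standard library. It only builds the request object and does not send it. The helper name build_create_table_request and the values https://example-endpoint and default_keyspace are placeholders assumed for illustration.

```python
import json
import urllib.request


def build_create_table_request(api_endpoint, keyspace, token, name, definition):
    """Build (but do not send) a createTable request for the Data API."""
    payload = {"createTable": {"name": name, "definition": definition}}
    return urllib.request.Request(
        url=f"{api_endpoint}/api/json/v1/{keyspace}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Token": token, "Content-Type": "application/json"},
        method="POST",
    )


request = build_create_table_request(
    "https://example-endpoint",  # placeholder for API_ENDPOINT
    "default_keyspace",  # placeholder for the keyspace name
    "APPLICATION_TOKEN",
    "example_table",
    {
        "columns": {
            "example_vector": {"type": "vector", "dimension": 1024},
            "example_non_vector": {"type": "text"},
        },
        "primaryKey": "example_non_vector",
    },
)

print(request.get_method(), request.full_url)
```

To send the request, pass it to urllib.request.urlopen. A successful response contains a status object with "ok": 1.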

Create a table with a column to automatically generate vector embeddings

If you want to automatically generate vector embeddings, create a table with a vector column and configure an embedding provider integration for the column.

The configuration depends on the embedding provider.

You can also configure an embedding provider integration after table creation. For more information, see Alter a table.

If you want to store the original text in addition to the vector embeddings that were generated from the text, then you need to create a separate column to store the text.

You can configure a different embedding provider for each vector column in the table. If you want to use the same embedding provider for all vector columns in the table, you must still configure the embedding provider for each vector column.
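Because the embedding provider is configured per column, a table with several vector columns repeats the same service block. As a minimal sketch in the dictionary form, a small helper can create an independent copy of the configuration for each column. The helper name vector_column, the column names, and the dimension are hypothetical. The camelCase modelName key is an assumption based on the Data API's JSON payload convention, and the uppercase values are placeholders.

```python
import copy

# Shared embedding provider settings (placeholder values).
shared_service = {
    "provider": "azureOpenAI",
    "modelName": "MODEL_NAME",
    "authentication": {"providerKey": "API_KEY_NAME"},
    "parameters": {
        "resourceName": "RESOURCE_NAME",
        "deploymentId": "DEPLOYMENT_ID",
    },
}


def vector_column(dimension, service):
    """Return a vector column definition with its own copy of the service."""
    return {
        "type": "vector",
        "dimension": dimension,
        # Copy the settings so that later edits to one column
        # do not leak into the others.
        "service": copy.deepcopy(service),
    }


table_definition = {
    "columns": {
        # Hypothetical column names for illustration.
        "title_vector": vector_column(1024, shared_service),
        "summary_vector": vector_column(1024, shared_service),
        "title": {"type": "text"},
    },
    "primaryKey": {"partitionBy": ["title"], "partitionSort": {}},
}
```

Each vector column still carries a complete service block, as the Data API requires, but the settings are written only once in your code.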

Python
TypeScript
Java
curl

The following examples use untyped rows, but you can define a client-side type for your table to help statically catch errors. For examples, see Typing support.

Azure OpenAI

For more detailed instructions, see Integrate Azure OpenAI as an embedding provider.

TableDefinition object
Fluent interface
Dictionary

import os
from astrapy import DataAPIClient
from astrapy.constants import VectorMetric
from astrapy.info import (
    CreateTableDefinition,
    ColumnType,
    TableScalarColumnTypeDescriptor,
    TablePrimaryKeyDescriptor,
    TableVectorColumnTypeDescriptor,
    VectorServiceOptions
)

# Instantiate the client
client = DataAPIClient()

# Connect to a database
database = client.get_database(
    os.environ["API_ENDPOINT"],
    token=os.environ["APPLICATION_TOKEN"]
)

# Define the columns and primary key for the table
table_definition = CreateTableDefinition(
    columns={
        # This column will store vector embeddings.
        # The Azure OpenAI integration
        # will automatically generate vector embeddings
        # for any text inserted to this column.
        "VECTOR_COLUMN_NAME": TableVectorColumnTypeDescriptor(
            dimension=MODEL_DIMENSIONS,
            service=VectorServiceOptions(
                provider="azureOpenAI",
                model_name="MODEL_NAME",
                authentication={
                    "providerKey": "API_KEY_NAME",
                },
                parameters={
                    "resourceName": "RESOURCE_NAME",
                    "deploymentId": "DEPLOYMENT_ID",
                },
            ),
        ),
        # If you want to store the original text
        # in addition to the generated embeddings
        # you must create a separate column.
        "TEXT_COLUMN_NAME": TableScalarColumnTypeDescriptor(
            column_type=ColumnType.TEXT
        ),
    },
    # You should change the primary key definition to meet the needs of your data.
    primary_key=TablePrimaryKeyDescriptor(
        partition_by=["TEXT_COLUMN_NAME"],
        partition_sort={}
    ),
)

# Create the table
table = database.create_table(
    "TABLE_NAME",
    definition=table_definition,
)

import os
from astrapy import DataAPIClient
from astrapy.constants import VectorMetric
from astrapy.info import CreateTableDefinition, ColumnType, VectorServiceOptions

# Instantiate the client
client = DataAPIClient()

# Connect to a database
database = client.get_database(
    os.environ["API_ENDPOINT"],
    token=os.environ["APPLICATION_TOKEN"]
)

# Define the columns and primary key for the table
table_definition = (
    CreateTableDefinition.builder()
    # This column will store vector embeddings.
    # The Azure OpenAI integration
    # will automatically generate vector embeddings
    # for any text inserted to this column.
    .add_vector_column("VECTOR_COLUMN_NAME",
        dimension=MODEL_DIMENSIONS,
        service=VectorServiceOptions(
            provider="azureOpenAI",
            model_name="MODEL_NAME",
            authentication={
                "providerKey": "API_KEY_NAME",
            },
            parameters={
                "resourceName": "RESOURCE_NAME",
                "deploymentId": "DEPLOYMENT_ID",
            },
        ),
    )
    # If you want to store the original text
    # in addition to the generated embeddings
    # you must create a separate column.
    .add_column("TEXT_COLUMN_NAME", ColumnType.TEXT)
    # You should change the primary key definition to meet the needs of your data.
    .add_partition_by(["TEXT_COLUMN_NAME"])
    # Finally, build the table definition.
    .build()
)

# Create the table
table = database.create_table(
    "TABLE_NAME",
    definition=table_definition,
)

import os
from astrapy import DataAPIClient

# Instantiate the client
client = DataAPIClient()

# Connect to a database
database = client.get_database(
    os.environ["API_ENDPOINT"],
    token=os.environ["APPLICATION_TOKEN"]
)

# Define the columns and primary key for the table
table_definition = {
    "columns": {
        # This column will store vector embeddings.
        # The Azure OpenAI integration
        # will automatically generate vector embeddings
        # for any text inserted to this column.
        "VECTOR_COLUMN_NAME": {
          "type": "vector",
          "dimension": MODEL_DIMENSIONS,
          "service": {
            "provider": "azureOpenAI",
            "modelName": "MODEL_NAME",
            "authentication": {
                "providerKey": "API_KEY_NAME",
            },
            "parameters": {
                "resourceName": "RESOURCE_NAME",
                "deploymentId": "DEPLOYMENT_ID",
            },
          }
        },
        # If you want to store the original text
        # in addition to the generated embeddings
        # you must create a separate column.
        "TEXT_COLUMN_NAME": {"type": "text"},
    },
    # You should change the primary key definition to meet the needs of your data.
    "primaryKey": {
        "partitionBy": ["TEXT_COLUMN_NAME"],
        "partitionSort": {},
    },
}

# Create the table
table = database.create_table(
    "TABLE_NAME",
    definition=table_definition,
)

Replace the following:

TABLE_NAME: The name for your table.
VECTOR_COLUMN_NAME: The name for your vector column.
TEXT_COLUMN_NAME: The name for the text column that will store the original text. Omit this column if you won’t store the original text in addition to the generated embeddings.
API_KEY_NAME: The name of the Azure OpenAI API key that you want to use. Must be the name of an existing Azure OpenAI API key in the Astra Portal. For more information, see Embedding provider authentication.

Alternatively, you can omit this parameter and instead provide the authentication key in the embedding_api_key parameter when you instantiate a Table object with the commands to create a table or get a table. The client will send the x-embedding-api-key header with the specified key to any underlying HTTP request that requires vectorize authentication. Header authentication overrides the API_KEY_NAME parameter if you set both. If you use the header instead of specifying the API_KEY_NAME parameter, you must include the header in every command that uses vectorize, including writes and vector search. You can use this authentication method only if all affected columns use the same embedding provider.
MODEL_NAME: The model that you want to use to generate embeddings. The available models are: text-embedding-3-small, text-embedding-3-large, text-embedding-ada-002.

For Azure OpenAI, you must select the model that matches the one deployed to your DEPLOYMENT_ID in Azure.
MODEL_DIMENSIONS: The number of dimensions that you want the generated vectors to have. Your chosen embedding model must support the specified number of dimensions.

If you omit dimension, Astra can use a default dimension value. However, some models don’t have default dimensions. You can use the Data API to find supported embedding providers and their configuration parameters, including dimension ranges and default dimensions.
RESOURCE_NAME: The name of your Azure OpenAI Service resource, as defined in the resource’s Instance details. For more information, see the Azure OpenAI documentation.
DEPLOYMENT_ID: Your Azure OpenAI resource’s Deployment name. For more information, see the Azure OpenAI documentation.
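To make the header-based alternative concrete, the sketch below shows the headers that a raw HTTP request would carry when the key is supplied per request instead of through providerKey. The x-embedding-api-key header name comes from the description above, and the Token and Content-Type headers match the curl examples. The helper name data_api_headers and the placeholder key value are assumptions for illustration. When you pass embedding_api_key, the client sets this header for you.

```python
def data_api_headers(token, embedding_api_key=None):
    """Return the HTTP headers for a Data API request.

    Hypothetical helper for illustration. The embedding key header is
    added only when a key is provided.
    """
    headers = {"Token": token, "Content-Type": "application/json"}
    if embedding_api_key is not None:
        headers["x-embedding-api-key"] = embedding_api_key
    return headers


# Header authentication: every command that uses vectorize needs the header.
with_key = data_api_headers("APPLICATION_TOKEN", "EMBEDDING_PROVIDER_KEY")

# Stored-key authentication: the table's providerKey setting is used instead.
without_key = data_api_headers("APPLICATION_TOKEN")

print(sorted(with_key))
print(sorted(without_key))
```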

Hugging Face - Dedicated

For more detailed instructions, see Integrate Hugging Face Dedicated as an embedding provider.

TableDefinition object
Fluent interface
Dictionary

import os
from astrapy import DataAPIClient
from astrapy.constants import VectorMetric
from astrapy.info import (
    CreateTableDefinition,
    ColumnType,
    TableScalarColumnTypeDescriptor,
    TablePrimaryKeyDescriptor,
    TableVectorColumnTypeDescriptor,
    VectorServiceOptions
)

# Instantiate the client
client = DataAPIClient()

# Connect to a database
database = client.get_database(
    os.environ["API_ENDPOINT"],
    token=os.environ["APPLICATION_TOKEN"]
)

# Define the columns and primary key for the table
table_definition = CreateTableDefinition(
    columns={
        # This column will store vector embeddings.
        # The Hugging Face Dedicated integration
        # will automatically generate vector embeddings
        # for any text inserted to this column.
        "VECTOR_COLUMN_NAME": TableVectorColumnTypeDescriptor(
            dimension=MODEL_DIMENSIONS,
            service=VectorServiceOptions(
                provider="huggingfaceDedicated",
                model_name="endpoint-defined-model",
                authentication={
                    "providerKey": "API_KEY_NAME",
                },
                parameters={
                    "endpointName": "ENDPOINT_NAME",
                    "regionName": "REGION_NAME",
                    "cloudName": "CLOUD_NAME",
                },
            ),
        ),
        # If you want to store the original text
        # in addition to the generated embeddings
        # you must create a separate column.
        "TEXT_COLUMN_NAME": TableScalarColumnTypeDescriptor(
            column_type=ColumnType.TEXT
        ),
    },
    # You should change the primary key definition to meet the needs of your data.
    primary_key=TablePrimaryKeyDescriptor(
        partition_by=["TEXT_COLUMN_NAME"],
        partition_sort={}
    ),
)

# Create the table
table = database.create_table(
    "TABLE_NAME",
    definition=table_definition,
)

import os
from astrapy import DataAPIClient
from astrapy.constants import VectorMetric
from astrapy.info import CreateTableDefinition, ColumnType, VectorServiceOptions

# Instantiate the client
client = DataAPIClient()

# Connect to a database
database = client.get_database(
    os.environ["API_ENDPOINT"],
    token=os.environ["APPLICATION_TOKEN"]
)

# Define the columns and primary key for the table
table_definition = (
    CreateTableDefinition.builder()
    # This column will store vector embeddings.
    # The Hugging Face Dedicated integration
    # will automatically generate vector embeddings
    # for any text inserted to this column.
    .add_vector_column("VECTOR_COLUMN_NAME",
        dimension=MODEL_DIMENSIONS,
        service=VectorServiceOptions(
            provider="huggingfaceDedicated",
            model_name="endpoint-defined-model",
            authentication={
                "providerKey": "API_KEY_NAME",
            },
            parameters={
                "endpointName": "ENDPOINT_NAME",
                "regionName": "REGION_NAME",
                "cloudName": "CLOUD_NAME",
            },
        ),
    )
    # If you want to store the original text
    # in addition to the generated embeddings
    # you must create a separate column.
    .add_column("TEXT_COLUMN_NAME", ColumnType.TEXT)
    # You should change the primary key definition to meet the needs of your data.
    .add_partition_by(["TEXT_COLUMN_NAME"])
    # Finally, build the table definition.
    .build()
)

# Create the table
table = database.create_table(
    "TABLE_NAME",
    definition=table_definition,
)

import os
from astrapy import DataAPIClient

# Instantiate the client
client = DataAPIClient()

# Connect to a database
database = client.get_database(
    os.environ["API_ENDPOINT"],
    token=os.environ["APPLICATION_TOKEN"]
)

# Define the columns and primary key for the table
table_definition = {
    "columns": {
        # This column will store vector embeddings.
        # The Hugging Face Dedicated integration
        # will automatically generate vector embeddings
        # for any text inserted to this column.
        "VECTOR_COLUMN_NAME": {
          "type": "vector",
          "dimension": MODEL_DIMENSIONS,
          "service": {
            "provider": "huggingfaceDedicated",
            "modelName": "endpoint-defined-model",
            "authentication": {
                "providerKey": "API_KEY_NAME",
            },
            "parameters": {
                "endpointName": "ENDPOINT_NAME",
                "regionName": "REGION_NAME",
                "cloudName": "CLOUD_NAME",
            },
          }
        },
        # If you want to store the original text
        # in addition to the generated embeddings
        # you must create a separate column.
        "TEXT_COLUMN_NAME": {"type": "text"},
    },
    # You should change the primary key definition to meet the needs of your data.
    "primaryKey": {
        "partitionBy": ["TEXT_COLUMN_NAME"],
        "partitionSort": {},
    },
}

# Create the table
table = database.create_table(
    "TABLE_NAME",
    definition=table_definition,
)

Replace the following:

TABLE_NAME: The name for your table.
VECTOR_COLUMN_NAME: The name for your vector column.
TEXT_COLUMN_NAME: The name for the text column that will store the original text. Omit this column if you won’t store the original text in addition to the generated embeddings.
API_KEY_NAME: The name of the Hugging Face Dedicated user access token that you want to use. Must be the name of an existing Hugging Face Dedicated user access token in the Astra Portal. For more information, see Embedding provider authentication.

Alternatively, you can omit this parameter and instead provide the authentication key in the embedding_api_key parameter when you instantiate a Table object with the commands to create a table or get a table. The client will send the x-embedding-api-key header with the specified key to any underlying HTTP request that requires vectorize authentication. Header authentication overrides the API_KEY_NAME parameter if you set both. If you use the header instead of specifying the API_KEY_NAME parameter, you must include the header in every command that uses vectorize, including writes and vector search. You can use this authentication method only if all affected columns use the same embedding provider.
MODEL_NAME: The model that you want to use to generate embeddings. The available models are: endpoint-defined-model.

For Hugging Face Dedicated, you must deploy the model as a text embeddings inference (TEI) container.

You must set MODEL_NAME to endpoint-defined-model because this integration uses the model specified in your dedicated endpoint configuration.
MODEL_DIMENSIONS: The number of dimensions that you want the generated vectors to have. Your chosen embedding model must support the specified number of dimensions.

If you omit dimension, Astra can use a default dimension value. However, some models don’t have default dimensions. You can use the Data API to find supported embedding providers and their configuration parameters, including dimension ranges and default dimensions.
ENDPOINT_NAME: The programmatically generated name of your Hugging Face Dedicated endpoint. This is the first part of the endpoint URL. For example, if your endpoint URL is https://mtp1x7muf6qyn3yh.us-east-2.aws.endpoints.huggingface.cloud, the endpoint name is mtp1x7muf6qyn3yh.
REGION_NAME: The cloud provider region your Hugging Face Dedicated endpoint is deployed to. For example, us-east-2.
CLOUD_NAME: The cloud provider your Hugging Face Dedicated endpoint is deployed to. For example, aws.
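The three endpoint parameters can be read directly from the endpoint URL. The following sketch splits the example URL above into endpointName, regionName, and cloudName. It assumes that the host always has the form ENDPOINT_NAME.REGION_NAME.CLOUD_NAME.endpoints.huggingface.cloud, which is inferred from the example rather than confirmed as a general rule. The helper name endpoint_parameters is hypothetical.

```python
from urllib.parse import urlparse

# Assumed fixed suffix of Hugging Face Dedicated endpoint hosts.
ENDPOINT_SUFFIX = ".endpoints.huggingface.cloud"


def endpoint_parameters(endpoint_url):
    """Split a Hugging Face Dedicated endpoint URL into vectorize parameters."""
    host = urlparse(endpoint_url).hostname or ""
    if not host.endswith(ENDPOINT_SUFFIX):
        raise ValueError(f"unexpected endpoint host: {host!r}")

    # What remains is expected to be "<endpoint>.<region>.<cloud>".
    parts = host[: -len(ENDPOINT_SUFFIX)].split(".")
    if len(parts) != 3:
        raise ValueError(f"cannot split endpoint host: {host!r}")

    endpoint_name, region_name, cloud_name = parts
    return {
        "endpointName": endpoint_name,
        "regionName": region_name,
        "cloudName": cloud_name,
    }


parameters = endpoint_parameters(
    "https://mtp1x7muf6qyn3yh.us-east-2.aws.endpoints.huggingface.cloud"
)
print(parameters)
# prints {'endpointName': 'mtp1x7muf6qyn3yh', 'regionName': 'us-east-2', 'cloudName': 'aws'}
```

You can pass the resulting dictionary as the parameters value of the service configuration.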

Hugging Face - Serverless

For more detailed instructions, see Integrate Hugging Face Serverless as an embedding provider.

TableDefinition object
Fluent interface
Dictionary

import os
from astrapy import DataAPIClient
from astrapy.constants import VectorMetric
from astrapy.info import (
    CreateTableDefinition,
    ColumnType,
    TableScalarColumnTypeDescriptor,
    TablePrimaryKeyDescriptor,
    TableVectorColumnTypeDescriptor,
    VectorServiceOptions
)

# Instantiate the client
client = DataAPIClient()

# Connect to a database
database = client.get_database(
    os.environ["API_ENDPOINT"],
    token=os.environ["APPLICATION_TOKEN"]
)

# Define the columns and primary key for the table
table_definition = CreateTableDefinition(
    columns={
        # This column will store vector embeddings.
        # The Hugging Face Serverless integration
        # will automatically generate vector embeddings
        # for any text inserted to this column.
        "VECTOR_COLUMN_NAME": TableVectorColumnTypeDescriptor(
            dimension=MODEL_DIMENSIONS,
            service=VectorServiceOptions(
                provider="huggingface",
                model_name="MODEL_NAME",
                authentication={
                    "providerKey": "API_KEY_NAME",
                },
            ),
        ),
        # If you want to store the original text
        # in addition to the generated embeddings
        # you must create a separate column.
        "TEXT_COLUMN_NAME": TableScalarColumnTypeDescriptor(
            column_type=ColumnType.TEXT
        ),
    },
    # You should change the primary key definition to meet the needs of your data.
    primary_key=TablePrimaryKeyDescriptor(
        partition_by=["TEXT_COLUMN_NAME"],
        partition_sort={}
    ),
)

# Create the table
table = database.create_table(
    "TABLE_NAME",
    definition=table_definition,
)

import os
from astrapy import DataAPIClient
from astrapy.constants import VectorMetric
from astrapy.info import CreateTableDefinition, ColumnType, VectorServiceOptions

# Instantiate the client
client = DataAPIClient()

# Connect to a database
database = client.get_database(
    os.environ["API_ENDPOINT"],
    token=os.environ["APPLICATION_TOKEN"]
)

# Define the columns and primary key for the table
table_definition = (
    CreateTableDefinition.builder()
    # This column will store vector embeddings.
    # The Hugging Face Serverless integration
    # will automatically generate vector embeddings
    # for any text inserted to this column.
    .add_vector_column("VECTOR_COLUMN_NAME",
        dimension=MODEL_DIMENSIONS,
        service=VectorServiceOptions(
            provider="huggingface",
            model_name="MODEL_NAME",
            authentication={
                "providerKey": "API_KEY_NAME",
            },
        ),
    )
    # If you want to store the original text
    # in addition to the generated embeddings
    # you must create a separate column.
    .add_column("TEXT_COLUMN_NAME", ColumnType.TEXT)
    # You should change the primary key definition to meet the needs of your data.
    .add_partition_by(["TEXT_COLUMN_NAME"])
    # Finally, build the table definition.
    .build()
)

# Create the table
table = database.create_table(
    "TABLE_NAME",
    definition=table_definition,
)

import os
from astrapy import DataAPIClient

# Instantiate the client
client = DataAPIClient()

# Connect to a database
database = client.get_database(
    os.environ["API_ENDPOINT"],
    token=os.environ["APPLICATION_TOKEN"]
)

# Define the columns and primary key for the table
table_definition = {
    "columns": {
        # This column will store vector embeddings.
        # The Hugging Face Serverless integration
        # will automatically generate vector embeddings
        # for any text inserted to this column.
        "VECTOR_COLUMN_NAME": {
          "type": "vector",
          "dimension": MODEL_DIMENSIONS,
          "service": {
            "provider": "huggingface",
            "modelName": "MODEL_NAME",
            "authentication": {
                "providerKey": "API_KEY_NAME",
            },
          }
        },
        # If you want to store the original text
        # in addition to the generated embeddings
        # you must create a separate column.
        "TEXT_COLUMN_NAME": {"type": "text"},
    },
    # You should change the primary key definition to meet the needs of your data.
    "primaryKey": {
        "partitionBy": ["TEXT_COLUMN_NAME"],
        "partitionSort": {},
    },
}

# Create the table
table = database.create_table(
    "TABLE_NAME",
    definition=table_definition,
)

Replace the following:

TABLE_NAME: The name for your table.
VECTOR_COLUMN_NAME: The name for your vector column.
TEXT_COLUMN_NAME: The name for the text column that will store the original text. Omit this column if you won’t store the original text in addition to the generated embeddings.
API_KEY_NAME: The name of the Hugging Face Serverless user access token that you want to use. Must be the name of an existing Hugging Face Serverless user access token in the Astra Portal. For more information, see Embedding provider authentication.

Alternatively, you can omit this parameter and instead provide the authentication key in the embedding_api_key parameter when you instantiate a Table object with the commands to create a table or get a table. The client will send the x-embedding-api-key header with the specified key to any underlying HTTP request that requires vectorize authentication. Header authentication overrides the API_KEY_NAME parameter if you set both. If you use the header instead of specifying the API_KEY_NAME parameter, you must include the header in every command that uses vectorize, including writes and vector search. You can use this authentication method only if all affected columns use the same embedding provider.
MODEL_NAME: The model that you want to use to generate embeddings. The available models are: sentence-transformers/all-MiniLM-L6-v2, intfloat/multilingual-e5-large, intfloat/multilingual-e5-large-instruct, BAAI/bge-small-en-v1.5, BAAI/bge-base-en-v1.5, BAAI/bge-large-en-v1.5.
MODEL_DIMENSIONS: The number of dimensions that you want the generated vectors to have. Your chosen embedding model must support the specified number of dimensions.

If you omit dimension, Astra can use a default dimension value. However, some models don’t have default dimensions. You can use the Data API to find supported embedding providers and their configuration parameters, including dimension ranges and default dimensions.

Jina AI

For more detailed instructions, see Integrate Jina AI as an embedding provider.

TableDefinition object
Fluent interface
Dictionary

import os
from astrapy import DataAPIClient
from astrapy.constants import VectorMetric
from astrapy.info import (
    CreateTableDefinition,
    ColumnType,
    TableScalarColumnTypeDescriptor,
    TablePrimaryKeyDescriptor,
    TableVectorColumnTypeDescriptor,
    VectorServiceOptions
)

# Instantiate the client
client = DataAPIClient()

# Connect to a database
database = client.get_database(
    os.environ["API_ENDPOINT"],
    token=os.environ["APPLICATION_TOKEN"]
)

# Define the columns and primary key for the table
table_definition = CreateTableDefinition(
    columns={
        # This column will store vector embeddings.
        # The Jina AI integration
        # will automatically generate vector embeddings
        # for any text inserted to this column.
        "VECTOR_COLUMN_NAME": TableVectorColumnTypeDescriptor(
            dimension=MODEL_DIMENSIONS,
            service=VectorServiceOptions(
                provider="jinaAI",
                model_name="MODEL_NAME",
                authentication={
                    "providerKey": "API_KEY_NAME",
                },
            ),
        ),
        # If you want to store the original text
        # in addition to the generated embeddings
        # you must create a separate column.
        "TEXT_COLUMN_NAME": TableScalarColumnTypeDescriptor(
            column_type=ColumnType.TEXT
        ),
    },
    # You should change the primary key definition to meet the needs of your data.
    primary_key=TablePrimaryKeyDescriptor(
        partition_by=["TEXT_COLUMN_NAME"],
        partition_sort={}
    ),
)

# Create the table
table = database.create_table(
    "TABLE_NAME",
    definition=table_definition,
)

import os
from astrapy import DataAPIClient
from astrapy.constants import VectorMetric
from astrapy.info import CreateTableDefinition, ColumnType, VectorServiceOptions

# Instantiate the client
client = DataAPIClient()

# Connect to a database
database = client.get_database(
    os.environ["API_ENDPOINT"],
    token=os.environ["APPLICATION_TOKEN"]
)

# Define the columns and primary key for the table
table_definition = (
    CreateTableDefinition.builder()
    # This column will store vector embeddings.
    # The Jina AI integration
    # will automatically generate vector embeddings
    # for any text inserted to this column.
    .add_vector_column("VECTOR_COLUMN_NAME",
        dimension=MODEL_DIMENSIONS,
        service=VectorServiceOptions(
            provider="jinaAI",
            model_name="MODEL_NAME",
            authentication={
                "providerKey": "API_KEY_NAME",
            },
        ),
    )
    # If you want to store the original text
    # in addition to the generated embeddings
    # you must create a separate column.
    .add_column("TEXT_COLUMN_NAME", ColumnType.TEXT)
    # You should change the primary key definition to meet the needs of your data.
    .add_partition_by(["TEXT_COLUMN_NAME"])
    # Finally, build the table definition.
    .build()
)

# Create the table
table = database.create_table(
    "TABLE_NAME",
    definition=table_definition,
)

import os
from astrapy import DataAPIClient

# Instantiate the client
client = DataAPIClient()

# Connect to a database
database = client.get_database(
    os.environ["API_ENDPOINT"],
    token=os.environ["APPLICATION_TOKEN"]
)

# Define the columns and primary key for the table
table_definition = {
    "columns": {
        # This column will store vector embeddings.
        # The Jina AI integration
        # will automatically generate vector embeddings
        # for any text inserted to this column.
        "VECTOR_COLUMN_NAME": {
          "type": "vector",
          "dimension": MODEL_DIMENSIONS,
          "service": {
            "provider": "jinaAI",
            "modelName": "MODEL_NAME",
            "authentication": {
                "providerKey": "API_KEY_NAME",
            },
          }
        },
        # If you want to store the original text
        # in addition to the generated embeddings
        # you must create a separate column.
        "TEXT_COLUMN_NAME": {"type": "text"},
    },
    # You should change the primary key definition to meet the needs of your data.
    "primaryKey": {
        "partitionBy": ["TEXT_COLUMN_NAME"],
        "partitionSort": {},
    },
}

# Create the table
table = database.create_table(
    "TABLE_NAME",
    definition=table_definition,
)

Replace the following:

TABLE_NAME: The name for your table.
VECTOR_COLUMN_NAME: The name for your vector column.
TEXT_COLUMN_NAME: The name for the text column that will store the original text. Omit this column if you won’t store the original text in addition to the generated embeddings.
API_KEY_NAME: The name of the Jina AI API key that you want to use. Must be the name of an existing Jina AI API key in the Astra Portal. For more information, see Embedding provider authentication.

Alternatively, you can omit this parameter and instead provide the authentication key in the embedding_api_key parameter when you instantiate a Table object with the commands to create a table or get a table. The client will send the x-embedding-api-key header with the specified key to any underlying HTTP request that requires vectorize authentication. Header authentication overrides the API_KEY_NAME parameter if you set both. If you use the header instead of specifying the API_KEY_NAME parameter, you must include the header in every command that uses vectorize, including writes and vector search. You can use this authentication method only if all affected columns use the same embedding provider.
MODEL_NAME: The model that you want to use to generate embeddings. The available models are: jina-embeddings-v2-base-en, jina-embeddings-v2-base-de, jina-embeddings-v2-base-es, jina-embeddings-v2-base-code, jina-embeddings-v2-base-zh.
MODEL_DIMENSIONS: The number of dimensions that you want the generated vectors to have. Your chosen embedding model must support the specified number of dimensions.

If you omit dimension, Astra can use a default dimension value. However, some models don’t have default dimensions. You can use the Data API to find supported embedding providers and their configuration parameters, including dimension ranges and default dimensions.
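
The header-based alternative to API_KEY_NAME described in the list above can be sketched as follows. This is a minimal sketch, assuming a database object obtained as in the preceding examples and a dictionary table definition. The helper names without_provider_key and create_table_with_header_auth are hypothetical. Only the embedding_api_key parameter of create_table comes from the method signature.

```python
import copy


def without_provider_key(definition):
    """Return a copy of a dictionary table definition with the
    `authentication` block removed from every column's `service`."""
    cleaned = copy.deepcopy(definition)
    for column in cleaned.get("columns", {}).values():
        # Scalar columns have no `service` entry, so they are left untouched.
        if isinstance(column, dict) and isinstance(column.get("service"), dict):
            column["service"].pop("authentication", None)
    return cleaned


def create_table_with_header_auth(database, table_name, definition, api_key):
    """Create the table and pass the provider key as `embedding_api_key`.

    `database` is expected to be an astrapy Database. The returned Table
    keeps the key and sends it as a header on later requests.
    """
    return database.create_table(
        table_name,
        definition=without_provider_key(definition),
        embedding_api_key=api_key,
    )
```

In practice you would read the key from a secret store or an environment variable rather than hard-coding it, for example os.environ["JINA_API_KEY"], where that variable name is also an assumption.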

Mistral AI

For more detailed instructions, see Integrate Mistral AI as an embedding provider.

TableDefinition object
Fluent interface
Dictionary

import os
from astrapy import DataAPIClient
from astrapy.info import (
    CreateTableDefinition,
    ColumnType,
    TableScalarColumnTypeDescriptor,
    TablePrimaryKeyDescriptor,
    TableVectorColumnTypeDescriptor,
    VectorServiceOptions
)

# Instantiate the client
client = DataAPIClient()

# Connect to a database
database = client.get_database(
    os.environ["API_ENDPOINT"],
    token=os.environ["APPLICATION_TOKEN"]
)

# Define the columns and primary key for the table
table_definition = CreateTableDefinition(
    columns={
        # This column will store vector embeddings.
        # The Mistral AI integration
        # will automatically generate vector embeddings
        # for any text inserted to this column.
        "VECTOR_COLUMN_NAME": TableVectorColumnTypeDescriptor(
            dimension=MODEL_DIMENSIONS,
            service=VectorServiceOptions(
                provider="mistral",
                model_name="MODEL_NAME",
                authentication={
                    "providerKey": "API_KEY_NAME",
                },
            ),
        ),
        # If you want to store the original text
        # in addition to the generated embeddings
        # you must create a separate column.
        "TEXT_COLUMN_NAME": TableScalarColumnTypeDescriptor(
            column_type=ColumnType.TEXT
        ),
    },
    # You should change the primary key definition to meet the needs of your data.
    primary_key=TablePrimaryKeyDescriptor(
        partition_by=["TEXT_COLUMN_NAME"],
        partition_sort={}
    ),
)

# Create the table
table = database.create_table(
    "TABLE_NAME",
    definition=table_definition,
)

import os
from astrapy import DataAPIClient
from astrapy.info import CreateTableDefinition, ColumnType, VectorServiceOptions

# Instantiate the client
client = DataAPIClient()

# Connect to a database
database = client.get_database(
    os.environ["API_ENDPOINT"],
    token=os.environ["APPLICATION_TOKEN"]
)

# Define the columns and primary key for the table
table_definition = (
    CreateTableDefinition.builder()
    # This column will store vector embeddings.
    # The Mistral AI integration
    # will automatically generate vector embeddings
    # for any text inserted to this column.
    .add_vector_column("VECTOR_COLUMN_NAME",
        dimension=MODEL_DIMENSIONS,
        service=VectorServiceOptions(
            provider="mistral",
            model_name="MODEL_NAME",
            authentication={
                "providerKey": "API_KEY_NAME",
            },
        ),
    )
    # If you want to store the original text
    # in addition to the generated embeddings
    # you must create a separate column.
    .add_column("TEXT_COLUMN_NAME", ColumnType.TEXT)
    # You should change the primary key definition to meet the needs of your data.
    .add_partition_by(["TEXT_COLUMN_NAME"])
    # Finally, build the table definition.
    .build()
)

# Create the table
table = database.create_table(
    "TABLE_NAME",
    definition=table_definition,
)

import os
from astrapy import DataAPIClient

# Instantiate the client
client = DataAPIClient()

# Connect to a database
database = client.get_database(
    os.environ["API_ENDPOINT"],
    token=os.environ["APPLICATION_TOKEN"]
)

# Define the columns and primary key for the table
table_definition = {
    "columns": {
        # This column will store vector embeddings.
        # The Mistral AI integration
        # will automatically generate vector embeddings
        # for any text inserted to this column.
        "VECTOR_COLUMN_NAME": {
          "type": "vector",
          "dimension": MODEL_DIMENSIONS,
          "service": {
            "provider": "mistral",
            "modelName": "MODEL_NAME",
            "authentication": {
                "providerKey": "API_KEY_NAME",
            },
          }
        },
        # If you want to store the original text
        # in addition to the generated embeddings
        # you must create a separate column.
        "TEXT_COLUMN_NAME": {"type": "text"},
    },
    # You should change the primary key definition to meet the needs of your data.
    "primaryKey": {
        "partitionBy": ["TEXT_COLUMN_NAME"],
        "partitionSort": {},
    },
}

# Create the table
table = database.create_table(
    "TABLE_NAME",
    definition=table_definition,
)

Replace the following:

TABLE_NAME: The name for your table.
VECTOR_COLUMN_NAME: The name for your vector column.
TEXT_COLUMN_NAME: The name for the text column that will store the original text. Omit this column if you won’t store the original text in addition to the generated embeddings.
API_KEY_NAME: The name of the Mistral AI API key that you want to use. Must be the name of an existing Mistral AI API key in the Astra Portal. For more information, see Embedding provider authentication.

Alternatively, you can omit this parameter and instead provide the authentication key in the embedding_api_key parameter when you instantiate a Table object with the commands to create a table or get a table. The client will send the x-embedding-api-key header with the specified key to any underlying HTTP request that requires vectorize authentication. Header authentication overrides the API_KEY_NAME parameter if you set both. If you use the header instead of specifying the API_KEY_NAME parameter, you must include the header in every command that uses vectorize, including writes and vector search. You can use this authentication method only if all affected columns use the same embedding provider.
MODEL_NAME: The model that you want to use to generate embeddings. The available models are: mistral-embed.
MODEL_DIMENSIONS: The number of dimensions that you want the generated vectors to have. Your chosen embedding model must support the specified number of dimensions.

If you omit dimension, Astra can use a default dimension value. However, some models don’t have default dimensions. You can use the Data API to find supported embedding providers and their configuration parameters, including dimension ranges and default dimensions.
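
To make the optional dimension concrete, here is a minimal sketch of a helper that builds the dictionary form of a vector column and includes the dimension key only when a value is supplied. The helper name vector_column is hypothetical. The keys it writes are the ones used in the dictionary example above, and the service mapping is passed in unchanged.

```python
def vector_column(service, dimension=None):
    """Build the dictionary form of a vectorize column.

    `dimension` is written only when given, so the server can fall back
    to the model's default dimension, if the model has one.
    """
    column = {"type": "vector", "service": service}
    if dimension is not None:
        column["dimension"] = dimension
    return column
```

If the chosen model has no default dimension, omitting the value is expected to fail at table creation, so pass it explicitly in that case.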

NVIDIA

For more detailed instructions, see Integrate NVIDIA as an embedding provider. Your database must be in a supported region.

TableDefinition object
Fluent interface
Dictionary

import os
from astrapy import DataAPIClient
from astrapy.info import (
    CreateTableDefinition,
    ColumnType,
    TableScalarColumnTypeDescriptor,
    TablePrimaryKeyDescriptor,
    TableVectorColumnTypeDescriptor,
    VectorServiceOptions
)

# Instantiate the client
client = DataAPIClient()

# Connect to a database
database = client.get_database(
    os.environ["API_ENDPOINT"],
    token=os.environ["APPLICATION_TOKEN"]
)

# Define the columns and primary key for the table
table_definition = CreateTableDefinition(
    columns={
        # This column will store vector embeddings.
        # The NVIDIA integration
        # will automatically generate vector embeddings
        # for any text inserted to this column.
        "VECTOR_COLUMN_NAME": TableVectorColumnTypeDescriptor(
            service=VectorServiceOptions(
                provider="nvidia",
                model_name="nvidia/nv-embedqa-e5-v5",
            ),
        ),
        # If you want to store the original text
        # in addition to the generated embeddings
        # you must create a separate column.
        "TEXT_COLUMN_NAME": TableScalarColumnTypeDescriptor(
            column_type=ColumnType.TEXT
        ),
    },
    # You should change the primary key definition to meet the needs of your data.
    primary_key=TablePrimaryKeyDescriptor(
        partition_by=["TEXT_COLUMN_NAME"],
        partition_sort={}
    ),
)

# Create the table
table = database.create_table(
    "TABLE_NAME",
    definition=table_definition,
)

import os
from astrapy import DataAPIClient
from astrapy.info import CreateTableDefinition, ColumnType, VectorServiceOptions

# Instantiate the client
client = DataAPIClient()

# Connect to a database
database = client.get_database(
    os.environ["API_ENDPOINT"],
    token=os.environ["APPLICATION_TOKEN"]
)

# Define the columns and primary key for the table
table_definition = (
    CreateTableDefinition.builder()
    # This column will store vector embeddings.
    # The NVIDIA integration
    # will automatically generate vector embeddings
    # for any text inserted to this column.
    .add_vector_column("VECTOR_COLUMN_NAME",
        service=VectorServiceOptions(
            provider="nvidia",
            model_name="nvidia/nv-embedqa-e5-v5",
        ),
    )
    # If you want to store the original text
    # in addition to the generated embeddings
    # you must create a separate column.
    .add_column("TEXT_COLUMN_NAME", ColumnType.TEXT)
    # You should change the primary key definition to meet the needs of your data.
    .add_partition_by(["TEXT_COLUMN_NAME"])
    # Finally, build the table definition.
    .build()
)

# Create the table
table = database.create_table(
    "TABLE_NAME",
    definition=table_definition,
)

import os
from astrapy import DataAPIClient

# Instantiate the client
client = DataAPIClient()

# Connect to a database
database = client.get_database(
    os.environ["API_ENDPOINT"],
    token=os.environ["APPLICATION_TOKEN"]
)

# Define the columns and primary key for the table
table_definition = {
    "columns": {
        # This column will store vector embeddings.
        # The NVIDIA integration
        # will automatically generate vector embeddings
        # for any text inserted to this column.
        "VECTOR_COLUMN_NAME": {
          "type": "vector",
          "service": {
            "provider": "nvidia",
            "modelName": "nvidia/nv-embedqa-e5-v5",
          }
        },
        # If you want to store the original text
        # in addition to the generated embeddings
        # you must create a separate column.
        "TEXT_COLUMN_NAME": {"type": "text"},
    },
    # You should change the primary key definition to meet the needs of your data.
    "primaryKey": {
        "partitionBy": ["TEXT_COLUMN_NAME"],
        "partitionSort": {},
    },
}

# Create the table
table = database.create_table(
    "TABLE_NAME",
    definition=table_definition,
)
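
As the comments in these examples note, text written to the vector column is embedded automatically. The following minimal sketch shows what that could look like once the table exists. It assumes the table object returned by create_table and the astrapy Table methods insert_one and find. The helper names insert_text and search_by_text are hypothetical. Per the indexing guidance at the top of this page, index the vector column before sorting on it.

```python
def insert_text(table, text, vector_column="VECTOR_COLUMN_NAME",
                text_column="TEXT_COLUMN_NAME"):
    """Insert one row. The same string goes to the text column, where it
    is stored as-is, and to the vector column, where the provider is
    expected to turn it into an embedding."""
    return table.insert_one({text_column: text, vector_column: text})


def search_by_text(table, query, limit=3, vector_column="VECTOR_COLUMN_NAME"):
    """Run a vector search by sorting on the vector column with a string.
    The query text is embedded server-side before the comparison."""
    return list(table.find(sort={vector_column: query}, limit=limit))
```

Both helpers only forward to the table object, so they work the same way for any of the embedding providers on this page.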

OpenAI

For more detailed instructions, see Integrate OpenAI as an embedding provider.

TableDefinition object
Fluent interface
Dictionary

import os
from astrapy import DataAPIClient
from astrapy.info import (
    CreateTableDefinition,
    ColumnType,
    TableScalarColumnTypeDescriptor,
    TablePrimaryKeyDescriptor,
    TableVectorColumnTypeDescriptor,
    VectorServiceOptions
)

# Instantiate the client
client = DataAPIClient()

# Connect to a database
database = client.get_database(
    os.environ["API_ENDPOINT"],
    token=os.environ["APPLICATION_TOKEN"]
)

# Define the columns and primary key for the table
table_definition = CreateTableDefinition(
    columns={
        # This column will store vector embeddings.
        # The OpenAI integration
        # will automatically generate vector embeddings
        # for any text inserted to this column.
        "VECTOR_COLUMN_NAME": TableVectorColumnTypeDescriptor(
            dimension=MODEL_DIMENSIONS,
            service=VectorServiceOptions(
                provider="openai",
                model_name="MODEL_NAME",
                authentication={
                    "providerKey": "API_KEY_NAME",
                },
                parameters={
                    "organizationId": "ORGANIZATION_ID",
                    "projectId": "PROJECT_ID",
                },
            ),
        ),
        # If you want to store the original text
        # in addition to the generated embeddings
        # you must create a separate column.
        "TEXT_COLUMN_NAME": TableScalarColumnTypeDescriptor(
            column_type=ColumnType.TEXT
        ),
    },
    # You should change the primary key definition to meet the needs of your data.
    primary_key=TablePrimaryKeyDescriptor(
        partition_by=["TEXT_COLUMN_NAME"],
        partition_sort={}
    ),
)

# Create the table
table = database.create_table(
    "TABLE_NAME",
    definition=table_definition,
)

import os
from astrapy import DataAPIClient
from astrapy.info import CreateTableDefinition, ColumnType, VectorServiceOptions

# Instantiate the client
client = DataAPIClient()

# Connect to a database
database = client.get_database(
    os.environ["API_ENDPOINT"],
    token=os.environ["APPLICATION_TOKEN"]
)

# Define the columns and primary key for the table
table_definition = (
    CreateTableDefinition.builder()
    # This column will store vector embeddings.
    # The OpenAI integration
    # will automatically generate vector embeddings
    # for any text inserted to this column.
    .add_vector_column("VECTOR_COLUMN_NAME",
        dimension=MODEL_DIMENSIONS,
        service=VectorServiceOptions(
            provider="openai",
            model_name="MODEL_NAME",
            authentication={
                "providerKey": "API_KEY_NAME",
            },
            parameters={
                "organizationId": "ORGANIZATION_ID",
                "projectId": "PROJECT_ID",
            },
        ),
    )
    # If you want to store the original text
    # in addition to the generated embeddings
    # you must create a separate column.
    .add_column("TEXT_COLUMN_NAME", ColumnType.TEXT)
    # You should change the primary key definition to meet the needs of your data.
    .add_partition_by(["TEXT_COLUMN_NAME"])
    # Finally, build the table definition.
    .build()
)

# Create the table
table = database.create_table(
    "TABLE_NAME",
    definition=table_definition,
)

import os
from astrapy import DataAPIClient

# Instantiate the client
client = DataAPIClient()

# Connect to a database
database = client.get_database(
    os.environ["API_ENDPOINT"],
    token=os.environ["APPLICATION_TOKEN"]
)

# Define the columns and primary key for the table
table_definition = {
    "columns": {
        # This column will store vector embeddings.
        # The OpenAI integration
        # will automatically generate vector embeddings
        # for any text inserted to this column.
        "VECTOR_COLUMN_NAME": {
          "type": "vector",
          "dimension": MODEL_DIMENSIONS,
          "service": {
            "provider": "openai",
            "modelName": "MODEL_NAME",
            "authentication": {
                "providerKey": "API_KEY_NAME",
            },
            "parameters": {
                "organizationId": "ORGANIZATION_ID",
                "projectId": "PROJECT_ID",
            },
          }
        },
        # If you want to store the original text
        # in addition to the generated embeddings
        # you must create a separate column.
        "TEXT_COLUMN_NAME": {"type": "text"},
    },
    # You should change the primary key definition to meet the needs of your data.
    "primaryKey": {
        "partitionBy": ["TEXT_COLUMN_NAME"],
        "partitionSort": {},
    },
}

# Create the table
table = database.create_table(
    "TABLE_NAME",
    definition=table_definition,
)

Replace the following:

TABLE_NAME: The name for your table.
VECTOR_COLUMN_NAME: The name for your vector column.
TEXT_COLUMN_NAME: The name for the text column that will store the original text. Omit this column if you won’t store the original text in addition to the generated embeddings.
API_KEY_NAME: The name of the OpenAI API key that you want to use. Must be the name of an existing OpenAI API key in the Astra Portal. For more information, see Embedding provider authentication.

Alternatively, you can omit this parameter and instead provide the authentication key in the embedding_api_key parameter when you instantiate a Table object with the commands to create a table or get a table. The client will send the x-embedding-api-key header with the specified key to any underlying HTTP request that requires vectorize authentication. Header authentication overrides the API_KEY_NAME parameter if you set both. If you use the header instead of specifying the API_KEY_NAME parameter, you must include the header in every command that uses vectorize, including writes and vector search. You can use this authentication method only if all affected columns use the same embedding provider.
MODEL_NAME: The model that you want to use to generate embeddings. The available models are: text-embedding-3-small, text-embedding-3-large, text-embedding-ada-002.
MODEL_DIMENSIONS: The number of dimensions that you want the generated vectors to have. Your chosen embedding model must support the specified number of dimensions.

If you omit dimension, Astra can use a default dimension value. However, some models don’t have default dimensions. You can use the Data API to find supported embedding providers and their configuration parameters, including dimension ranges and default dimensions.
ORGANIZATION_ID: Optional. The ID of the OpenAI organization that owns the API key. Only required if your OpenAI account belongs to multiple organizations or if you are using a legacy user API key to access projects. For more information about organization IDs, see the OpenAI API reference.
PROJECT_ID: Optional. The ID of the OpenAI project that owns the API key. This cannot be the default project. Only required if your OpenAI account belongs to multiple organizations or if you are using a legacy user API key to access projects. For more information about project IDs, see the OpenAI API reference.
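
Because ORGANIZATION_ID and PROJECT_ID are optional, you may want to leave the parameters mapping out entirely when neither applies. A minimal sketch, assuming the organizationId and projectId keys shown in the examples above. The helper name openai_parameters is hypothetical.

```python
def openai_parameters(organization_id=None, project_id=None):
    """Return the optional `parameters` mapping for the OpenAI service.

    Returns None when neither ID is supplied, so the caller can pass the
    result straight to VectorServiceOptions(parameters=...) or skip the
    "parameters" key in the dictionary form.
    """
    parameters = {}
    if organization_id:
        parameters["organizationId"] = organization_id
    if project_id:
        parameters["projectId"] = project_id
    return parameters or None
```

With the object form, the result can be supplied as the parameters argument of VectorServiceOptions, which accepts None for that argument.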

Upstage

For more detailed instructions, see Integrate Upstage as an embedding provider.

TableDefinition object
Fluent interface
Dictionary

import os
from astrapy import DataAPIClient
from astrapy.info import (
    CreateTableDefinition,
    ColumnType,
    TableScalarColumnTypeDescriptor,
    TablePrimaryKeyDescriptor,
    TableVectorColumnTypeDescriptor,
    VectorServiceOptions
)

# Instantiate the client
client = DataAPIClient()

# Connect to a database
database = client.get_database(
    os.environ["API_ENDPOINT"],
    token=os.environ["APPLICATION_TOKEN"]
)

# Define the columns and primary key for the table
table_definition = CreateTableDefinition(
    columns={
        # This column will store vector embeddings.
        # The Upstage integration
        # will automatically generate vector embeddings
        # for any text inserted to this column.
        "VECTOR_COLUMN_NAME": TableVectorColumnTypeDescriptor(
            dimension=MODEL_DIMENSIONS,
            service=VectorServiceOptions(
                provider="upstageAI",
                model_name="MODEL_NAME",
                authentication={
                    "providerKey": "API_KEY_NAME",
                },
            ),
        ),
        # If you want to store the original text
        # in addition to the generated embeddings
        # you must create a separate column.
        "TEXT_COLUMN_NAME": TableScalarColumnTypeDescriptor(
            column_type=ColumnType.TEXT
        ),
    },
    # You should change the primary key definition to meet the needs of your data.
    primary_key=TablePrimaryKeyDescriptor(
        partition_by=["TEXT_COLUMN_NAME"],
        partition_sort={}
    ),
)

# Create the table
table = database.create_table(
    "TABLE_NAME",
    definition=table_definition,
)

import os
from astrapy import DataAPIClient
from astrapy.info import CreateTableDefinition, ColumnType, VectorServiceOptions

# Instantiate the client
client = DataAPIClient()

# Connect to a database
database = client.get_database(
    os.environ["API_ENDPOINT"],
    token=os.environ["APPLICATION_TOKEN"]
)

# Define the columns and primary key for the table
table_definition = (
    CreateTableDefinition.builder()
    # This column will store vector embeddings.
    # The Upstage integration
    # will automatically generate vector embeddings
    # for any text inserted to this column.
    .add_vector_column("VECTOR_COLUMN_NAME",
        dimension=MODEL_DIMENSIONS,
        service=VectorServiceOptions(
            provider="upstageAI",
            model_name="MODEL_NAME",
            authentication={
                "providerKey": "API_KEY_NAME",
            },
        ),
    )
    # If you want to store the original text
    # in addition to the generated embeddings
    # you must create a separate column.
    .add_column("TEXT_COLUMN_NAME", ColumnType.TEXT)
    # You should change the primary key definition to meet the needs of your data.
    .add_partition_by(["TEXT_COLUMN_NAME"])
    # Finally, build the table definition.
    .build()
)

# Create the table
table = database.create_table(
    "TABLE_NAME",
    definition=table_definition,
)

import os
from astrapy import DataAPIClient

# Instantiate the client
client = DataAPIClient()

# Connect to a database
database = client.get_database(
    os.environ["API_ENDPOINT"],
    token=os.environ["APPLICATION_TOKEN"]
)

# Define the columns and primary key for the table
table_definition = {
    "columns": {
        # This column will store vector embeddings.
        # The Upstage integration
        # will automatically generate vector embeddings
        # for any text inserted to this column.
        "VECTOR_COLUMN_NAME": {
          "type": "vector",
          "dimension": MODEL_DIMENSIONS,
          "service": {
            "provider": "upstageAI",
            "modelName": "MODEL_NAME",
            "authentication": {
                "providerKey": "API_KEY_NAME",
            },
          }
        },
        # If you want to store the original text
        # in addition to the generated embeddings
        # you must create a separate column.
        "TEXT_COLUMN_NAME": {"type": "text"},
    },
    # You should change the primary key definition to meet the needs of your data.
    "primaryKey": {
        "partitionBy": ["TEXT_COLUMN_NAME"],
        "partitionSort": {},
    },
}

# Create the table
table = database.create_table(
    "TABLE_NAME",
    definition=table_definition,
)

Replace the following:

TABLE_NAME: The name for your table.
VECTOR_COLUMN_NAME: The name for your vector column.
TEXT_COLUMN_NAME: The name for the text column that will store the original text. Omit this column if you won’t store the original text in addition to the generated embeddings.
API_KEY_NAME: The name of the Upstage API key that you want to use. Must be the name of an existing Upstage API key in the Astra Portal. For more information, see Embedding provider authentication.

Alternatively, you can omit this parameter and instead provide the authentication key in the embedding_api_key parameter when you instantiate a Table object with the commands to create a table or get a table. The client will send the x-embedding-api-key header with the specified key to any underlying HTTP request that requires vectorize authentication. Header authentication overrides the API_KEY_NAME parameter if you set both. If you use the header instead of specifying the API_KEY_NAME parameter, you must include the header in every command that uses vectorize, including writes and vector search. You can use this authentication method only if all affected columns use the same embedding provider.
MODEL_NAME: The model that you want to use to generate embeddings. The available models are: solar-embedding-1-large.
MODEL_DIMENSIONS: The number of dimensions that you want the generated vectors to have. Your chosen embedding model must support the specified number of dimensions.

If you omit dimension, Astra can use a default dimension value. However, some models don’t have default dimensions. You can use the Data API to find supported embedding providers and their configuration parameters, including dimension ranges and default dimensions.
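
The lookup of supported providers and dimensions mentioned above can be sketched as follows. This assumes the astrapy call database.get_database_admin().find_embedding_providers() and that its result exposes embedding_providers, models, name, and vector_dimension attributes. Treat those attribute names as assumptions to verify against your client version. The helper name summarize_dimensions is hypothetical.

```python
def summarize_dimensions(providers_result):
    """Map each provider name to {model name: fixed dimension or None}.

    `providers_result` is assumed to be the object returned by
    database.get_database_admin().find_embedding_providers().
    A value of None means the model reports no fixed dimension, so the
    dimension is likely configurable or must be given explicitly.
    """
    summary = {}
    for provider_name, provider in providers_result.embedding_providers.items():
        summary[provider_name] = {
            model.name: model.vector_dimension for model in provider.models
        }
    return summary
```

A typical use would be to print the summary for the provider you plan to use and pick MODEL_DIMENSIONS from it before creating the table.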

Voyage AI

For more detailed instructions, see Integrate Voyage AI as an embedding provider.

TableDefinition object
Fluent interface
Dictionary

import os
from astrapy import DataAPIClient
from astrapy.info import (
    CreateTableDefinition,
    ColumnType,
    TableScalarColumnTypeDescriptor,
    TablePrimaryKeyDescriptor,
    TableVectorColumnTypeDescriptor,
    VectorServiceOptions
)

# Instantiate the client
client = DataAPIClient()

# Connect to a database
database = client.get_database(
    os.environ["API_ENDPOINT"],
    token=os.environ["APPLICATION_TOKEN"]
)

# Define the columns and primary key for the table
table_definition = CreateTableDefinition(
    columns={
        # This column will store vector embeddings.
        # The Voyage AI integration
        # will automatically generate vector embeddings
        # for any text inserted to this column.
        "VECTOR_COLUMN_NAME": TableVectorColumnTypeDescriptor(
            dimension=MODEL_DIMENSIONS,
            service=VectorServiceOptions(
                provider="voyageAI",
                model_name="MODEL_NAME",
                authentication={
                    "providerKey": "API_KEY_NAME",
                },
            ),
        ),
        # If you want to store the original text
        # in addition to the generated embeddings
        # you must create a separate column.
        "TEXT_COLUMN_NAME": TableScalarColumnTypeDescriptor(
            column_type=ColumnType.TEXT
        ),
    },
    # You should change the primary key definition to meet the needs of your data.
    primary_key=TablePrimaryKeyDescriptor(
        partition_by=["TEXT_COLUMN_NAME"],
        partition_sort={}
    ),
)

# Create the table
table = database.create_table(
    "TABLE_NAME",
    definition=table_definition,
)

import os
from astrapy import DataAPIClient
from astrapy.info import CreateTableDefinition, ColumnType, VectorServiceOptions

# Instantiate the client
client = DataAPIClient()

# Connect to a database
database = client.get_database(
    os.environ["API_ENDPOINT"],
    token=os.environ["APPLICATION_TOKEN"]
)

# Define the columns and primary key for the table
table_definition = (
    CreateTableDefinition.builder()
    # This column will store vector embeddings.
    # The Voyage AI integration
    # will automatically generate vector embeddings
    # for any text inserted to this column.
    .add_vector_column("VECTOR_COLUMN_NAME",
        dimension=MODEL_DIMENSIONS,
        service=VectorServiceOptions(
            provider="voyageAI",
            model_name="MODEL_NAME",
            authentication={
                "providerKey": "API_KEY_NAME",
            },
        ),
    )
    # If you want to store the original text
    # in addition to the generated embeddings
    # you must create a separate column.
    .add_column("TEXT_COLUMN_NAME", ColumnType.TEXT)
    # You should change the primary key definition to meet the needs of your data.
    .add_partition_by(["TEXT_COLUMN_NAME"])
    # Finally, build the table definition.
    .build()
)

# Create the table
table = database.create_table(
    "TABLE_NAME",
    definition=table_definition,
)

import os
from astrapy import DataAPIClient

# Instantiate the client
client = DataAPIClient()

# Connect to a database
database = client.get_database(
    os.environ["API_ENDPOINT"],
    token=os.environ["APPLICATION_TOKEN"]
)

# Define the columns and primary key for the table
table_definition = {
    "columns": {
        # This column will store vector embeddings.
        # The Voyage AI integration
        # will automatically generate vector embeddings
        # for any text inserted to this column.
        "VECTOR_COLUMN_NAME": {
          "type": "vector",
          "dimension": MODEL_DIMENSIONS,
          "service": {
            "provider": "voyageAI",
            "modelName": "MODEL_NAME",
            "authentication": {
                "providerKey": "API_KEY_NAME",
            },
          }
        },
        # If you want to store the original text
        # in addition to the generated embeddings
        # you must create a separate column.
        "TEXT_COLUMN_NAME": {"type": "text"},
    },
    # You should change the primary key definition to meet the needs of your data.
    "primaryKey": {
        "partitionBy": ["TEXT_COLUMN_NAME"],
        "partitionSort": {},
    },
}

# Create the table
table = database.create_table(
    "TABLE_NAME",
    definition=table_definition,
)

Replace the following:

TABLE_NAME: The name for your table.
VECTOR_COLUMN_NAME: The name for your vector column.
TEXT_COLUMN_NAME: The name for the text column that will store the original text. Omit this column if you won’t store the original text in addition to the generated embeddings.
API_KEY_NAME: The name of the Voyage AI API key that you want to use. Must be the name of an existing Voyage AI API key in the Astra Portal. For more information, see Embedding provider authentication.

Alternatively, you can omit this parameter and instead provide the authentication key in the embedding_api_key parameter when you instantiate a Table object with the commands to create a table or get a table. The client will send the x-embedding-api-key header with the specified key to any underlying HTTP request that requires vectorize authentication. Header authentication overrides the API_KEY_NAME parameter if you set both. If you use the header instead of specifying the API_KEY_NAME parameter, you must include the header in every command that uses vectorize, including writes and vector search. You can use this authentication method only if all affected columns use the same embedding provider.
MODEL_NAME: The model that you want to use to generate embeddings. The available models are: voyage-2, voyage-code-2, voyage-finance-2, voyage-large-2, voyage-large-2-instruct, voyage-law-2, voyage-multilingual-2.
MODEL_DIMENSIONS: The number of dimensions that you want the generated vectors to have. Your chosen embedding model must support the specified number of dimensions.

If you omit dimension, Astra can use a default dimension value. However, some models don’t have default dimensions. You can use the Data API to find supported embedding providers and their configuration parameters, including dimension ranges and default dimensions.

Azure OpenAI

For more detailed instructions, see Integrate Azure OpenAI as an embedding provider.

Automatic type inference
Manually typed tables
Untyped tables

import {
  DataAPIClient,
  InferTablePrimaryKey,
  InferTableSchema,
  Table,
} from "@datastax/astra-db-ts";

// Instantiate the client
const client = new DataAPIClient();

// Connect to a database
const database = client.db(process.env.API_ENDPOINT, {
  token: process.env.APPLICATION_TOKEN,
});

// Define the columns and primary key for the table
const tableDefinition = Table.schema({
  columns: {
    // This column will store vector embeddings.
    // The Azure OpenAI integration
    // will automatically generate vector embeddings
    // for any text inserted to this column.
    VECTOR_COLUMN_NAME: {
      type: "vector",
      dimension: MODEL_DIMENSIONS,
      service: {
        provider: 'azureOpenAI',
        modelName: 'MODEL_NAME',
        authentication: {
          providerKey: 'API_KEY_NAME',
        },
        parameters: {
          resourceName: 'RESOURCE_NAME',
          deploymentId: 'DEPLOYMENT_ID',
        },
      },
    },
    // If you want to store the original text
    // in addition to the generated embeddings
    // you must create a separate column.
    TEXT_COLUMN_NAME: "text",
  },
  // You should change the primary key definition to meet the needs of your data.
  primaryKey: {
    partitionBy: ["TEXT_COLUMN_NAME"],
  },
});

// Infer the TypeScript-equivalent type of the table's schema and primary key
type TableSchema = InferTableSchema<typeof tableDefinition>;
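type TablePrimaryKey = InferTablePrimaryKey<typeof tableDefinition>;

(async function () {
  // Create the table
  const table = await database.createTable<TableSchema, TablePrimaryKey>(
    "TABLE_NAME",
    { definition: tableDefinition },
  );
})();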

import { DataAPIClient, DataAPIVector, Table } from "@datastax/astra-db-ts";

// Instantiate the client
const client = new DataAPIClient();

// Connect to a database
const database = client.db(process.env.API_ENDPOINT, {
  token: process.env.APPLICATION_TOKEN,
});

// Define the columns and primary key for the table
const tableDefinition = Table.schema({
  columns: {
    // This column will store vector embeddings.
    // The Azure OpenAI integration
    // will automatically generate vector embeddings
    // for any text inserted to this column.
    VECTOR_COLUMN_NAME: {
      type: "vector",
      dimension: MODEL_DIMENSIONS,
      service: {
        provider: 'azureOpenAI',
        modelName: 'MODEL_NAME',
        authentication: {
          providerKey: 'API_KEY_NAME',
        },
        parameters: {
          resourceName: 'RESOURCE_NAME',
          deploymentId: 'DEPLOYMENT_ID',
        },
      },
    },
    // If you want to store the original text
    // in addition to the generated embeddings
    // you must create a separate column.
    TEXT_COLUMN_NAME: "text",
  },
  // You should change the primary key definition to meet the needs of your data.
  primaryKey: {
    partitionBy: ["TEXT_COLUMN_NAME"],
  },
});

// Manually define the type of the table's schema and primary key
type TableSchema = {
  VECTOR_COLUMN_NAME: DataAPIVector,
  TEXT_COLUMN_NAME: string;
};

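type TablePrimaryKey = Pick<TableSchema, "TEXT_COLUMN_NAME">;

(async function () {
  // Create the table
  const table = await database.createTable<TableSchema, TablePrimaryKey>(
    "TABLE_NAME",
    { definition: tableDefinition },
  );
})();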

import { DataAPIClient, SomeRow, Table } from "@datastax/astra-db-ts";

// Instantiate the client
const client = new DataAPIClient();

// Connect to a database
const database = client.db(process.env.API_ENDPOINT, {
  token: process.env.APPLICATION_TOKEN,
});

// Define the columns and primary key for the table
const tableDefinition = Table.schema({
  columns: {
    // This column will store vector embeddings.
    // The Azure OpenAI integration
    // will automatically generate vector embeddings
    // for any text inserted to this column.
    VECTOR_COLUMN_NAME: {
      type: "vector",
      dimension: MODEL_DIMENSIONS,
      service: {
        provider: 'azureOpenAI',
        modelName: 'MODEL_NAME',
        authentication: {
          providerKey: 'API_KEY_NAME',
        },
        parameters: {
          resourceName: 'RESOURCE_NAME',
          deploymentId: 'DEPLOYMENT_ID',
        },
      },
    },
    // If you want to store the original text
    // in addition to the generated embeddings
    // you must create a separate column.
    TEXT_COLUMN_NAME: "text",
  },
  // You should change the primary key definition to meet the needs of your data.
  primaryKey: {
    partitionBy: ["TEXT_COLUMN_NAME"],
  },
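});

(async function () {
  // Create the table
  const table = await database.createTable<SomeRow>("TABLE_NAME", {
    definition: tableDefinition,
  });
})();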

Replace the following:

TABLE_NAME: The name for your table.
VECTOR_COLUMN_NAME: The name for your vector column.
TEXT_COLUMN_NAME: The name for the text column that will store the original text. Omit this column if you won’t store the original text in addition to the generated embeddings.
API_KEY_NAME: The name of the Azure OpenAI API key that you want to use. Must be the name of an existing Azure OpenAI API key in the Astra Portal. For more information, see Embedding provider authentication.

Alternatively, you can omit this parameter and instead provide the authentication key in the embeddingApiKey parameter when you instantiate a Table object with the commands to create a table or get a table. The client will send the x-embedding-api-key header with the specified key to any underlying HTTP request that requires vectorize authentication. Header authentication overrides the API_KEY_NAME parameter if you set both. If you use the header instead of specifying the API_KEY_NAME parameter, you must include the header in every command that uses vectorize, including writes and vector search. You can use this authentication method only if all affected columns use the same embedding provider.
MODEL_NAME: The model that you want to use to generate embeddings. The available models are: text-embedding-3-small, text-embedding-3-large, text-embedding-ada-002.

For Azure OpenAI, you must select the model that matches the one deployed to your DEPLOYMENT_ID in Azure.
MODEL_DIMENSIONS: The number of dimensions that you want the generated vectors to have. Your chosen embedding model must support the specified number of dimensions.

If you omit dimension, Astra can use a default dimension value. However, some models don’t have default dimensions. You can use the Data API to find supported embedding providers and their configuration parameters, including dimension ranges and default dimensions.
RESOURCE_NAME: The name of your Azure OpenAI Service resource, as defined in the resource’s Instance details. For more information, see the Azure OpenAI documentation.
DEPLOYMENT_ID: Your Azure OpenAI resource’s Deployment name. For more information, see the Azure OpenAI documentation.
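
The two authentication options described for API_KEY_NAME differ in where the credential goes. The following sketch illustrates that choice with plain objects. The planVectorizeAuth helper and its types are hypothetical names used only for illustration and are not part of the client. It assumes, as described above, that a stored key is referenced by name in the column's service definition, while a header key is passed through the embeddingApiKey option when you create or get the table. The model name is one of the listed models, chosen arbitrarily.

```typescript
// Illustrative shapes only; the real types come from @datastax/astra-db-ts.
interface ServiceSketch {
  provider: string;
  modelName: string;
  authentication?: { providerKey: string };
  parameters: { resourceName: string; deploymentId: string };
}

// Either reference a key stored in Astra by name, or send the key as a header.
type VectorizeAuth =
  | { kind: "storedKey"; providerKey: string }
  | { kind: "header"; embeddingApiKey: string };

// Hypothetical helper: decides where the credential goes.
function planVectorizeAuth(auth: VectorizeAuth): {
  service: ServiceSketch;
  tableOptions: { embeddingApiKey?: string };
} {
  const service: ServiceSketch = {
    provider: "azureOpenAI",
    modelName: "text-embedding-3-small",
    parameters: { resourceName: "RESOURCE_NAME", deploymentId: "DEPLOYMENT_ID" },
  };
  if (auth.kind === "storedKey") {
    // The key name lives in the column definition; no header is needed.
    service.authentication = { providerKey: auth.providerKey };
    return { service, tableOptions: {} };
  }
  // Header authentication: omit `authentication` and pass the key as an option.
  return { service, tableOptions: { embeddingApiKey: auth.embeddingApiKey } };
}

const stored = planVectorizeAuth({ kind: "storedKey", providerKey: "API_KEY_NAME" });
const header = planVectorizeAuth({ kind: "header", embeddingApiKey: "EMBEDDING_API_KEY" });
console.log(stored.service.authentication, header.tableOptions);
```

In a real call, the service object would go into the vector column definition, and the table options would be merged into the options that you pass alongside the definition when you create the table.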

Hugging Face - Dedicated

For more detailed instructions, see Integrate Hugging Face Dedicated as an embedding provider.

Automatic type inference
Manually typed tables
Untyped tables

import {
  DataAPIClient,
  InferTablePrimaryKey,
  InferTableSchema,
  Table,
} from "@datastax/astra-db-ts";

// Instantiate the client
const client = new DataAPIClient();

// Connect to a database
const database = client.db(process.env.API_ENDPOINT, {
  token: process.env.APPLICATION_TOKEN,
});

// Define the columns and primary key for the table
const tableDefinition = Table.schema({
  columns: {
    // This column will store vector embeddings.
    // The Hugging Face Dedicated integration
    // will automatically generate vector embeddings
    // for any text inserted to this column.
    VECTOR_COLUMN_NAME: {
      type: "vector",
      dimension: MODEL_DIMENSIONS,
      service: {
        provider: 'huggingfaceDedicated',
        modelName: 'endpoint-defined-model',
        authentication: {
          providerKey: 'API_KEY_NAME',
        },
        parameters: {
          endpointName: 'ENDPOINT_NAME',
          regionName: 'REGION_NAME',
          cloudName: 'CLOUD_NAME',
        },
      },
    },
    // If you want to store the original text
    // in addition to the generated embeddings
    // you must create a separate column.
    TEXT_COLUMN_NAME: "text",
  },
  // You should change the primary key definition to meet the needs of your data.
  primaryKey: {
    partitionBy: ["TEXT_COLUMN_NAME"],
  },
});

// Infer the TypeScript-equivalent type of the table's schema and primary key
type TableSchema = InferTableSchema<typeof tableDefinition>;
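type TablePrimaryKey = InferTablePrimaryKey<typeof tableDefinition>;

(async function () {
  // Create the table
  const table = await database.createTable<TableSchema, TablePrimaryKey>(
    "TABLE_NAME",
    { definition: tableDefinition },
  );
})();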

import { DataAPIClient, DataAPIVector, Table } from "@datastax/astra-db-ts";

// Instantiate the client
const client = new DataAPIClient();

// Connect to a database
const database = client.db(process.env.API_ENDPOINT, {
  token: process.env.APPLICATION_TOKEN,
});

// Define the columns and primary key for the table
const tableDefinition = Table.schema({
  columns: {
    // This column will store vector embeddings.
    // The Hugging Face Dedicated integration
    // will automatically generate vector embeddings
    // for any text inserted to this column.
    VECTOR_COLUMN_NAME: {
      type: "vector",
      dimension: MODEL_DIMENSIONS,
      service: {
        provider: 'huggingfaceDedicated',
        modelName: 'endpoint-defined-model',
        authentication: {
          providerKey: 'API_KEY_NAME',
        },
        parameters: {
          endpointName: 'ENDPOINT_NAME',
          regionName: 'REGION_NAME',
          cloudName: 'CLOUD_NAME',
        },
      },
    },
    // If you want to store the original text
    // in addition to the generated embeddings
    // you must create a separate column.
    TEXT_COLUMN_NAME: "text",
  },
  // You should change the primary key definition to meet the needs of your data.
  primaryKey: {
    partitionBy: ["TEXT_COLUMN_NAME"],
  },
});

// Manually define the type of the table's schema and primary key
type TableSchema = {
  VECTOR_COLUMN_NAME: DataAPIVector,
  TEXT_COLUMN_NAME: string;
};

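type TablePrimaryKey = Pick<TableSchema, "TEXT_COLUMN_NAME">;

(async function () {
  // Create the table
  const table = await database.createTable<TableSchema, TablePrimaryKey>(
    "TABLE_NAME",
    { definition: tableDefinition },
  );
})();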

import { DataAPIClient, SomeRow, Table } from "@datastax/astra-db-ts";

// Instantiate the client
const client = new DataAPIClient();

// Connect to a database
const database = client.db(process.env.API_ENDPOINT, {
  token: process.env.APPLICATION_TOKEN,
});

// Define the columns and primary key for the table
const tableDefinition = Table.schema({
  columns: {
    // This column will store vector embeddings.
    // The Hugging Face Dedicated integration
    // will automatically generate vector embeddings
    // for any text inserted to this column.
    VECTOR_COLUMN_NAME: {
      type: "vector",
      dimension: MODEL_DIMENSIONS,
      service: {
        provider: 'huggingfaceDedicated',
        modelName: 'endpoint-defined-model',
        authentication: {
          providerKey: 'API_KEY_NAME',
        },
        parameters: {
          endpointName: 'ENDPOINT_NAME',
          regionName: 'REGION_NAME',
          cloudName: 'CLOUD_NAME',
        },
      },
    },
    // If you want to store the original text
    // in addition to the generated embeddings
    // you must create a separate column.
    TEXT_COLUMN_NAME: "text",
  },
  // You should change the primary key definition to meet the needs of your data.
  primaryKey: {
    partitionBy: ["TEXT_COLUMN_NAME"],
  },
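});

(async function () {
  // Create the table
  const table = await database.createTable<SomeRow>("TABLE_NAME", {
    definition: tableDefinition,
  });
})();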

Replace the following:

TABLE_NAME: The name for your table.
VECTOR_COLUMN_NAME: The name for your vector column.
TEXT_COLUMN_NAME: The name for the text column that will store the original text. Omit this column if you won’t store the original text in addition to the generated embeddings.
API_KEY_NAME: The name of the Hugging Face Dedicated user access token that you want to use. Must be the name of an existing Hugging Face Dedicated user access token in the Astra Portal. For more information, see Embedding provider authentication.

Alternatively, you can omit this parameter and instead provide the authentication key in the embeddingApiKey parameter when you instantiate a Table object with the commands to create a table or get a table. The client will send the x-embedding-api-key header with the specified key to any underlying HTTP request that requires vectorize authentication. Header authentication overrides the API_KEY_NAME parameter if you set both. If you use the header instead of specifying the API_KEY_NAME parameter, you must include the header in every command that uses vectorize, including writes and vector search. You can use this authentication method only if all affected columns use the same embedding provider.
MODEL_NAME: The model that you want to use to generate embeddings. The available models are: endpoint-defined-model.

For Hugging Face Dedicated, you must deploy the model as a text embeddings inference (TEI) container.

You must set MODEL_NAME to endpoint-defined-model because this integration uses the model specified in your dedicated endpoint configuration.
MODEL_DIMENSIONS: The number of dimensions that you want the generated vectors to have. Your chosen embedding model must support the specified number of dimensions.

If you omit dimension, Astra can use a default dimension value. However, some models don’t have default dimensions. You can use the Data API to find supported embedding providers and their configuration parameters, including dimension ranges and default dimensions.
ENDPOINT_NAME: The programmatically generated name of your Hugging Face Dedicated endpoint. This is the first part of the endpoint URL. For example, if your endpoint URL is https://mtp1x7muf6qyn3yh.us-east-2.aws.endpoints.huggingface.cloud, the endpoint name is mtp1x7muf6qyn3yh.
REGION_NAME: The cloud provider region your Hugging Face Dedicated endpoint is deployed to. For example, us-east-2.
CLOUD_NAME: The cloud provider your Hugging Face Dedicated endpoint is deployed to. For example, aws.
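
The three endpoint parameters can be read directly from the endpoint URL. The following is a minimal sketch, assuming that every Hugging Face Dedicated endpoint URL follows the same layout as the example above (ENDPOINT_NAME.REGION_NAME.CLOUD_NAME.endpoints.huggingface.cloud). The parseDedicatedEndpointUrl helper and its interface are hypothetical and are not part of the client.

```typescript
// Hypothetical helper, not part of @datastax/astra-db-ts.
// Shape of the `parameters` object used by the huggingfaceDedicated provider.
interface HuggingFaceDedicatedParameters {
  endpointName: string;
  regionName: string;
  cloudName: string;
}

// Assumes the URL layout shown in the example above:
// https://ENDPOINT_NAME.REGION_NAME.CLOUD_NAME.endpoints.huggingface.cloud
function parseDedicatedEndpointUrl(
  endpointUrl: string,
): HuggingFaceDedicatedParameters {
  const pattern =
    /^https:\/\/([^.\/]+)\.([^.\/]+)\.([^.\/]+)\.endpoints\.huggingface\.cloud\/?$/;
  const match = pattern.exec(endpointUrl);
  if (match === null) {
    throw new Error(`Unexpected endpoint URL format: ${endpointUrl}`);
  }
  // Capture groups 1 to 3 hold the endpoint name, region, and cloud provider.
  const [, endpointName = "", regionName = "", cloudName = ""] = match;
  return { endpointName, regionName, cloudName };
}

const parameters = parseDedicatedEndpointUrl(
  "https://mtp1x7muf6qyn3yh.us-east-2.aws.endpoints.huggingface.cloud",
);
console.log(parameters);
```

The returned object has the same keys as the parameters object in the table definition, so it could be passed there as-is.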

Hugging Face - Serverless

For more detailed instructions, see Integrate Hugging Face Serverless as an embedding provider.

Automatic type inference
Manually typed tables
Untyped tables

import {
  DataAPIClient,
  InferTablePrimaryKey,
  InferTableSchema,
  Table,
} from "@datastax/astra-db-ts";

// Instantiate the client
const client = new DataAPIClient();

// Connect to a database
const database = client.db(process.env.API_ENDPOINT, {
  token: process.env.APPLICATION_TOKEN,
});

// Define the columns and primary key for the table
const tableDefinition = Table.schema({
  columns: {
    // This column will store vector embeddings.
    // The Hugging Face Serverless integration
    // will automatically generate vector embeddings
    // for any text inserted to this column.
    VECTOR_COLUMN_NAME: {
      type: "vector",
      dimension: MODEL_DIMENSIONS,
      service: {
        provider: 'huggingface',
        modelName: 'MODEL_NAME',
        authentication: {
          providerKey: 'API_KEY_NAME',
        },
      },
    },
    // If you want to store the original text
    // in addition to the generated embeddings
    // you must create a separate column.
    TEXT_COLUMN_NAME: "text",
  },
  // You should change the primary key definition to meet the needs of your data.
  primaryKey: {
    partitionBy: ["TEXT_COLUMN_NAME"],
  },
});

// Infer the TypeScript-equivalent type of the table's schema and primary key
type TableSchema = InferTableSchema<typeof tableDefinition>;
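type TablePrimaryKey = InferTablePrimaryKey<typeof tableDefinition>;

(async function () {
  // Create the table
  const table = await database.createTable<TableSchema, TablePrimaryKey>(
    "TABLE_NAME",
    { definition: tableDefinition },
  );
})();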

import { DataAPIClient, DataAPIVector, Table } from "@datastax/astra-db-ts";

// Instantiate the client
const client = new DataAPIClient();

// Connect to a database
const database = client.db(process.env.API_ENDPOINT, {
  token: process.env.APPLICATION_TOKEN,
});

// Define the columns and primary key for the table
const tableDefinition = Table.schema({
  columns: {
    // This column will store vector embeddings.
    // The Hugging Face Serverless integration
    // will automatically generate vector embeddings
    // for any text inserted to this column.
    VECTOR_COLUMN_NAME: {
      type: "vector",
      dimension: MODEL_DIMENSIONS,
      service: {
        provider: 'huggingface',
        modelName: 'MODEL_NAME',
        authentication: {
          providerKey: 'API_KEY_NAME',
        },
      },
    },
    // If you want to store the original text
    // in addition to the generated embeddings
    // you must create a separate column.
    TEXT_COLUMN_NAME: "text",
  },
  // You should change the primary key definition to meet the needs of your data.
  primaryKey: {
    partitionBy: ["TEXT_COLUMN_NAME"],
  },
});

// Manually define the type of the table's schema and primary key
type TableSchema = {
  VECTOR_COLUMN_NAME: DataAPIVector,
  TEXT_COLUMN_NAME: string;
};

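type TablePrimaryKey = Pick<TableSchema, "TEXT_COLUMN_NAME">;

(async function () {
  // Create the table
  const table = await database.createTable<TableSchema, TablePrimaryKey>(
    "TABLE_NAME",
    { definition: tableDefinition },
  );
})();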

import { DataAPIClient, SomeRow, Table } from "@datastax/astra-db-ts";

// Instantiate the client
const client = new DataAPIClient();

// Connect to a database
const database = client.db(process.env.API_ENDPOINT, {
  token: process.env.APPLICATION_TOKEN,
});

// Define the columns and primary key for the table
const tableDefinition = Table.schema({
  columns: {
    // This column will store vector embeddings.
    // The Hugging Face Serverless integration
    // will automatically generate vector embeddings
    // for any text inserted to this column.
    VECTOR_COLUMN_NAME: {
      type: "vector",
      dimension: MODEL_DIMENSIONS,
      service: {
        provider: 'huggingface',
        modelName: 'MODEL_NAME',
        authentication: {
          providerKey: 'API_KEY_NAME',
        },
      },
    },
    // If you want to store the original text
    // in addition to the generated embeddings
    // you must create a separate column.
    TEXT_COLUMN_NAME: "text",
  },
  // You should change the primary key definition to meet the needs of your data.
  primaryKey: {
    partitionBy: ["TEXT_COLUMN_NAME"],
  },
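});

(async function () {
  // Create the table
  const table = await database.createTable<SomeRow>("TABLE_NAME", {
    definition: tableDefinition,
  });
})();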

Replace the following:

TABLE_NAME: The name for your table.
VECTOR_COLUMN_NAME: The name for your vector column.
TEXT_COLUMN_NAME: The name for the text column that will store the original text. Omit this column if you won’t store the original text in addition to the generated embeddings.
API_KEY_NAME: The name of the Hugging Face Serverless user access token that you want to use. Must be the name of an existing Hugging Face Serverless user access token in the Astra Portal. For more information, see Embedding provider authentication.

Alternatively, you can omit this parameter and instead provide the authentication key in the embeddingApiKey parameter when you instantiate a Table object with the commands to create a table or get a table. The client will send the x-embedding-api-key header with the specified key to any underlying HTTP request that requires vectorize authentication. Header authentication overrides the API_KEY_NAME parameter if you set both. If you use the header instead of specifying the API_KEY_NAME parameter, you must include the header in every command that uses vectorize, including writes and vector search. You can use this authentication method only if all affected columns use the same embedding provider.
MODEL_NAME: The model that you want to use to generate embeddings. The available models are: sentence-transformers/all-MiniLM-L6-v2, intfloat/multilingual-e5-large, intfloat/multilingual-e5-large-instruct, BAAI/bge-small-en-v1.5, BAAI/bge-base-en-v1.5, BAAI/bge-large-en-v1.5.
MODEL_DIMENSIONS: The number of dimensions that you want the generated vectors to have. Your chosen embedding model must support the specified number of dimensions.

If you omit dimension, Astra can use a default dimension value. However, some models don’t have default dimensions. You can use the Data API to find supported embedding providers and their configuration parameters, including dimension ranges and default dimensions.
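
Because the dimension is optional, a column definition can be built so that the key is present only when you supply a value. The following sketch shows one way to do that with a plain object. The buildVectorColumn helper and the VectorColumnSketch interface are hypothetical and are not part of the client, and the value 384 is an arbitrary illustration rather than a recommendation for any particular model.

```typescript
// Illustrative shape only; the real column types come from @datastax/astra-db-ts.
interface VectorColumnSketch {
  type: "vector";
  dimension?: number;
  service: {
    provider: string;
    modelName: string;
    authentication: { providerKey: string };
  };
}

// Hypothetical helper: includes `dimension` only when a value is supplied,
// so that the database can fall back to the model's default, if it has one.
function buildVectorColumn(
  modelName: string,
  providerKey: string,
  dimension?: number,
): VectorColumnSketch {
  const column: VectorColumnSketch = {
    type: "vector",
    service: {
      provider: "huggingface",
      modelName,
      authentication: { providerKey },
    },
  };
  if (dimension !== undefined) {
    // Reject values that cannot be a vector length.
    if (!isFinite(dimension) || Math.floor(dimension) !== dimension || dimension <= 0) {
      throw new RangeError(`dimension must be a positive integer, got ${dimension}`);
    }
    column.dimension = dimension;
  }
  return column;
}

const withDefault = buildVectorColumn("MODEL_NAME", "API_KEY_NAME");
const explicit = buildVectorColumn("MODEL_NAME", "API_KEY_NAME", 384);
console.log("dimension" in withDefault, explicit.dimension);
```

If the chosen model has no default dimension, omitting the value is not an option, so check the provider's configuration before relying on the first form.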

Jina AI

For more detailed instructions, see Integrate Jina AI as an embedding provider.

Automatic type inference
Manually typed tables
Untyped tables

import {
  DataAPIClient,
  InferTablePrimaryKey,
  InferTableSchema,
  Table,
} from "@datastax/astra-db-ts";

// Instantiate the client
const client = new DataAPIClient();

// Connect to a database
const database = client.db(process.env.API_ENDPOINT, {
  token: process.env.APPLICATION_TOKEN,
});

// Define the columns and primary key for the table
const tableDefinition = Table.schema({
  columns: {
    // This column will store vector embeddings.
    // The Jina AI integration
    // will automatically generate vector embeddings
    // for any text inserted to this column.
    VECTOR_COLUMN_NAME: {
      type: "vector",
      dimension: MODEL_DIMENSIONS,
      service: {
        provider: 'jinaAI',
        modelName: 'MODEL_NAME',
        authentication: {
          providerKey: 'API_KEY_NAME',
        },
      },
    },
    // If you want to store the original text
    // in addition to the generated embeddings
    // you must create a separate column.
    TEXT_COLUMN_NAME: "text",
  },
  // You should change the primary key definition to meet the needs of your data.
  primaryKey: {
    partitionBy: ["TEXT_COLUMN_NAME"],
  },
});

// Infer the TypeScript-equivalent type of the table's schema and primary key
type TableSchema = InferTableSchema<typeof tableDefinition>;
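type TablePrimaryKey = InferTablePrimaryKey<typeof tableDefinition>;

(async function () {
  // Create the table
  const table = await database.createTable<TableSchema, TablePrimaryKey>(
    "TABLE_NAME",
    { definition: tableDefinition },
  );
})();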

import { DataAPIClient, DataAPIVector, Table } from "@datastax/astra-db-ts";

// Instantiate the client
const client = new DataAPIClient();

// Connect to a database
const database = client.db(process.env.API_ENDPOINT, {
  token: process.env.APPLICATION_TOKEN,
});

// Define the columns and primary key for the table
const tableDefinition = Table.schema({
  columns: {
    // This column will store vector embeddings.
    // The Jina AI integration
    // will automatically generate vector embeddings
    // for any text inserted to this column.
    VECTOR_COLUMN_NAME: {
      type: "vector",
      dimension: MODEL_DIMENSIONS,
      service: {
        provider: 'jinaAI',
        modelName: 'MODEL_NAME',
        authentication: {
          providerKey: 'API_KEY_NAME',
        },
      },
    },
    // If you want to store the original text
    // in addition to the generated embeddings
    // you must create a separate column.
    TEXT_COLUMN_NAME: "text",
  },
  // You should change the primary key definition to meet the needs of your data.
  primaryKey: {
    partitionBy: ["TEXT_COLUMN_NAME"],
  },
});

// Manually define the type of the table's schema and primary key
type TableSchema = {
  VECTOR_COLUMN_NAME: DataAPIVector,
  TEXT_COLUMN_NAME: string;
};

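type TablePrimaryKey = Pick<TableSchema, "TEXT_COLUMN_NAME">;

(async function () {
  // Create the table
  const table = await database.createTable<TableSchema, TablePrimaryKey>(
    "TABLE_NAME",
    { definition: tableDefinition },
  );
})();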

import { DataAPIClient, SomeRow, Table } from "@datastax/astra-db-ts";

// Instantiate the client
const client = new DataAPIClient();

// Connect to a database
const database = client.db(process.env.API_ENDPOINT, {
  token: process.env.APPLICATION_TOKEN,
});

// Define the columns and primary key for the table
const tableDefinition = Table.schema({
  columns: {
    // This column will store vector embeddings.
    // The Jina AI integration
    // will automatically generate vector embeddings
    // for any text inserted to this column.
    VECTOR_COLUMN_NAME: {
      type: "vector",
      dimension: MODEL_DIMENSIONS,
      service: {
        provider: 'jinaAI',
        modelName: 'MODEL_NAME',
        authentication: {
          providerKey: 'API_KEY_NAME',
        },
      },
    },
    // If you want to store the original text
    // in addition to the generated embeddings
    // you must create a separate column.
    TEXT_COLUMN_NAME: "text",
  },
  // You should change the primary key definition to meet the needs of your data.
  primaryKey: {
    partitionBy: ["TEXT_COLUMN_NAME"],
  },
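});

(async function () {
  // Create the table
  const table = await database.createTable<SomeRow>("TABLE_NAME", {
    definition: tableDefinition,
  });
})();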

Replace the following:

TABLE_NAME: The name for your table.
VECTOR_COLUMN_NAME: The name for your vector column.
TEXT_COLUMN_NAME: The name for the text column that will store the original text. Omit this column if you won’t store the original text in addition to the generated embeddings.
API_KEY_NAME: The name of the Jina AI API key that you want to use. Must be the name of an existing Jina AI API key in the Astra Portal. For more information, see Embedding provider authentication.

Alternatively, you can omit this parameter and instead provide the authentication key in the embeddingApiKey parameter when you instantiate a Table object with the commands to create a table or get a table. The client will send the x-embedding-api-key header with the specified key to any underlying HTTP request that requires vectorize authentication. Header authentication overrides the API_KEY_NAME parameter if you set both. If you use the header instead of specifying the API_KEY_NAME parameter, you must include the header in every command that uses vectorize, including writes and vector search. You can use this authentication method only if all affected columns use the same embedding provider.
MODEL_NAME: The model that you want to use to generate embeddings. The available models are: jina-embeddings-v2-base-en, jina-embeddings-v2-base-de, jina-embeddings-v2-base-es, jina-embeddings-v2-base-code, jina-embeddings-v2-base-zh.
MODEL_DIMENSIONS: The number of dimensions that you want the generated vectors to have. Your chosen embedding model must support the specified number of dimensions.

If you omit dimension, Astra can use a default dimension value. However, some models don’t have default dimensions. You can use the Data API to find supported embedding providers and their configuration parameters, including dimension ranges and default dimensions.
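
A typo in MODEL_NAME is only reported once the request reaches the database, so it can help to check the name locally first. The following sketch encodes the model list above as a TypeScript union type with a runtime guard. The names JINA_AI_MODELS, isJinaAIModel, and requireJinaAIModel are hypothetical and are not part of the client, and the list is copied from this page, so it may fall out of date.

```typescript
// Model names copied from the list above; the supported set may change over time.
const JINA_AI_MODELS = [
  "jina-embeddings-v2-base-en",
  "jina-embeddings-v2-base-de",
  "jina-embeddings-v2-base-es",
  "jina-embeddings-v2-base-code",
  "jina-embeddings-v2-base-zh",
] as const;

// Union of the literal model names in the list.
type JinaAIModel = (typeof JINA_AI_MODELS)[number];

// Hypothetical type guard, not part of @datastax/astra-db-ts.
function isJinaAIModel(name: string): name is JinaAIModel {
  return JINA_AI_MODELS.some((model) => model === name);
}

// Hypothetical helper: fail fast on an unknown name before sending a request.
function requireJinaAIModel(name: string): JinaAIModel {
  if (!isJinaAIModel(name)) {
    throw new Error(`Unsupported Jina AI model: ${name}`);
  }
  return name;
}

const modelName = requireJinaAIModel("jina-embeddings-v2-base-en");
console.log(modelName);
```

The same pattern could be applied to the model lists of the other providers.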

Mistral AI

For more detailed instructions, see Integrate Mistral AI as an embedding provider.

Automatic type inference
Manually typed tables
Untyped tables

import {
  DataAPIClient,
  InferTablePrimaryKey,
  InferTableSchema,
  Table,
} from "@datastax/astra-db-ts";

// Instantiate the client
const client = new DataAPIClient();

// Connect to a database
const database = client.db(process.env.API_ENDPOINT, {
  token: process.env.APPLICATION_TOKEN,
});

// Define the columns and primary key for the table
const tableDefinition = Table.schema({
  columns: {
    // This column will store vector embeddings.
    // The Mistral AI integration
    // will automatically generate vector embeddings
    // for any text inserted to this column.
    VECTOR_COLUMN_NAME: {
      type: "vector",
      dimension: MODEL_DIMENSIONS,
      service: {
        provider: 'mistral',
        modelName: 'MODEL_NAME',
        authentication: {
          providerKey: 'API_KEY_NAME',
        },
      },
    },
    // If you want to store the original text
    // in addition to the generated embeddings
    // you must create a separate column.
    TEXT_COLUMN_NAME: "text",
  },
  // You should change the primary key definition to meet the needs of your data.
  primaryKey: {
    partitionBy: ["TEXT_COLUMN_NAME"],
  },
});

// Infer the TypeScript-equivalent type of the table's schema and primary key
type TableSchema = InferTableSchema<typeof tableDefinition>;
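type TablePrimaryKey = InferTablePrimaryKey<typeof tableDefinition>;

(async function () {
  // Create the table
  const table = await database.createTable<TableSchema, TablePrimaryKey>(
    "TABLE_NAME",
    { definition: tableDefinition },
  );
})();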

import { DataAPIClient, DataAPIVector, Table } from "@datastax/astra-db-ts";

// Instantiate the client
const client = new DataAPIClient();

// Connect to a database
const database = client.db(process.env.API_ENDPOINT, {
  token: process.env.APPLICATION_TOKEN,
});

// Define the columns and primary key for the table
const tableDefinition = Table.schema({
  columns: {
    // This column will store vector embeddings.
    // The Mistral AI integration
    // will automatically generate vector embeddings
    // for any text inserted to this column.
    VECTOR_COLUMN_NAME: {
      type: "vector",
      dimension: MODEL_DIMENSIONS,
      service: {
        provider: 'mistral',
        modelName: 'MODEL_NAME',
        authentication: {
          providerKey: 'API_KEY_NAME',
        },
      },
    },
    // If you want to store the original text
    // in addition to the generated embeddings
    // you must create a separate column.
    TEXT_COLUMN_NAME: "text",
  },
  // You should change the primary key definition to meet the needs of your data.
  primaryKey: {
    partitionBy: ["TEXT_COLUMN_NAME"],
  },
});

// Manually define the type of the table's schema and primary key
type TableSchema = {
  VECTOR_COLUMN_NAME: DataAPIVector,
  TEXT_COLUMN_NAME: string;
};

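type TablePrimaryKey = Pick<TableSchema, "TEXT_COLUMN_NAME">;

(async function () {
  // Create the table
  const table = await database.createTable<TableSchema, TablePrimaryKey>(
    "TABLE_NAME",
    { definition: tableDefinition },
  );
})();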

import { DataAPIClient, SomeRow, Table } from "@datastax/astra-db-ts";

// Instantiate the client
const client = new DataAPIClient();

// Connect to a database
const database = client.db(process.env.API_ENDPOINT, {
  token: process.env.APPLICATION_TOKEN,
});

// Define the columns and primary key for the table
const tableDefinition = Table.schema({
  columns: {
    // This column will store vector embeddings.
    // The Mistral AI integration
    // will automatically generate vector embeddings
    // for any text inserted to this column.
    VECTOR_COLUMN_NAME: {
      type: "vector",
      dimension: MODEL_DIMENSIONS,
      service: {
        provider: 'mistral',
        modelName: 'MODEL_NAME',
        authentication: {
          providerKey: 'API_KEY_NAME',
        },
      },
    },
    // If you want to store the original text
    // in addition to the generated embeddings
    // you must create a separate column.
    TEXT_COLUMN_NAME: "text",
  },
  // You should change the primary key definition to meet the needs of your data.
  primaryKey: {
    partitionBy: ["TEXT_COLUMN_NAME"],
  },
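});

(async function () {
  // Create the table
  const table = await database.createTable<SomeRow>("TABLE_NAME", {
    definition: tableDefinition,
  });
})();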

Replace the following:

TABLE_NAME: The name for your table.
VECTOR_COLUMN_NAME: The name for your vector column.
TEXT_COLUMN_NAME: The name for the text column that will store the original text. Omit this column if you won’t store the original text in addition to the generated embeddings.
API_KEY_NAME: The name of the Mistral AI API key that you want to use. Must be the name of an existing Mistral AI API key in the Astra Portal. For more information, see Embedding provider authentication.

Alternatively, you can omit this parameter and instead provide the authentication key in the embeddingApiKey parameter when you instantiate a Table object with the commands to create a table or get a table. The client will send the x-embedding-api-key header with the specified key to any underlying HTTP request that requires vectorize authentication. Header authentication overrides the API_KEY_NAME parameter if you set both. If you use the header instead of specifying the API_KEY_NAME parameter, you must include the header in every command that uses vectorize, including writes and vector search. You can use this authentication method only if all affected columns use the same embedding provider.
MODEL_NAME: The model that you want to use to generate embeddings. The only available model is mistral-embed.
MODEL_DIMENSIONS: The number of dimensions that you want the generated vectors to have. Your chosen embedding model must support the specified number of dimensions.

If you omit dimension, Astra can use a default dimension value. However, some models don’t have default dimensions. You can use the Data API to find supported embedding providers and their configuration parameters, including dimension ranges and default dimensions.
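The provider lookup mentioned above can be sketched as a plain HTTP request. This is a minimal sketch, assuming the Data API's findEmbeddingProviders command is sent as a POST to the database's /api/json/v1 path with the application token in a Token header. The helper name buildFindEmbeddingProvidersRequest, the DataApiRequest type, and the example endpoint are hypothetical and not part of the client library.

```typescript
// Hypothetical helper: describes the HTTP request for the findEmbeddingProviders command.
// Assumption: the command is POSTed to `${endpoint}/api/json/v1` with a `Token` header.
interface DataApiRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildFindEmbeddingProvidersRequest(
  endpoint: string,
  token: string,
): DataApiRequest {
  // Strip trailing slashes so the path is joined with exactly one "/".
  const base = endpoint.replace(/\/+$/, "");
  return {
    url: `${base}/api/json/v1`,
    method: "POST",
    headers: {
      "Token": token,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ findEmbeddingProviders: {} }),
  };
}

// Example with placeholder values. To run the request, pass the result to fetch():
//   const { url, ...init } = buildFindEmbeddingProvidersRequest(endpoint, token);
//   const response = await fetch(url, init);
const request = buildFindEmbeddingProvidersRequest(
  "https://example-database.example.com/",
  "APPLICATION_TOKEN",
);
console.log(request.url);
```

Keeping the request description separate from the network call makes it easy to inspect or log before sending. The response is expected to list the providers with their models and dimension settings, but check the actual payload rather than relying on this sketch.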

NVIDIA

For more detailed instructions, see Integrate NVIDIA as an embedding provider. Your database must be in a supported region.

Automatic type inference
Manually typed tables
Untyped tables

import {
  DataAPIClient,
  InferTablePrimaryKey,
  InferTableSchema,
  Table,
} from "@datastax/astra-db-ts";

// Instantiate the client
const client = new DataAPIClient();

// Connect to a database
const database = client.db(process.env.API_ENDPOINT, {
  token: process.env.APPLICATION_TOKEN,
});

// Define the columns and primary key for the table
const tableDefinition = Table.schema({
  columns: {
    // This column will store vector embeddings.
    // The NVIDIA integration
    // will automatically generate vector embeddings
    // for any text inserted to this column.
    VECTOR_COLUMN_NAME: {
      type: "vector",
      service: {
        provider: 'nvidia',
        modelName: 'nvidia/nv-embedqa-e5-v5',
      },
    },
    // If you want to store the original text
    // in addition to the generated embeddings
    // you must create a separate column.
    TEXT_COLUMN_NAME: "text",
  },
  // You should change the primary key definition to meet the needs of your data.
  primaryKey: {
    partitionBy: ["TEXT_COLUMN_NAME"],
  },
});

// Infer the TypeScript-equivalent type of the table's schema and primary key
type TableSchema = InferTableSchema<typeof tableDefinition>;
type TablePrimaryKey = InferTablePrimaryKey<typeof tableDefinition>;

import { DataAPIClient, DataAPIVector, Table } from "@datastax/astra-db-ts";

// Instantiate the client
const client = new DataAPIClient();

// Connect to a database
const database = client.db(process.env.API_ENDPOINT, {
  token: process.env.APPLICATION_TOKEN,
});

// Define the columns and primary key for the table
const tableDefinition = Table.schema({
  columns: {
    // This column will store vector embeddings.
    // The NVIDIA integration
    // will automatically generate vector embeddings
    // for any text inserted to this column.
    VECTOR_COLUMN_NAME: {
      type: "vector",
      service: {
        provider: 'nvidia',
        modelName: 'nvidia/nv-embedqa-e5-v5',
      },
    },
    // If you want to store the original text
    // in addition to the generated embeddings
    // you must create a separate column.
    TEXT_COLUMN_NAME: "text",
  },
  // You should change the primary key definition to meet the needs of your data.
  primaryKey: {
    partitionBy: ["TEXT_COLUMN_NAME"],
  },
});

// Manually define the type of the table's schema and primary key
type TableSchema = {
  VECTOR_COLUMN_NAME: DataAPIVector;
  TEXT_COLUMN_NAME: string;
};

type TablePrimaryKey = Pick<TableSchema, "TEXT_COLUMN_NAME">;

import { DataAPIClient, SomeRow, Table } from "@datastax/astra-db-ts";

// Instantiate the client
const client = new DataAPIClient();

// Connect to a database
const database = client.db(process.env.API_ENDPOINT, {
  token: process.env.APPLICATION_TOKEN,
});

// Define the columns and primary key for the table
const tableDefinition = Table.schema({
  columns: {
    // This column will store vector embeddings.
    // The NVIDIA integration
    // will automatically generate vector embeddings
    // for any text inserted to this column.
    VECTOR_COLUMN_NAME: {
      type: "vector",
      service: {
        provider: 'nvidia',
        modelName: 'nvidia/nv-embedqa-e5-v5',
      },
    },
    // If you want to store the original text
    // in addition to the generated embeddings
    // you must create a separate column.
    TEXT_COLUMN_NAME: "text",
  },
  // You should change the primary key definition to meet the needs of your data.
  primaryKey: {
    partitionBy: ["TEXT_COLUMN_NAME"],
  },
});

OpenAI

For more detailed instructions, see Integrate OpenAI as an embedding provider.

Automatic type inference
Manually typed tables
Untyped tables

import {
  DataAPIClient,
  InferTablePrimaryKey,
  InferTableSchema,
  Table,
} from "@datastax/astra-db-ts";

// Instantiate the client
const client = new DataAPIClient();

// Connect to a database
const database = client.db(process.env.API_ENDPOINT, {
  token: process.env.APPLICATION_TOKEN,
});

// Define the columns and primary key for the table
const tableDefinition = Table.schema({
  columns: {
    // This column will store vector embeddings.
    // The OpenAI integration
    // will automatically generate vector embeddings
    // for any text inserted to this column.
    VECTOR_COLUMN_NAME: {
      type: "vector",
      dimension: MODEL_DIMENSIONS,
      service: {
        provider: 'openai',
        modelName: 'MODEL_NAME',
        authentication: {
          providerKey: 'API_KEY_NAME',
        },
        parameters: {
          organizationId: 'ORGANIZATION_ID',
          projectId: 'PROJECT_ID',
        },
      },
    },
    // If you want to store the original text
    // in addition to the generated embeddings
    // you must create a separate column.
    TEXT_COLUMN_NAME: "text",
  },
  // You should change the primary key definition to meet the needs of your data.
  primaryKey: {
    partitionBy: ["TEXT_COLUMN_NAME"],
  },
});

// Infer the TypeScript-equivalent type of the table's schema and primary key
type TableSchema = InferTableSchema<typeof tableDefinition>;
type TablePrimaryKey = InferTablePrimaryKey<typeof tableDefinition>;

import { DataAPIClient, DataAPIVector, Table } from "@datastax/astra-db-ts";

// Instantiate the client
const client = new DataAPIClient();

// Connect to a database
const database = client.db(process.env.API_ENDPOINT, {
  token: process.env.APPLICATION_TOKEN,
});

// Define the columns and primary key for the table
const tableDefinition = Table.schema({
  columns: {
    // This column will store vector embeddings.
    // The OpenAI integration
    // will automatically generate vector embeddings
    // for any text inserted to this column.
    VECTOR_COLUMN_NAME: {
      type: "vector",
      dimension: MODEL_DIMENSIONS,
      service: {
        provider: 'openai',
        modelName: 'MODEL_NAME',
        authentication: {
          providerKey: 'API_KEY_NAME',
        },
        parameters: {
          organizationId: 'ORGANIZATION_ID',
          projectId: 'PROJECT_ID',
        },
      },
    },
    // If you want to store the original text
    // in addition to the generated embeddings
    // you must create a separate column.
    TEXT_COLUMN_NAME: "text",
  },
  // You should change the primary key definition to meet the needs of your data.
  primaryKey: {
    partitionBy: ["TEXT_COLUMN_NAME"],
  },
});

// Manually define the type of the table's schema and primary key
type TableSchema = {
  VECTOR_COLUMN_NAME: DataAPIVector;
  TEXT_COLUMN_NAME: string;
};

type TablePrimaryKey = Pick<TableSchema, "TEXT_COLUMN_NAME">;

import { DataAPIClient, SomeRow, Table } from "@datastax/astra-db-ts";

// Instantiate the client
const client = new DataAPIClient();

// Connect to a database
const database = client.db(process.env.API_ENDPOINT, {
  token: process.env.APPLICATION_TOKEN,
});

// Define the columns and primary key for the table
const tableDefinition = Table.schema({
  columns: {
    // This column will store vector embeddings.
    // The OpenAI integration
    // will automatically generate vector embeddings
    // for any text inserted to this column.
    VECTOR_COLUMN_NAME: {
      type: "vector",
      dimension: MODEL_DIMENSIONS,
      service: {
        provider: 'openai',
        modelName: 'MODEL_NAME',
        authentication: {
          providerKey: 'API_KEY_NAME',
        },
        parameters: {
          organizationId: 'ORGANIZATION_ID',
          projectId: 'PROJECT_ID',
        },
      },
    },
    // If you want to store the original text
    // in addition to the generated embeddings
    // you must create a separate column.
    TEXT_COLUMN_NAME: "text",
  },
  // You should change the primary key definition to meet the needs of your data.
  primaryKey: {
    partitionBy: ["TEXT_COLUMN_NAME"],
  },
});

Replace the following:

TABLE_NAME: The name for your table.
VECTOR_COLUMN_NAME: The name for your vector column.
TEXT_COLUMN_NAME: The name for the text column that will store the original text. Omit this column if you won’t store the original text in addition to the generated embeddings.
API_KEY_NAME: The name of the OpenAI API key that you want to use. Must be the name of an existing OpenAI API key in the Astra Portal. For more information, see Embedding provider authentication.

Alternatively, omit this parameter and provide the authentication key in the embeddingApiKey parameter when you instantiate a Table object by creating or getting a table. The client then sends the x-embedding-api-key header with that key on every underlying HTTP request that requires vectorize authentication. If you set both, header authentication overrides API_KEY_NAME. If you rely on the header instead of API_KEY_NAME, you must include the header in every command that uses vectorize, including writes and vector search. This authentication method works only if all affected columns use the same embedding provider.
MODEL_NAME: The model that you want to use to generate embeddings. The available models are: text-embedding-3-small, text-embedding-3-large, text-embedding-ada-002.
MODEL_DIMENSIONS: The number of dimensions that you want the generated vectors to have. Your chosen embedding model must support the specified number of dimensions.

If you omit dimension, Astra can use a default dimension value. However, some models don’t have default dimensions. You can use the Data API to find supported embedding providers and their configuration parameters, including dimension ranges and default dimensions.
ORGANIZATION_ID: Optional. The ID of the OpenAI organization that owns the API key. Only required if your OpenAI account belongs to multiple organizations or if you are using a legacy user API key to access projects. For more information about organization IDs, see the OpenAI API reference.
PROJECT_ID: Optional. The ID of the OpenAI project that owns the API key. This cannot use the default project. Only required if your OpenAI account belongs to multiple organizations or if you are using a legacy user API key to access projects. For more information about project IDs, see the OpenAI API reference.
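Because ORGANIZATION_ID and PROJECT_ID are optional, one way to avoid sending unfilled placeholders is to build the service object conditionally. The following is a minimal sketch: buildOpenAiService, OpenAiServiceInput, and VectorizeService are hypothetical names introduced for illustration, and the returned object only mirrors the shape of the service object used in the examples above.

```typescript
// Hypothetical input type for this sketch (not part of the client library).
interface OpenAiServiceInput {
  modelName: string;
  providerKey?: string;    // API_KEY_NAME; omit when using header authentication
  organizationId?: string; // ORGANIZATION_ID (optional)
  projectId?: string;      // PROJECT_ID (optional)
}

// Shape of the `service` object as written in the examples above.
interface VectorizeService {
  provider: string;
  modelName: string;
  authentication?: { providerKey: string };
  parameters?: Record<string, string>;
}

function buildOpenAiService(input: OpenAiServiceInput): VectorizeService {
  const service: VectorizeService = {
    provider: "openai",
    modelName: input.modelName,
  };
  if (input.providerKey) {
    service.authentication = { providerKey: input.providerKey };
  }
  // Collect only the optional parameters that were actually provided.
  const parameters: Record<string, string> = {};
  if (input.organizationId) parameters.organizationId = input.organizationId;
  if (input.projectId) parameters.projectId = input.projectId;
  if (Object.keys(parameters).length > 0) {
    service.parameters = parameters;
  }
  return service;
}

// Without optional IDs, no `parameters` key is emitted.
const minimalService = buildOpenAiService({
  modelName: "text-embedding-3-small",
  providerKey: "API_KEY_NAME",
});

// With both IDs, they are passed through under `parameters`.
const scopedService = buildOpenAiService({
  modelName: "text-embedding-3-small",
  providerKey: "API_KEY_NAME",
  organizationId: "ORGANIZATION_ID",
  projectId: "PROJECT_ID",
});

console.log(JSON.stringify(minimalService));
console.log(JSON.stringify(scopedService));
```

The result can be placed in the service field of the vector column definition in place of the literal object.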

Upstage

For more detailed instructions, see Integrate Upstage as an embedding provider.

Automatic type inference
Manually typed tables
Untyped tables

import {
  DataAPIClient,
  InferTablePrimaryKey,
  InferTableSchema,
  Table,
} from "@datastax/astra-db-ts";

// Instantiate the client
const client = new DataAPIClient();

// Connect to a database
const database = client.db(process.env.API_ENDPOINT, {
  token: process.env.APPLICATION_TOKEN,
});

// Define the columns and primary key for the table
const tableDefinition = Table.schema({
  columns: {
    // This column will store vector embeddings.
    // The Upstage integration
    // will automatically generate vector embeddings
    // for any text inserted to this column.
    VECTOR_COLUMN_NAME: {
      type: "vector",
      dimension: MODEL_DIMENSIONS,
      service: {
        provider: 'upstageAI',
        modelName: 'MODEL_NAME',
        authentication: {
          providerKey: 'API_KEY_NAME',
        },
      },
    },
    // If you want to store the original text
    // in addition to the generated embeddings
    // you must create a separate column.
    TEXT_COLUMN_NAME: "text",
  },
  // You should change the primary key definition to meet the needs of your data.
  primaryKey: {
    partitionBy: ["TEXT_COLUMN_NAME"],
  },
});

// Infer the TypeScript-equivalent type of the table's schema and primary key
type TableSchema = InferTableSchema<typeof tableDefinition>;
type TablePrimaryKey = InferTablePrimaryKey<typeof tableDefinition>;

import { DataAPIClient, DataAPIVector, Table } from "@datastax/astra-db-ts";

// Instantiate the client
const client = new DataAPIClient();

// Connect to a database
const database = client.db(process.env.API_ENDPOINT, {
  token: process.env.APPLICATION_TOKEN,
});

// Define the columns and primary key for the table
const tableDefinition = Table.schema({
  columns: {
    // This column will store vector embeddings.
    // The Upstage integration
    // will automatically generate vector embeddings
    // for any text inserted to this column.
    VECTOR_COLUMN_NAME: {
      type: "vector",
      dimension: MODEL_DIMENSIONS,
      service: {
        provider: 'upstageAI',
        modelName: 'MODEL_NAME',
        authentication: {
          providerKey: 'API_KEY_NAME',
        },
      },
    },
    // If you want to store the original text
    // in addition to the generated embeddings
    // you must create a separate column.
    TEXT_COLUMN_NAME: "text",
  },
  // You should change the primary key definition to meet the needs of your data.
  primaryKey: {
    partitionBy: ["TEXT_COLUMN_NAME"],
  },
});

// Manually define the type of the table's schema and primary key
type TableSchema = {
  VECTOR_COLUMN_NAME: DataAPIVector;
  TEXT_COLUMN_NAME: string;
};

type TablePrimaryKey = Pick<TableSchema, "TEXT_COLUMN_NAME">;

import { DataAPIClient, SomeRow, Table } from "@datastax/astra-db-ts";

// Instantiate the client
const client = new DataAPIClient();

// Connect to a database
const database = client.db(process.env.API_ENDPOINT, {
  token: process.env.APPLICATION_TOKEN,
});

// Define the columns and primary key for the table
const tableDefinition = Table.schema({
  columns: {
    // This column will store vector embeddings.
    // The Upstage integration
    // will automatically generate vector embeddings
    // for any text inserted to this column.
    VECTOR_COLUMN_NAME: {
      type: "vector",
      dimension: MODEL_DIMENSIONS,
      service: {
        provider: 'upstageAI',
        modelName: 'MODEL_NAME',
        authentication: {
          providerKey: 'API_KEY_NAME',
        },
      },
    },
    // If you want to store the original text
    // in addition to the generated embeddings
    // you must create a separate column.
    TEXT_COLUMN_NAME: "text",
  },
  // You should change the primary key definition to meet the needs of your data.
  primaryKey: {
    partitionBy: ["TEXT_COLUMN_NAME"],
  },
});

Replace the following:

TABLE_NAME: The name for your table.
VECTOR_COLUMN_NAME: The name for your vector column.
TEXT_COLUMN_NAME: The name for the text column that will store the original text. Omit this column if you won’t store the original text in addition to the generated embeddings.
API_KEY_NAME: The name of the Upstage API key that you want to use. Must be the name of an existing Upstage API key in the Astra Portal. For more information, see Embedding provider authentication.

Alternatively, omit this parameter and provide the authentication key in the embeddingApiKey parameter when you instantiate a Table object by creating or getting a table. The client then sends the x-embedding-api-key header with that key on every underlying HTTP request that requires vectorize authentication. If you set both, header authentication overrides API_KEY_NAME. If you rely on the header instead of API_KEY_NAME, you must include the header in every command that uses vectorize, including writes and vector search. This authentication method works only if all affected columns use the same embedding provider.
MODEL_NAME: The model that you want to use to generate embeddings. The only available model is solar-embedding-1-large.
MODEL_DIMENSIONS: The number of dimensions that you want the generated vectors to have. Your chosen embedding model must support the specified number of dimensions.

If you omit dimension, Astra can use a default dimension value. However, some models don’t have default dimensions. You can use the Data API to find supported embedding providers and their configuration parameters, including dimension ranges and default dimensions.
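The precedence between the two authentication methods described above can be sketched as follows. This is an illustration only: resolveEmbeddingAuth and the EmbeddingAuth type are hypothetical, and they model the documented rule rather than show how the client or server is implemented.

```typescript
// Result of deciding which credential applies to a vectorize request.
type EmbeddingAuth =
  | { kind: "header"; headers: { "x-embedding-api-key": string } }
  | { kind: "providerKey"; providerKey: string }
  | { kind: "none" };

// Hypothetical helper modelling the documented rule:
// a key sent in the x-embedding-api-key header overrides the stored API_KEY_NAME.
function resolveEmbeddingAuth(
  providerKey?: string,
  embeddingApiKey?: string,
): EmbeddingAuth {
  if (embeddingApiKey) {
    return { kind: "header", headers: { "x-embedding-api-key": embeddingApiKey } };
  }
  if (providerKey) {
    return { kind: "providerKey", providerKey };
  }
  return { kind: "none" };
}

// Both set: the header wins.
const both = resolveEmbeddingAuth("API_KEY_NAME", "my-secret-key");
// Only the stored key name is set.
const keyNameOnly = resolveEmbeddingAuth("API_KEY_NAME");

console.log(both.kind);        // header
console.log(keyNameOnly.kind); // providerKey
```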

Voyage AI

For more detailed instructions, see Integrate Voyage AI as an embedding provider.

Automatic type inference
Manually typed tables
Untyped tables

import {
  DataAPIClient,
  InferTablePrimaryKey,
  InferTableSchema,
  Table,
} from "@datastax/astra-db-ts";

// Instantiate the client
const client = new DataAPIClient();

// Connect to a database
const database = client.db(process.env.API_ENDPOINT, {
  token: process.env.APPLICATION_TOKEN,
});

// Define the columns and primary key for the table
const tableDefinition = Table.schema({
  columns: {
    // This column will store vector embeddings.
    // The Voyage AI integration
    // will automatically generate vector embeddings
    // for any text inserted to this column.
    VECTOR_COLUMN_NAME: {
      type: "vector",
      dimension: MODEL_DIMENSIONS,
      service: {
        provider: 'voyageAI',
        modelName: 'MODEL_NAME',
        authentication: {
          providerKey: 'API_KEY_NAME',
        },
      },
    },
    // If you want to store the original text
    // in addition to the generated embeddings
    // you must create a separate column.
    TEXT_COLUMN_NAME: "text",
  },
  // You should change the primary key definition to meet the needs of your data.
  primaryKey: {
    partitionBy: ["TEXT_COLUMN_NAME"],
  },
});

// Infer the TypeScript-equivalent type of the table's schema and primary key
type TableSchema = InferTableSchema<typeof tableDefinition>;
type TablePrimaryKey = InferTablePrimaryKey<typeof tableDefinition>;

import { DataAPIClient, DataAPIVector, Table } from "@datastax/astra-db-ts";

// Instantiate the client
const client = new DataAPIClient();

// Connect to a database
const database = client.db(process.env.API_ENDPOINT, {
  token: process.env.APPLICATION_TOKEN,
});

// Define the columns and primary key for the table
const tableDefinition = Table.schema({
  columns: {
    // This column will store vector embeddings.
    // The Voyage AI integration
    // will automatically generate vector embeddings
    // for any text inserted to this column.
    VECTOR_COLUMN_NAME: {
      type: "vector",
      dimension: MODEL_DIMENSIONS,
      service: {
        provider: 'voyageAI',
        modelName: 'MODEL_NAME',
        authentication: {
          providerKey: 'API_KEY_NAME',
        },
      },
    },
    // If you want to store the original text
    // in addition to the generated embeddings
    // you must create a separate column.
    TEXT_COLUMN_NAME: "text",
  },
  // You should change the primary key definition to meet the needs of your data.
  primaryKey: {
    partitionBy: ["TEXT_COLUMN_NAME"],
  },
});

// Manually define the type of the table's schema and primary key
type TableSchema = {
  VECTOR_COLUMN_NAME: DataAPIVector;
  TEXT_COLUMN_NAME: string;
};

type TablePrimaryKey = Pick<TableSchema, "TEXT_COLUMN_NAME">;

import { DataAPIClient, SomeRow, Table } from "@datastax/astra-db-ts";

// Instantiate the client
const client = new DataAPIClient();

// Connect to a database
const database = client.db(process.env.API_ENDPOINT, {
  token: process.env.APPLICATION_TOKEN,
});

// Define the columns and primary key for the table
const tableDefinition = Table.schema({
  columns: {
    // This column will store vector embeddings.
    // The Voyage AI integration
    // will automatically generate vector embeddings
    // for any text inserted to this column.
    VECTOR_COLUMN_NAME: {
      type: "vector",
      dimension: MODEL_DIMENSIONS,
      service: {
        provider: 'voyageAI',
        modelName: 'MODEL_NAME',
        authentication: {
          providerKey: 'API_KEY_NAME',
        },
      },
    },
    // If you want to store the original text
    // in addition to the generated embeddings
    // you must create a separate column.
    TEXT_COLUMN_NAME: "text",
  },
  // You should change the primary key definition to meet the needs of your data.
  primaryKey: {
    partitionBy: ["TEXT_COLUMN_NAME"],
  },
});

Replace the following:

TABLE_NAME: The name for your table.
VECTOR_COLUMN_NAME: The name for your vector column.
TEXT_COLUMN_NAME: The name for the text column that will store the original text. Omit this column if you won’t store the original text in addition to the generated embeddings.
API_KEY_NAME: The name of the Voyage AI API key that you want to use. Must be the name of an existing Voyage AI API key in the Astra Portal. For more information, see Embedding provider authentication.

Alternatively, omit this parameter and provide the authentication key in the embeddingApiKey parameter when you instantiate a Table object by creating or getting a table. The client then sends the x-embedding-api-key header with that key on every underlying HTTP request that requires vectorize authentication. If you set both, header authentication overrides API_KEY_NAME. If you rely on the header instead of API_KEY_NAME, you must include the header in every command that uses vectorize, including writes and vector search. This authentication method works only if all affected columns use the same embedding provider.
MODEL_NAME: The model that you want to use to generate embeddings. The available models are: voyage-2, voyage-code-2, voyage-finance-2, voyage-large-2, voyage-large-2-instruct, voyage-law-2, voyage-multilingual-2.
MODEL_DIMENSIONS: The number of dimensions that you want the generated vectors to have. Your chosen embedding model must support the specified number of dimensions.

If you omit dimension, Astra can use a default dimension value. However, some models don’t have default dimensions. You can use the Data API to find supported embedding providers and their configuration parameters, including dimension ranges and default dimensions.
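Since MODEL_NAME must be one of the listed models, a typo can be caught before the table definition is sent. The following is a minimal sketch that uses the model names listed above. VOYAGE_MODELS, VoyageModel, and isVoyageModel are hypothetical names, and the list can change over time, so treat the result of the provider lookup as the source of truth.

```typescript
// Model names copied from the list above; the list may change over time.
const VOYAGE_MODELS = [
  "voyage-2",
  "voyage-code-2",
  "voyage-finance-2",
  "voyage-large-2",
  "voyage-large-2-instruct",
  "voyage-law-2",
  "voyage-multilingual-2",
] as const;

// Union of the literal model names above.
type VoyageModel = (typeof VOYAGE_MODELS)[number];

// Type guard: narrows an arbitrary string to a known model name.
function isVoyageModel(name: string): name is VoyageModel {
  return (VOYAGE_MODELS as readonly string[]).includes(name);
}

console.log(isVoyageModel("voyage-law-2")); // true
console.log(isVoyageModel("voyage-3"));     // false
```

The same pattern applies to any provider in this section by swapping in that provider's model list.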

Azure OpenAI

For more detailed instructions, see Integrate Azure OpenAI as an embedding provider.

Use a generic type
Define the row type

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.core.options.DataAPIClientOptions;
import com.datastax.astra.client.core.vector.SimilarityMetric;
import com.datastax.astra.client.core.vectorize.VectorServiceOptions;
import com.datastax.astra.client.databases.Database;
import com.datastax.astra.client.databases.DatabaseOptions;
import com.datastax.astra.client.tables.Table;
import com.datastax.astra.client.tables.definition.TableDefinition;
import com.datastax.astra.client.tables.definition.columns.ColumnDefinitionVector;
import com.datastax.astra.client.tables.definition.indexes.TableVectorIndexDefinition;
import com.datastax.astra.client.tables.definition.rows.Row;

import java.util.HashMap;
import java.util.Map;

public class Example {

  public static void main(String[] args) {
    // Instantiate the client
    DataAPIClient client = new DataAPIClient(new DataAPIClientOptions());

    // Connect to a database
    Database database =
        client.getDatabase(
            System.getenv("API_ENDPOINT"),
            new DatabaseOptions(System.getenv("APPLICATION_TOKEN"), new DataAPIClientOptions()));

    // Define parameters for the service provider
    Map<String, Object> params = new HashMap<>();
    params.put("resourceName", "RESOURCE_NAME");
    params.put("deploymentId", "DEPLOYMENT_ID");

    // Define the columns and primary key for the table
    TableDefinition tableDefinition =
        new TableDefinition()
            // This column will store vector embeddings.
            // The Azure OpenAI integration
            // will automatically generate vector embeddings
            // for any text inserted to this column.
            .addColumnVector(
                "VECTOR_COLUMN_NAME",
                new ColumnDefinitionVector()
                    .dimension(MODEL_DIMENSIONS)
                    .metric(SimilarityMetric.SIMILARITY_METRIC)
                    .service(
                        new VectorServiceOptions()
                            .provider("azureOpenAI")
                            .modelName("MODEL_NAME")
                            .authentication(Map.of("providerKey", "API_KEY_NAME"))
                            .parameters(params)
                    )
            )
            // If you want to store the original text
            // in addition to the generated embeddings
            // you must create a separate column.
            .addColumnText("TEXT_COLUMN_NAME")
            // You should change the primary key definition to meet the needs of your data.
            .addPartitionBy("TEXT_COLUMN_NAME");

    // Create the table
    Table<Row> table = database.createTable("TABLE_NAME", tableDefinition);
  }
}

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.core.options.DataAPIClientOptions;
import com.datastax.astra.client.core.vector.DataAPIVector;
import com.datastax.astra.client.core.vector.SimilarityMetric;
import com.datastax.astra.client.databases.Database;
import com.datastax.astra.client.databases.DatabaseOptions;
import com.datastax.astra.client.tables.Table;
import com.datastax.astra.client.tables.definition.columns.ColumnTypes;
import com.datastax.astra.client.tables.mapping.Column;
import com.datastax.astra.client.tables.mapping.ColumnVector;
import com.datastax.astra.client.tables.mapping.EntityTable;
import com.datastax.astra.client.tables.mapping.KeyValue;
import com.datastax.astra.client.tables.mapping.PartitionBy;
import lombok.Data;

public class Example {
    @EntityTable("TABLE_NAME")
    @Data
    public static class ExampleRow {
        // This column will store vector embeddings.
        // The Azure OpenAI integration
        // will automatically generate vector embeddings
        // for any text inserted to this column.
        @ColumnVector(
            name = "VECTOR_COLUMN_NAME",
            dimension = MODEL_DIMENSIONS,
            metric = SimilarityMetric.SIMILARITY_METRIC,
            provider = "azureOpenAI",
            modelName = "MODEL_NAME",
            authentication = @KeyValue(key = "providerKey", value = "API_KEY_NAME"),
            parameters = {
                @KeyValue(key = "resourceName", value = "RESOURCE_NAME"),
                @KeyValue(key = "deploymentId", value = "DEPLOYMENT_ID")
            })
        private DataAPIVector exampleVector;

        // If you want to store the original text
        // in addition to the generated embeddings
        // you must create a separate column.
        // You should change the primary key definition (`PartitionBy`) to meet the needs of your data.
        @PartitionBy(0)
        @Column(name = "TEXT_COLUMN_NAME", type = ColumnTypes.TEXT)
        private String originalText;
    }
    public static void main(String[] args) {
        // Instantiate the client
        DataAPIClient client = new DataAPIClient(new DataAPIClientOptions());

        // Connect to a database
        Database database =
            client.getDatabase(
                System.getenv("API_ENDPOINT"),
                new DatabaseOptions(System.getenv("APPLICATION_TOKEN"), new DataAPIClientOptions()));

        // Create the table
        Table<ExampleRow> table = database.createTable(ExampleRow.class);
    }
}

Replace the following:

TABLE_NAME: The name for your table.
VECTOR_COLUMN_NAME: The name for your vector column.
TEXT_COLUMN_NAME: The name for the text column that will store the original text. Omit this column if you won’t store the original text in addition to the generated embeddings.
API_KEY_NAME: The name of the Azure OpenAI API key that you want to use. Must be the name of an existing Azure OpenAI API key in the Astra Portal. For more information, see Embedding provider authentication.

Alternatively, omit this parameter and provide the authentication key through the embeddingAuthProvider() method of CreateTableOptions when you instantiate a Table object by creating or getting a table. The client then sends the x-embedding-api-key header with that key on every underlying HTTP request that requires vectorize authentication. If you set both, header authentication overrides API_KEY_NAME. If you rely on the header instead of API_KEY_NAME, you must include the header in every command that uses vectorize, including writes and vector search. This authentication method works only if all affected columns use the same embedding provider.
MODEL_NAME: The model that you want to use to generate embeddings. The available models are: text-embedding-3-small, text-embedding-3-large, text-embedding-ada-002.

For Azure OpenAI, you must select the model that matches the one deployed to your DEPLOYMENT_ID in Azure.
MODEL_DIMENSIONS: The number of dimensions that you want the generated vectors to have. Your chosen embedding model must support the specified number of dimensions.

If you omit dimension, Astra can use a default dimension value. However, some models don’t have default dimensions. You can use the Data API to find supported embedding providers and their configuration parameters, including dimension ranges and default dimensions.
RESOURCE_NAME: The name of your Azure OpenAI Service resource, as defined in the resource’s Instance details. For more information, see the Azure OpenAI documentation.
DEPLOYMENT_ID: Your Azure OpenAI resource’s Deployment name. For more information, see the Azure OpenAI documentation.
SIMILARITY_METRIC: The similarity metric for the vector column. Can be COSINE, DOT_PRODUCT, or EUCLIDEAN.

Hugging Face - Dedicated

For more detailed instructions, see Integrate Hugging Face Dedicated as an embedding provider.

Use a generic type
Define the row type

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.core.options.DataAPIClientOptions;
import com.datastax.astra.client.core.vector.SimilarityMetric;
import com.datastax.astra.client.core.vectorize.VectorServiceOptions;
import com.datastax.astra.client.databases.Database;
import com.datastax.astra.client.databases.DatabaseOptions;
import com.datastax.astra.client.tables.Table;
import com.datastax.astra.client.tables.definition.TableDefinition;
import com.datastax.astra.client.tables.definition.columns.ColumnDefinitionVector;
import com.datastax.astra.client.tables.definition.indexes.TableVectorIndexDefinition;
import com.datastax.astra.client.tables.definition.rows.Row;

import java.util.HashMap;
import java.util.Map;

public class Example {

  public static void main(String[] args) {
    // Instantiate the client
    DataAPIClient client = new DataAPIClient(new DataAPIClientOptions());

    // Connect to a database
    Database database =
        client.getDatabase(
            System.getenv("API_ENDPOINT"),
            new DatabaseOptions(System.getenv("APPLICATION_TOKEN"), new DataAPIClientOptions()));

    // Define parameters for the service provider
    Map<String, Object> params = new HashMap<>();
    params.put("endpointName", "ENDPOINT_NAME");
    params.put("regionName", "REGION_NAME");
    params.put("cloudName", "CLOUD_NAME");

    // Define the columns and primary key for the table
    TableDefinition tableDefinition =
        new TableDefinition()
            // This column will store vector embeddings.
            // The Hugging Face Dedicated integration
            // will automatically generate vector embeddings
            // for any text inserted to this column.
            .addColumnVector(
                "VECTOR_COLUMN_NAME",
                new ColumnDefinitionVector()
                    .dimension(MODEL_DIMENSIONS)
                    .metric(SimilarityMetric.SIMILARITY_METRIC)
                    .service(
                        new VectorServiceOptions()
                            .provider("huggingfaceDedicated")
                            .modelName("endpoint-defined-model")
                            .authentication(Map.of("providerKey", "API_KEY_NAME"))
                            .parameters(params)
                    )
            )
            // If you want to store the original text
            // in addition to the generated embeddings
            // you must create a separate column.
            .addColumnText("TEXT_COLUMN_NAME")
            // You should change the primary key definition to meet the needs of your data.
            .addPartitionBy("TEXT_COLUMN_NAME");

    // Create the table
    Table<Row> table = database.createTable("TABLE_NAME", tableDefinition);
  }
}

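import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.core.options.DataAPIClientOptions;
import com.datastax.astra.client.core.vector.DataAPIVector;
import com.datastax.astra.client.core.vector.SimilarityMetric;
import com.datastax.astra.client.databases.Database;
import com.datastax.astra.client.databases.DatabaseOptions;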
import com.datastax.astra.client.tables.Table;
import com.datastax.astra.client.tables.definition.columns.ColumnTypes;
import com.datastax.astra.client.tables.mapping.Column;
import com.datastax.astra.client.tables.mapping.ColumnVector;
import com.datastax.astra.client.tables.mapping.EntityTable;
import com.datastax.astra.client.tables.mapping.KeyValue;
import com.datastax.astra.client.tables.mapping.PartitionBy;
import lombok.Data;

public class Example {
    @EntityTable("TABLE_NAME")
    @Data
    public static class ExampleRow {
        // This column will store vector embeddings.
        // The Hugging Face Dedicated integration
        // will automatically generate vector embeddings
        // for any text inserted to this column.
        @ColumnVector(
            name = "VECTOR_COLUMN_NAME",
            dimension = MODEL_DIMENSIONS,
            metric = SimilarityMetric.SIMILARITY_METRIC,
            provider = "huggingfaceDedicated",
            modelName = "endpoint-defined-model",
            authentication = @KeyValue(key = "providerKey", value = "API_KEY_NAME"),
            parameters = {
                @KeyValue(key = "endpointName", value = "ENDPOINT_NAME"),
                @KeyValue(key = "regionName", value = "REGION_NAME"),
                @KeyValue(key = "cloudName", value = "CLOUD_NAME")
            })
        private DataAPIVector exampleVector;

        // If you want to store the original text
        // in addition to the generated embeddings
        // you must create a separate column.
        // You should change the primary key definition (`PartitionBy`) to meet the needs of your data.
        @PartitionBy(0)
        @Column(name = "TEXT_COLUMN_NAME", type = ColumnTypes.TEXT)
        private String originalText;
    }

    public static void main(String[] args) {
        // Instantiate the client
        DataAPIClient client = new DataAPIClient(new DataAPIClientOptions());

        // Connect to a database
        Database database =
            client.getDatabase(
                System.getenv("API_ENDPOINT"),
                new DatabaseOptions(System.getenv("APPLICATION_TOKEN"), new DataAPIClientOptions()));

        // Create the table
        Table<ExampleRow> table = database.createTable(ExampleRow.class);
    }
}

Replace the following:

TABLE_NAME: The name for your table.
VECTOR_COLUMN_NAME: The name for your vector column.
SIMILARITY_METRIC: The method that you want to use to calculate vector similarity scores: COSINE, DOT_PRODUCT, or EUCLIDEAN.
TEXT_COLUMN_NAME: The name for the text column that will store the original text. Omit this column if you won’t store the original text in addition to the generated embeddings.
API_KEY_NAME: The name of the Hugging Face Dedicated user access token that you want to use. Must be the name of an existing Hugging Face Dedicated user access token in the Astra Portal. For more information, see Embedding provider authentication.

Alternatively, you can omit this parameter and instead pass the authentication key to the embeddingAuthProvider() method of CreateTableOptions when you instantiate a Table object by creating or getting a table. The client then sends the key in the x-embedding-api-key header on every underlying HTTP request that requires vectorize authentication. If you set both, header authentication overrides the API_KEY_NAME parameter. If you use the header instead of the API_KEY_NAME parameter, you must include the header in every command that uses vectorize, including writes and vector searches. You can use this authentication method only if all affected columns use the same embedding provider.
MODEL_NAME: The model that you want to use to generate embeddings. The only available model is endpoint-defined-model, which the preceding examples already set.

For Hugging Face Dedicated, you must deploy the model as a text embeddings inference (TEI) container.

You must set MODEL_NAME to endpoint-defined-model because this integration uses the model specified in your dedicated endpoint configuration.
MODEL_DIMENSIONS: The number of dimensions that you want the generated vectors to have. Your chosen embedding model must support the specified number of dimensions.

If you omit dimension, Astra can use a default dimension value. However, some models don’t have default dimensions. You can use the Data API to find supported embedding providers and their configuration parameters, including dimension ranges and default dimensions.
ENDPOINT_NAME: The programmatically-generated name of your Hugging Face Dedicated endpoint. This is the first part of the endpoint URL. For example, if your endpoint URL is https://mtp1x7muf6qyn3yh.us-east-2.aws.endpoints.huggingface.cloud, the endpoint name is mtp1x7muf6qyn3yh.
REGION_NAME: The cloud provider region your Hugging Face Dedicated endpoint is deployed to. For example, us-east-2.
CLOUD_NAME: The cloud provider your Hugging Face Dedicated endpoint is deployed to. For example, aws.
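
The three endpoint parameters can be read directly from the endpoint URL. The following is a minimal sketch that assumes the host always follows the ENDPOINT_NAME.REGION_NAME.CLOUD_NAME.endpoints.huggingface.cloud layout of the example URL above. The class and method names are hypothetical helpers, not part of the Java client.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

public class HuggingFaceEndpointParams {

  // Assumed host layout, taken from the example URL in the list above:
  // ENDPOINT_NAME.REGION_NAME.CLOUD_NAME.endpoints.huggingface.cloud
  private static final String SUFFIX = ".endpoints.huggingface.cloud";

  /** Hypothetical helper: splits an endpoint URL into the three provider parameters. */
  public static Map<String, Object> fromUrl(String endpointUrl) {
    String host = URI.create(endpointUrl).getHost();
    if (host == null || !host.endsWith(SUFFIX)) {
      throw new IllegalArgumentException(
          "Not a Hugging Face Dedicated endpoint URL: " + endpointUrl);
    }
    // Drop the fixed suffix, then split the remainder on dots
    String[] parts = host.substring(0, host.length() - SUFFIX.length()).split("\\.");
    if (parts.length != 3) {
      throw new IllegalArgumentException("Expected ENDPOINT.REGION.CLOUD in host: " + host);
    }
    // Same keys as the params map in the generic-type example
    Map<String, Object> params = new LinkedHashMap<>();
    params.put("endpointName", parts[0]);
    params.put("regionName", parts[1]);
    params.put("cloudName", parts[2]);
    return params;
  }

  public static void main(String[] args) {
    System.out.println(
        fromUrl("https://mtp1x7muf6qyn3yh.us-east-2.aws.endpoints.huggingface.cloud"));
    // prints {endpointName=mtp1x7muf6qyn3yh, regionName=us-east-2, cloudName=aws}
  }
}
```

The returned map uses the same keys as the service provider parameters, so you could pass it wherever the examples build the params map by hand.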

Hugging Face - Serverless

For more detailed instructions, see Integrate Hugging Face Serverless as an embedding provider.

Use a generic type
Define the row type

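import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.core.options.DataAPIClientOptions;
import com.datastax.astra.client.core.vector.SimilarityMetric;
import com.datastax.astra.client.core.vectorize.VectorServiceOptions;
import com.datastax.astra.client.databases.Database;
import com.datastax.astra.client.databases.DatabaseOptions;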
import com.datastax.astra.client.tables.Table;
import com.datastax.astra.client.tables.definition.TableDefinition;
import com.datastax.astra.client.tables.definition.columns.ColumnDefinitionVector;
import com.datastax.astra.client.tables.definition.indexes.TableVectorIndexDefinition;
import com.datastax.astra.client.tables.definition.rows.Row;

import java.util.HashMap;
import java.util.Map;

public class Example {

  public static void main(String[] args) {
    // Instantiate the client
    DataAPIClient client = new DataAPIClient(new DataAPIClientOptions());

    // Connect to a database
    Database database =
        client.getDatabase(
            System.getenv("API_ENDPOINT"),
            new DatabaseOptions(System.getenv("APPLICATION_TOKEN"), new DataAPIClientOptions()));

    // Define the columns and primary key for the table
    TableDefinition tableDefinition =
        new TableDefinition()
            // This column will store vector embeddings.
            // The Hugging Face Serverless integration
            // will automatically generate vector embeddings
            // for any text inserted to this column.
            .addColumnVector(
                "VECTOR_COLUMN_NAME",
                new ColumnDefinitionVector()
                    .dimension(MODEL_DIMENSIONS)
                    .metric(SimilarityMetric.SIMILARITY_METRIC)
                    .service(
                        new VectorServiceOptions()
                            .provider("huggingface")
                            .modelName("MODEL_NAME")
                            .authentication(Map.of("providerKey", "API_KEY_NAME"))
                    )
            )
            // If you want to store the original text
            // in addition to the generated embeddings
            // you must create a separate column.
            .addColumnText("TEXT_COLUMN_NAME")
            // You should change the primary key definition to meet the needs of your data.
            .addPartitionBy("TEXT_COLUMN_NAME");

    // Create the table
    Table<Row> table = database.createTable("TABLE_NAME", tableDefinition);
  }
}

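import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.core.options.DataAPIClientOptions;
import com.datastax.astra.client.core.vector.DataAPIVector;
import com.datastax.astra.client.core.vector.SimilarityMetric;
import com.datastax.astra.client.databases.Database;
import com.datastax.astra.client.databases.DatabaseOptions;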
import com.datastax.astra.client.tables.Table;
import com.datastax.astra.client.tables.definition.columns.ColumnTypes;
import com.datastax.astra.client.tables.mapping.Column;
import com.datastax.astra.client.tables.mapping.ColumnVector;
import com.datastax.astra.client.tables.mapping.EntityTable;
import com.datastax.astra.client.tables.mapping.KeyValue;
import com.datastax.astra.client.tables.mapping.PartitionBy;
import lombok.Data;

public class Example {
    @EntityTable("TABLE_NAME")
    @Data
    public static class ExampleRow {
        // This column will store vector embeddings.
        // The Hugging Face Serverless integration
        // will automatically generate vector embeddings
        // for any text inserted to this column.
        @ColumnVector(
            name = "VECTOR_COLUMN_NAME",
            dimension = MODEL_DIMENSIONS,
            metric = SimilarityMetric.SIMILARITY_METRIC,
            provider = "huggingface",
            modelName = "MODEL_NAME",
            authentication = @KeyValue(key = "providerKey", value = "API_KEY_NAME"))
        private DataAPIVector exampleVector;

        // If you want to store the original text
        // in addition to the generated embeddings
        // you must create a separate column.
        // You should change the primary key definition (`PartitionBy`) to meet the needs of your data.
        @PartitionBy(0)
        @Column(name = "TEXT_COLUMN_NAME", type = ColumnTypes.TEXT)
        private String originalText;
    }

    public static void main(String[] args) {
        // Instantiate the client
        DataAPIClient client = new DataAPIClient(new DataAPIClientOptions());

        // Connect to a database
        Database database =
            client.getDatabase(
                System.getenv("API_ENDPOINT"),
                new DatabaseOptions(System.getenv("APPLICATION_TOKEN"), new DataAPIClientOptions()));

        // Create the table
        Table<ExampleRow> table = database.createTable(ExampleRow.class);
    }
}

Replace the following:

TABLE_NAME: The name for your table.
VECTOR_COLUMN_NAME: The name for your vector column.
SIMILARITY_METRIC: The method that you want to use to calculate vector similarity scores: COSINE, DOT_PRODUCT, or EUCLIDEAN.
TEXT_COLUMN_NAME: The name for the text column that will store the original text. Omit this column if you won’t store the original text in addition to the generated embeddings.
API_KEY_NAME: The name of the Hugging Face Serverless user access token that you want to use. Must be the name of an existing Hugging Face Serverless user access token in the Astra Portal. For more information, see Embedding provider authentication.

Alternatively, you can omit this parameter and instead pass the authentication key to the embeddingAuthProvider() method of CreateTableOptions when you instantiate a Table object by creating or getting a table. The client then sends the key in the x-embedding-api-key header on every underlying HTTP request that requires vectorize authentication. If you set both, header authentication overrides the API_KEY_NAME parameter. If you use the header instead of the API_KEY_NAME parameter, you must include the header in every command that uses vectorize, including writes and vector searches. You can use this authentication method only if all affected columns use the same embedding provider.
MODEL_NAME: The model that you want to use to generate embeddings. The available models are: sentence-transformers/all-MiniLM-L6-v2, intfloat/multilingual-e5-large, intfloat/multilingual-e5-large-instruct, BAAI/bge-small-en-v1.5, BAAI/bge-base-en-v1.5, BAAI/bge-large-en-v1.5.
MODEL_DIMENSIONS: The number of dimensions that you want the generated vectors to have. Your chosen embedding model must support the specified number of dimensions.

If you omit dimension, Astra can use a default dimension value. However, some models don’t have default dimensions. You can use the Data API to find supported embedding providers and their configuration parameters, including dimension ranges and default dimensions.
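
To look up the supported providers, models, and dimension values mentioned above, you can send the findEmbeddingProviders command over HTTP. The following is a minimal sketch that uses only the JDK HTTP client. It assumes that the command is posted to the /api/json/v1 path of your API endpoint with the application token in the Token header. The class name is hypothetical, so verify the request shape against the Data API reference before relying on it.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class FindEmbeddingProvidersRequest {

  /** Builds (but does not send) the request. Path and header names are assumptions. */
  public static HttpRequest build(String apiEndpoint, String applicationToken) {
    return HttpRequest.newBuilder()
        .uri(URI.create(apiEndpoint + "/api/json/v1"))
        .header("Token", applicationToken)
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString("{\"findEmbeddingProviders\": {}}"))
        .build();
  }

  public static void main(String[] args) throws Exception {
    String endpoint = System.getenv("API_ENDPOINT");
    String token = System.getenv("APPLICATION_TOKEN");
    if (endpoint == null || token == null) {
      System.out.println("Set API_ENDPOINT and APPLICATION_TOKEN to send the request.");
      return;
    }
    // Send the command and print the raw JSON response
    HttpResponse<String> response =
        HttpClient.newHttpClient()
            .send(build(endpoint, token), HttpResponse.BodyHandlers.ofString());
    System.out.println(response.body());
  }
}
```

The response lists each provider with its models and parameters, which you can check before you choose MODEL_NAME and MODEL_DIMENSIONS.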

Jina AI

For more detailed instructions, see Integrate Jina AI as an embedding provider.

Use a generic type
Define the row type

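import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.core.options.DataAPIClientOptions;
import com.datastax.astra.client.core.vector.SimilarityMetric;
import com.datastax.astra.client.core.vectorize.VectorServiceOptions;
import com.datastax.astra.client.databases.Database;
import com.datastax.astra.client.databases.DatabaseOptions;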
import com.datastax.astra.client.tables.Table;
import com.datastax.astra.client.tables.definition.TableDefinition;
import com.datastax.astra.client.tables.definition.columns.ColumnDefinitionVector;
import com.datastax.astra.client.tables.definition.indexes.TableVectorIndexDefinition;
import com.datastax.astra.client.tables.definition.rows.Row;

import java.util.HashMap;
import java.util.Map;

public class Example {

  public static void main(String[] args) {
    // Instantiate the client
    DataAPIClient client = new DataAPIClient(new DataAPIClientOptions());

    // Connect to a database
    Database database =
        client.getDatabase(
            System.getenv("API_ENDPOINT"),
            new DatabaseOptions(System.getenv("APPLICATION_TOKEN"), new DataAPIClientOptions()));

    // Define the columns and primary key for the table
    TableDefinition tableDefinition =
        new TableDefinition()
            // This column will store vector embeddings.
            // The Jina AI integration
            // will automatically generate vector embeddings
            // for any text inserted to this column.
            .addColumnVector(
                "VECTOR_COLUMN_NAME",
                new ColumnDefinitionVector()
                    .dimension(MODEL_DIMENSIONS)
                    .metric(SimilarityMetric.SIMILARITY_METRIC)
                    .service(
                        new VectorServiceOptions()
                            .provider("jinaAI")
                            .modelName("MODEL_NAME")
                            .authentication(Map.of("providerKey", "API_KEY_NAME"))
                    )
            )
            // If you want to store the original text
            // in addition to the generated embeddings
            // you must create a separate column.
            .addColumnText("TEXT_COLUMN_NAME")
            // You should change the primary key definition to meet the needs of your data.
            .addPartitionBy("TEXT_COLUMN_NAME");

    // Create the table
    Table<Row> table = database.createTable("TABLE_NAME", tableDefinition);
  }
}

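import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.core.options.DataAPIClientOptions;
import com.datastax.astra.client.core.vector.DataAPIVector;
import com.datastax.astra.client.core.vector.SimilarityMetric;
import com.datastax.astra.client.databases.Database;
import com.datastax.astra.client.databases.DatabaseOptions;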
import com.datastax.astra.client.tables.Table;
import com.datastax.astra.client.tables.definition.columns.ColumnTypes;
import com.datastax.astra.client.tables.mapping.Column;
import com.datastax.astra.client.tables.mapping.ColumnVector;
import com.datastax.astra.client.tables.mapping.EntityTable;
import com.datastax.astra.client.tables.mapping.KeyValue;
import com.datastax.astra.client.tables.mapping.PartitionBy;
import lombok.Data;

public class Example {
    @EntityTable("TABLE_NAME")
    @Data
    public static class ExampleRow {
        // This column will store vector embeddings.
        // The Jina AI integration
        // will automatically generate vector embeddings
        // for any text inserted to this column.
        @ColumnVector(
            name = "VECTOR_COLUMN_NAME",
            dimension = MODEL_DIMENSIONS,
            metric = SimilarityMetric.SIMILARITY_METRIC,
            provider = "jinaAI",
            modelName = "MODEL_NAME",
            authentication = @KeyValue(key = "providerKey", value = "API_KEY_NAME"))
        private DataAPIVector exampleVector;

        // If you want to store the original text
        // in addition to the generated embeddings
        // you must create a separate column.
        // You should change the primary key definition (`PartitionBy`) to meet the needs of your data.
        @PartitionBy(0)
        @Column(name = "TEXT_COLUMN_NAME", type = ColumnTypes.TEXT)
        private String originalText;
    }

    public static void main(String[] args) {
        // Instantiate the client
        DataAPIClient client = new DataAPIClient(new DataAPIClientOptions());

        // Connect to a database
        Database database =
            client.getDatabase(
                System.getenv("API_ENDPOINT"),
                new DatabaseOptions(System.getenv("APPLICATION_TOKEN"), new DataAPIClientOptions()));

        // Create the table
        Table<ExampleRow> table = database.createTable(ExampleRow.class);
    }
}

Replace the following:

TABLE_NAME: The name for your table.
VECTOR_COLUMN_NAME: The name for your vector column.
SIMILARITY_METRIC: The method that you want to use to calculate vector similarity scores: COSINE, DOT_PRODUCT, or EUCLIDEAN.
TEXT_COLUMN_NAME: The name for the text column that will store the original text. Omit this column if you won’t store the original text in addition to the generated embeddings.
API_KEY_NAME: The name of the Jina AI API key that you want to use. Must be the name of an existing Jina AI API key in the Astra Portal. For more information, see Embedding provider authentication.

Alternatively, you can omit this parameter and instead pass the authentication key to the embeddingAuthProvider() method of CreateTableOptions when you instantiate a Table object by creating or getting a table. The client then sends the key in the x-embedding-api-key header on every underlying HTTP request that requires vectorize authentication. If you set both, header authentication overrides the API_KEY_NAME parameter. If you use the header instead of the API_KEY_NAME parameter, you must include the header in every command that uses vectorize, including writes and vector searches. You can use this authentication method only if all affected columns use the same embedding provider.
MODEL_NAME: The model that you want to use to generate embeddings. The available models are: jina-embeddings-v2-base-en, jina-embeddings-v2-base-de, jina-embeddings-v2-base-es, jina-embeddings-v2-base-code, jina-embeddings-v2-base-zh.
MODEL_DIMENSIONS: The number of dimensions that you want the generated vectors to have. Your chosen embedding model must support the specified number of dimensions.

If you omit dimension, Astra can use a default dimension value. However, some models don’t have default dimensions. You can use the Data API to find supported embedding providers and their configuration parameters, including dimension ranges and default dimensions.
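
The header authentication described in the list above means that every vectorize command must carry the x-embedding-api-key header. The following is a minimal sketch of that rule at the HTTP level, using only the JDK HTTP client. The class name, the Token header, the URL layout, and the insertOne body are illustrative assumptions, not the Java client's implementation.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class VectorizeHeaderRequest {

  /**
   * Builds (but does not send) a Data API request.
   * If embeddingApiKey is non-null, it is sent in the x-embedding-api-key header.
   */
  public static HttpRequest build(
      String url, String applicationToken, String embeddingApiKey, String commandJson) {
    HttpRequest.Builder builder =
        HttpRequest.newBuilder()
            .uri(URI.create(url))
            .header("Token", applicationToken) // assumed authentication header
            .header("Content-Type", "application/json");
    if (embeddingApiKey != null) {
      // Required on every command that uses vectorize when no API_KEY_NAME is set
      builder.header("x-embedding-api-key", embeddingApiKey);
    }
    return builder.POST(HttpRequest.BodyPublishers.ofString(commandJson)).build();
  }

  public static void main(String[] args) {
    // Placeholder URL and body, for illustration only
    String url = "https://example.invalid/api/json/v1/KEYSPACE_NAME/TABLE_NAME";
    String body =
        "{\"insertOne\": {\"document\": {\"TEXT_COLUMN_NAME\": \"some text\", "
            + "\"VECTOR_COLUMN_NAME\": \"some text\"}}}";
    HttpRequest request = build(url, "APPLICATION_TOKEN", "EMBEDDING_PROVIDER_API_KEY", body);
    System.out.println(request.headers().map().keySet());
  }
}
```

With the Java client, you don't build these requests yourself. You set the key once through embeddingAuthProvider(), and the client attaches the header for you.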

Mistral AI

For more detailed instructions, see Integrate Mistral AI as an embedding provider.

Use a generic type
Define the row type

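import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.core.options.DataAPIClientOptions;
import com.datastax.astra.client.core.vector.SimilarityMetric;
import com.datastax.astra.client.core.vectorize.VectorServiceOptions;
import com.datastax.astra.client.databases.Database;
import com.datastax.astra.client.databases.DatabaseOptions;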
import com.datastax.astra.client.tables.Table;
import com.datastax.astra.client.tables.definition.TableDefinition;
import com.datastax.astra.client.tables.definition.columns.ColumnDefinitionVector;
import com.datastax.astra.client.tables.definition.indexes.TableVectorIndexDefinition;
import com.datastax.astra.client.tables.definition.rows.Row;

import java.util.HashMap;
import java.util.Map;

public class Example {

  public static void main(String[] args) {
    // Instantiate the client
    DataAPIClient client = new DataAPIClient(new DataAPIClientOptions());

    // Connect to a database
    Database database =
        client.getDatabase(
            System.getenv("API_ENDPOINT"),
            new DatabaseOptions(System.getenv("APPLICATION_TOKEN"), new DataAPIClientOptions()));

    // Define the columns and primary key for the table
    TableDefinition tableDefinition =
        new TableDefinition()
            // This column will store vector embeddings.
            // The Mistral AI integration
            // will automatically generate vector embeddings
            // for any text inserted to this column.
            .addColumnVector(
                "VECTOR_COLUMN_NAME",
                new ColumnDefinitionVector()
                    .dimension(MODEL_DIMENSIONS)
                    .metric(SimilarityMetric.SIMILARITY_METRIC)
                    .service(
                        new VectorServiceOptions()
                            .provider("mistral")
                            .modelName("MODEL_NAME")
                            .authentication(Map.of("providerKey", "API_KEY_NAME"))
                    )
            )
            // If you want to store the original text
            // in addition to the generated embeddings
            // you must create a separate column.
            .addColumnText("TEXT_COLUMN_NAME")
            // You should change the primary key definition to meet the needs of your data.
            .addPartitionBy("TEXT_COLUMN_NAME");

    // Create the table
    Table<Row> table = database.createTable("TABLE_NAME", tableDefinition);
  }
}

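import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.core.options.DataAPIClientOptions;
import com.datastax.astra.client.core.vector.DataAPIVector;
import com.datastax.astra.client.core.vector.SimilarityMetric;
import com.datastax.astra.client.databases.Database;
import com.datastax.astra.client.databases.DatabaseOptions;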
import com.datastax.astra.client.tables.Table;
import com.datastax.astra.client.tables.definition.columns.ColumnTypes;
import com.datastax.astra.client.tables.mapping.Column;
import com.datastax.astra.client.tables.mapping.ColumnVector;
import com.datastax.astra.client.tables.mapping.EntityTable;
import com.datastax.astra.client.tables.mapping.KeyValue;
import com.datastax.astra.client.tables.mapping.PartitionBy;
import lombok.Data;

public class Example {
    @EntityTable("TABLE_NAME")
    @Data
    public static class ExampleRow {
        // This column will store vector embeddings.
        // The Mistral AI integration
        // will automatically generate vector embeddings
        // for any text inserted to this column.
        @ColumnVector(
            name = "VECTOR_COLUMN_NAME",
            dimension = MODEL_DIMENSIONS,
            metric = SimilarityMetric.SIMILARITY_METRIC,
            provider = "mistral",
            modelName = "MODEL_NAME",
            authentication = @KeyValue(key = "providerKey", value = "API_KEY_NAME"))
        private DataAPIVector exampleVector;

        // If you want to store the original text
        // in addition to the generated embeddings
        // you must create a separate column.
        // You should change the primary key definition (`PartitionBy`) to meet the needs of your data.
        @PartitionBy(0)
        @Column(name = "TEXT_COLUMN_NAME", type = ColumnTypes.TEXT)
        private String originalText;
    }

    public static void main(String[] args) {
        // Instantiate the client
        DataAPIClient client = new DataAPIClient(new DataAPIClientOptions());

        // Connect to a database
        Database database =
            client.getDatabase(
                System.getenv("API_ENDPOINT"),
                new DatabaseOptions(System.getenv("APPLICATION_TOKEN"), new DataAPIClientOptions()));

        // Create the table
        Table<ExampleRow> table = database.createTable(ExampleRow.class);
    }
}

Replace the following:

TABLE_NAME: The name for your table.
VECTOR_COLUMN_NAME: The name for your vector column.
SIMILARITY_METRIC: The method that you want to use to calculate vector similarity scores: COSINE, DOT_PRODUCT, or EUCLIDEAN.
TEXT_COLUMN_NAME: The name for the text column that will store the original text. Omit this column if you won’t store the original text in addition to the generated embeddings.
API_KEY_NAME: The name of the Mistral AI API key that you want to use. Must be the name of an existing Mistral AI API key in the Astra Portal. For more information, see Embedding provider authentication.

Alternatively, you can omit this parameter and instead pass the authentication key to the embeddingAuthProvider() method of CreateTableOptions when you instantiate a Table object by creating or getting a table. The client then sends the key in the x-embedding-api-key header on every underlying HTTP request that requires vectorize authentication. If you set both, header authentication overrides the API_KEY_NAME parameter. If you use the header instead of the API_KEY_NAME parameter, you must include the header in every command that uses vectorize, including writes and vector searches. You can use this authentication method only if all affected columns use the same embedding provider.
MODEL_NAME: The model that you want to use to generate embeddings. The available models are: mistral-embed.
MODEL_DIMENSIONS: The number of dimensions that you want the generated vectors to have. Your chosen embedding model must support the specified number of dimensions.

If you omit dimension, Astra can use a default dimension value. However, some models don’t have default dimensions. You can use the Data API to find supported embedding providers and their configuration parameters, including dimension ranges and default dimensions.

NVIDIA

For more detailed instructions, see Integrate NVIDIA as an embedding provider. Your database must be in a supported region.

Use a generic type
Define the row type

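import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.core.options.DataAPIClientOptions;
import com.datastax.astra.client.core.vector.SimilarityMetric;
import com.datastax.astra.client.core.vectorize.VectorServiceOptions;
import com.datastax.astra.client.databases.Database;
import com.datastax.astra.client.databases.DatabaseOptions;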
import com.datastax.astra.client.tables.Table;
import com.datastax.astra.client.tables.definition.TableDefinition;
import com.datastax.astra.client.tables.definition.columns.ColumnDefinitionVector;
import com.datastax.astra.client.tables.definition.indexes.TableVectorIndexDefinition;
import com.datastax.astra.client.tables.definition.rows.Row;

import java.util.HashMap;
import java.util.Map;

public class Example {

  public static void main(String[] args) {
    // Instantiate the client
    DataAPIClient client = new DataAPIClient(new DataAPIClientOptions());

    // Connect to a database
    Database database =
        client.getDatabase(
            System.getenv("API_ENDPOINT"),
            new DatabaseOptions(System.getenv("APPLICATION_TOKEN"), new DataAPIClientOptions()));

    // Define the columns and primary key for the table
    TableDefinition tableDefinition =
        new TableDefinition()
            // This column will store vector embeddings.
            // The NVIDIA integration
            // will automatically generate vector embeddings
            // for any text inserted to this column.
            .addColumnVector(
                "VECTOR_COLUMN_NAME",
                new ColumnDefinitionVector()
                    .metric(SimilarityMetric.COSINE)
                    .service(
                        new VectorServiceOptions()
                            .provider("nvidia")
                            .modelName("nvidia/nv-embedqa-e5-v5")
                    )
            )
            // If you want to store the original text
            // in addition to the generated embeddings
            // you must create a separate column.
            .addColumnText("TEXT_COLUMN_NAME")
            // You should change the primary key definition to meet the needs of your data.
            .addPartitionBy("TEXT_COLUMN_NAME");

    // Create the table
    Table<Row> table = database.createTable("TABLE_NAME", tableDefinition);
  }
}

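import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.core.options.DataAPIClientOptions;
import com.datastax.astra.client.core.vector.DataAPIVector;
import com.datastax.astra.client.core.vector.SimilarityMetric;
import com.datastax.astra.client.databases.Database;
import com.datastax.astra.client.databases.DatabaseOptions;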
import com.datastax.astra.client.tables.Table;
import com.datastax.astra.client.tables.definition.columns.ColumnTypes;
import com.datastax.astra.client.tables.mapping.Column;
import com.datastax.astra.client.tables.mapping.ColumnVector;
import com.datastax.astra.client.tables.mapping.EntityTable;
import com.datastax.astra.client.tables.mapping.KeyValue;
import com.datastax.astra.client.tables.mapping.PartitionBy;
import lombok.Data;

public class Example {
    @EntityTable("TABLE_NAME")
    @Data
    public static class ExampleRow {
        // This column will store vector embeddings.
        // The NVIDIA integration
        // will automatically generate vector embeddings
        // for any text inserted to this column.
        @ColumnVector(
            name = "VECTOR_COLUMN_NAME",
            provider = "nvidia",
            modelName = "nvidia/nv-embedqa-e5-v5")
        private DataAPIVector exampleVector;

        // If you want to store the original text
        // in addition to the generated embeddings
        // you must create a separate column.
        // You should change the primary key definition (`PartitionBy`) to meet the needs of your data.
        @PartitionBy(0)
        @Column(name = "TEXT_COLUMN_NAME", type = ColumnTypes.TEXT)
        private String originalText;
    }

    public static void main(String[] args) {
        // Instantiate the client
        DataAPIClient client = new DataAPIClient(new DataAPIClientOptions());

        // Connect to a database
        Database database =
            client.getDatabase(
                System.getenv("API_ENDPOINT"),
                new DatabaseOptions(System.getenv("APPLICATION_TOKEN"), new DataAPIClientOptions()));

        // Create the table
        Table<ExampleRow> table = database.createTable(ExampleRow.class);
    }
}

OpenAI

For more detailed instructions, see Integrate OpenAI as an embedding provider.

Use a generic type
Define the row type

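import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.core.options.DataAPIClientOptions;
import com.datastax.astra.client.core.vector.SimilarityMetric;
import com.datastax.astra.client.core.vectorize.VectorServiceOptions;
import com.datastax.astra.client.databases.Database;
import com.datastax.astra.client.databases.DatabaseOptions;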
import com.datastax.astra.client.tables.Table;
import com.datastax.astra.client.tables.definition.TableDefinition;
import com.datastax.astra.client.tables.definition.columns.ColumnDefinitionVector;
import com.datastax.astra.client.tables.definition.indexes.TableVectorIndexDefinition;
import com.datastax.astra.client.tables.definition.rows.Row;

import java.util.HashMap;
import java.util.Map;

public class Example {

  public static void main(String[] args) {
    // Instantiate the client
    DataAPIClient client = new DataAPIClient(new DataAPIClientOptions());

    // Connect to a database
    Database database =
        client.getDatabase(
            System.getenv("API_ENDPOINT"),
            new DatabaseOptions(System.getenv("APPLICATION_TOKEN"), new DataAPIClientOptions()));

    // Define parameters for the service provider
    Map<String, Object> params = new HashMap<>();
    params.put("organizationId", "ORGANIZATION_ID");
    params.put("projectId", "PROJECT_ID");

    // Define the columns and primary key for the table
    TableDefinition tableDefinition =
        new TableDefinition()
            // This column will store vector embeddings.
            // The OpenAI integration
            // will automatically generate vector embeddings
            // for any text inserted to this column.
            .addColumnVector(
                "VECTOR_COLUMN_NAME",
                new ColumnDefinitionVector()
                    .dimension(MODEL_DIMENSIONS)
                    .metric(SimilarityMetric.SIMILARITY_METRIC)
                    .service(
                        new VectorServiceOptions()
                            .provider("openai")
                            .modelName("MODEL_NAME")
                            .authentication(Map.of("providerKey", "API_KEY_NAME"))
                            .parameters(params)
                    )
            )
            // If you want to store the original text
            // in addition to the generated embeddings
            // you must create a separate column.
            .addColumnText("TEXT_COLUMN_NAME")
            // You should change the primary key definition to meet the needs of your data.
            .addPartitionBy("TEXT_COLUMN_NAME");

    // Create the table
    Table<Row> table = database.createTable("TABLE_NAME", tableDefinition);
  }
}

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.core.options.DataAPIClientOptions;
import com.datastax.astra.client.core.vector.DataAPIVector;
import com.datastax.astra.client.core.vector.SimilarityMetric;
import com.datastax.astra.client.databases.Database;
import com.datastax.astra.client.databases.DatabaseOptions;
import com.datastax.astra.client.tables.Table;
import com.datastax.astra.client.tables.definition.columns.ColumnTypes;
import com.datastax.astra.client.tables.mapping.Column;
import com.datastax.astra.client.tables.mapping.ColumnVector;
import com.datastax.astra.client.tables.mapping.EntityTable;
import com.datastax.astra.client.tables.mapping.KeyValue;
import com.datastax.astra.client.tables.mapping.PartitionBy;
import lombok.Data;

public class Example {
    @EntityTable("TABLE_NAME")
    @Data
    // Static nested class, so it can be instantiated without an enclosing Example instance.
    public static class ExampleRow {
        // This column will store vector embeddings.
        // The OpenAI integration
        // will automatically generate vector embeddings
        // for any text inserted into this column.
        @ColumnVector(
            name = "VECTOR_COLUMN_NAME",
            dimension = MODEL_DIMENSIONS,
            metric = SimilarityMetric.SIMILARITY_METRIC,
            provider = "openai",
            modelName = "MODEL_NAME",
            authentication = @KeyValue(key = "providerKey", value = "API_KEY_NAME"),
            parameters = {
                @KeyValue(key = "organizationId", value = "ORGANIZATION_ID"),
                @KeyValue(key = "projectId", value = "PROJECT_ID")
            })
        private DataAPIVector exampleVector;

        // If you want to store the original text
        // in addition to the generated embeddings
        // you must create a separate column.
        // You should change the primary key definition (`PartitionBy`) to meet the needs of your data.
        @PartitionBy(0)
        @Column(name = "TEXT_COLUMN_NAME", type = ColumnTypes.TEXT)
        private String originalText;
    }
    public static void main(String[] args) {
        // Instantiate the client
        DataAPIClient client = new DataAPIClient(new DataAPIClientOptions());

        // Connect to a database
        Database database =
            client.getDatabase(
                System.getenv("API_ENDPOINT"),
                new DatabaseOptions(System.getenv("APPLICATION_TOKEN"), new DataAPIClientOptions()));

        // Create the table
        Table<ExampleRow> table = database.createTable(ExampleRow.class);
    }
}

Replace the following:

TABLE_NAME: The name for your table.
VECTOR_COLUMN_NAME: The name for your vector column.
TEXT_COLUMN_NAME: The name for the text column that will store the original text. Omit this column if you won’t store the original text in addition to the generated embeddings.
API_KEY_NAME: The name of the OpenAI API key that you want to use. Must be the name of an existing OpenAI API key in the Astra Portal. For more information, see Embedding provider authentication.

Alternatively, you can omit this parameter and instead provide the authentication key in the embeddingAuthProvider() method of CreateTableOptions when you instantiate a Table object with the commands to create a table or get a table. The client will send the x-embedding-api-key header with the specified key to any underlying HTTP request that requires vectorize authentication. Header authentication overrides the API_KEY_NAME parameter if you set both. If you use the header instead of specifying the API_KEY_NAME parameter, you must include the header in every command that uses vectorize, including writes and vector search. You can use this authentication method only if all affected columns use the same embedding provider.
MODEL_NAME: The model that you want to use to generate embeddings. The available models are: text-embedding-3-small, text-embedding-3-large, text-embedding-ada-002.
MODEL_DIMENSIONS: The number of dimensions that you want the generated vectors to have. Your chosen embedding model must support the specified number of dimensions.

If you omit dimension, Astra can use a default dimension value. However, some models don’t have default dimensions. You can use the Data API to find supported embedding providers and their configuration parameters, including dimension ranges and default dimensions.
ORGANIZATION_ID: Optional. The ID of the OpenAI organization that owns the API key. Only required if your OpenAI account belongs to multiple organizations or if you are using a legacy user API key to access projects. For more information about organization IDs, see the OpenAI API reference.
PROJECT_ID: Optional. The ID of the OpenAI project that owns the API key. This cannot use the default project. Only required if your OpenAI account belongs to multiple organizations or if you are using a legacy user API key to access projects. For more information about project IDs, see the OpenAI API reference.

Upstage

For more detailed instructions, see Integrate Upstage as an embedding provider.

Use a generic type
Define the row type

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.core.options.DataAPIClientOptions;
import com.datastax.astra.client.core.vector.SimilarityMetric;
import com.datastax.astra.client.core.vectorize.VectorServiceOptions;
import com.datastax.astra.client.databases.Database;
import com.datastax.astra.client.databases.DatabaseOptions;
import com.datastax.astra.client.tables.Table;
import com.datastax.astra.client.tables.definition.TableDefinition;
import com.datastax.astra.client.tables.definition.columns.ColumnDefinitionVector;
import com.datastax.astra.client.tables.definition.rows.Row;

import java.util.Map;

public class Example {

  public static void main(String[] args) {
    // Instantiate the client
    DataAPIClient client = new DataAPIClient(new DataAPIClientOptions());

    // Connect to a database
    Database database =
        client.getDatabase(
            System.getenv("API_ENDPOINT"),
            new DatabaseOptions(System.getenv("APPLICATION_TOKEN"), new DataAPIClientOptions()));

    // Define the columns and primary key for the table
    TableDefinition tableDefinition =
        new TableDefinition()
            // This column will store vector embeddings.
            // The Upstage integration
            // will automatically generate vector embeddings
            // for any text inserted into this column.
            .addColumnVector(
                "VECTOR_COLUMN_NAME",
                new ColumnDefinitionVector()
                    .dimension(MODEL_DIMENSIONS)
                    .metric(SimilarityMetric.SIMILARITY_METRIC)
                    .service(
                        new VectorServiceOptions()
                            .provider("upstageAI")
                            .modelName("MODEL_NAME")
                            .authentication(Map.of("providerKey", "API_KEY_NAME"))
                    )
            )
            // If you want to store the original text
            // in addition to the generated embeddings
            // you must create a separate column.
            .addColumnText("TEXT_COLUMN_NAME")
            // You should change the primary key definition to meet the needs of your data.
            .addPartitionBy("TEXT_COLUMN_NAME");

    // Create the table
    Table<Row> table = database.createTable("TABLE_NAME", tableDefinition);
  }
}

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.core.options.DataAPIClientOptions;
import com.datastax.astra.client.core.vector.DataAPIVector;
import com.datastax.astra.client.core.vector.SimilarityMetric;
import com.datastax.astra.client.databases.Database;
import com.datastax.astra.client.databases.DatabaseOptions;
import com.datastax.astra.client.tables.Table;
import com.datastax.astra.client.tables.definition.columns.ColumnTypes;
import com.datastax.astra.client.tables.mapping.Column;
import com.datastax.astra.client.tables.mapping.ColumnVector;
import com.datastax.astra.client.tables.mapping.EntityTable;
import com.datastax.astra.client.tables.mapping.KeyValue;
import com.datastax.astra.client.tables.mapping.PartitionBy;
import lombok.Data;

public class Example {
    @EntityTable("TABLE_NAME")
    @Data
    // Static nested class, so it can be instantiated without an enclosing Example instance.
    public static class ExampleRow {
        // This column will store vector embeddings.
        // The Upstage integration
        // will automatically generate vector embeddings
        // for any text inserted into this column.
        @ColumnVector(
            name = "VECTOR_COLUMN_NAME",
            dimension = MODEL_DIMENSIONS,
            metric = SimilarityMetric.SIMILARITY_METRIC,
            provider = "upstageAI",
            modelName = "MODEL_NAME",
            authentication = @KeyValue(key = "providerKey", value = "API_KEY_NAME"))
        private DataAPIVector exampleVector;

        // If you want to store the original text
        // in addition to the generated embeddings
        // you must create a separate column.
        // You should change the primary key definition (`PartitionBy`) to meet the needs of your data.
        @PartitionBy(0)
        @Column(name = "TEXT_COLUMN_NAME", type = ColumnTypes.TEXT)
        private String originalText;
    }
    public static void main(String[] args) {
        // Instantiate the client
        DataAPIClient client = new DataAPIClient(new DataAPIClientOptions());

        // Connect to a database
        Database database =
            client.getDatabase(
                System.getenv("API_ENDPOINT"),
                new DatabaseOptions(System.getenv("APPLICATION_TOKEN"), new DataAPIClientOptions()));

        // Create the table
        Table<ExampleRow> table = database.createTable(ExampleRow.class);
    }
}

Replace the following:

TABLE_NAME: The name for your table.
VECTOR_COLUMN_NAME: The name for your vector column.
TEXT_COLUMN_NAME: The name for the text column that will store the original text. Omit this column if you won’t store the original text in addition to the generated embeddings.
API_KEY_NAME: The name of the Upstage API key that you want to use. Must be the name of an existing Upstage API key in the Astra Portal. For more information, see Embedding provider authentication.

Alternatively, you can omit this parameter and instead provide the authentication key in the embeddingAuthProvider() method of CreateTableOptions when you instantiate a Table object with the commands to create a table or get a table. The client will send the x-embedding-api-key header with the specified key to any underlying HTTP request that requires vectorize authentication. Header authentication overrides the API_KEY_NAME parameter if you set both. If you use the header instead of specifying the API_KEY_NAME parameter, you must include the header in every command that uses vectorize, including writes and vector search. You can use this authentication method only if all affected columns use the same embedding provider.
MODEL_NAME: The model that you want to use to generate embeddings. The available models are: solar-embedding-1-large.
MODEL_DIMENSIONS: The number of dimensions that you want the generated vectors to have. Your chosen embedding model must support the specified number of dimensions.

If you omit dimension, Astra can use a default dimension value. However, some models don’t have default dimensions. You can use the Data API to find supported embedding providers and their configuration parameters, including dimension ranges and default dimensions.
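
The provider lookup mentioned for MODEL_DIMENSIONS can be sketched in Java with the JDK HTTP client. This is a minimal sketch, not the client library's own method: it assumes that the lookup is the database-level findEmbeddingProviders command posted to API_ENDPOINT/api/json/v1, and the class and method names are hypothetical. It reads the same API_ENDPOINT and APPLICATION_TOKEN environment variables as the examples above.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class FindEmbeddingProvidersExample {

    // Assumption: findEmbeddingProviders is a database-level command with an empty body.
    public static final String COMMAND_BODY = "{\"findEmbeddingProviders\": {}}";

    // Builds the POST request without sending it.
    public static HttpRequest buildRequest(String apiEndpoint, String applicationToken) {
        return HttpRequest.newBuilder(URI.create(apiEndpoint + "/api/json/v1"))
                .header("Token", applicationToken)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(COMMAND_BODY))
                .build();
    }

    public static void main(String[] args) throws Exception {
        String apiEndpoint = System.getenv("API_ENDPOINT");
        String applicationToken = System.getenv("APPLICATION_TOKEN");
        if (apiEndpoint == null || applicationToken == null) {
            System.err.println("Set API_ENDPOINT and APPLICATION_TOKEN before running.");
            return;
        }

        // Send the command and print the raw JSON response.
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(buildRequest(apiEndpoint, applicationToken),
                      HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}
```

The response is printed as raw JSON. Inspect the entry for your provider and model to see which dimension values, if any, are accepted before you set MODEL_DIMENSIONS.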

Voyage AI

For more detailed instructions, see Integrate Voyage AI as an embedding provider.

Use a generic type
Define the row type

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.core.options.DataAPIClientOptions;
import com.datastax.astra.client.core.vector.SimilarityMetric;
import com.datastax.astra.client.core.vectorize.VectorServiceOptions;
import com.datastax.astra.client.databases.Database;
import com.datastax.astra.client.databases.DatabaseOptions;
import com.datastax.astra.client.tables.Table;
import com.datastax.astra.client.tables.definition.TableDefinition;
import com.datastax.astra.client.tables.definition.columns.ColumnDefinitionVector;
import com.datastax.astra.client.tables.definition.rows.Row;

import java.util.Map;

public class Example {

  public static void main(String[] args) {
    // Instantiate the client
    DataAPIClient client = new DataAPIClient(new DataAPIClientOptions());

    // Connect to a database
    Database database =
        client.getDatabase(
            System.getenv("API_ENDPOINT"),
            new DatabaseOptions(System.getenv("APPLICATION_TOKEN"), new DataAPIClientOptions()));

    // Define the columns and primary key for the table
    TableDefinition tableDefinition =
        new TableDefinition()
            // This column will store vector embeddings.
            // The Voyage AI integration
            // will automatically generate vector embeddings
            // for any text inserted into this column.
            .addColumnVector(
                "VECTOR_COLUMN_NAME",
                new ColumnDefinitionVector()
                    .dimension(MODEL_DIMENSIONS)
                    .metric(SimilarityMetric.SIMILARITY_METRIC)
                    .service(
                        new VectorServiceOptions()
                            .provider("voyageAI")
                            .modelName("MODEL_NAME")
                            .authentication(Map.of("providerKey", "API_KEY_NAME"))
                    )
            )
            // If you want to store the original text
            // in addition to the generated embeddings
            // you must create a separate column.
            .addColumnText("TEXT_COLUMN_NAME")
            // You should change the primary key definition to meet the needs of your data.
            .addPartitionBy("TEXT_COLUMN_NAME");

    // Create the table
    Table<Row> table = database.createTable("TABLE_NAME", tableDefinition);
  }
}

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.core.options.DataAPIClientOptions;
import com.datastax.astra.client.core.vector.DataAPIVector;
import com.datastax.astra.client.core.vector.SimilarityMetric;
import com.datastax.astra.client.databases.Database;
import com.datastax.astra.client.databases.DatabaseOptions;
import com.datastax.astra.client.tables.Table;
import com.datastax.astra.client.tables.definition.columns.ColumnTypes;
import com.datastax.astra.client.tables.mapping.Column;
import com.datastax.astra.client.tables.mapping.ColumnVector;
import com.datastax.astra.client.tables.mapping.EntityTable;
import com.datastax.astra.client.tables.mapping.KeyValue;
import com.datastax.astra.client.tables.mapping.PartitionBy;
import lombok.Data;

public class Example {
    @EntityTable("TABLE_NAME")
    @Data
    // Static nested class, so it can be instantiated without an enclosing Example instance.
    public static class ExampleRow {
        // This column will store vector embeddings.
        // The Voyage AI integration
        // will automatically generate vector embeddings
        // for any text inserted into this column.
        @ColumnVector(
            name = "VECTOR_COLUMN_NAME",
            dimension = MODEL_DIMENSIONS,
            metric = SimilarityMetric.SIMILARITY_METRIC,
            provider = "voyageAI",
            modelName = "MODEL_NAME",
            authentication = @KeyValue(key = "providerKey", value = "API_KEY_NAME"))
        private DataAPIVector exampleVector;

        // If you want to store the original text
        // in addition to the generated embeddings
        // you must create a separate column.
        // You should change the primary key definition (`PartitionBy`) to meet the needs of your data.
        @PartitionBy(0)
        @Column(name = "TEXT_COLUMN_NAME", type = ColumnTypes.TEXT)
        private String originalText;
    }
    public static void main(String[] args) {
        // Instantiate the client
        DataAPIClient client = new DataAPIClient(new DataAPIClientOptions());

        // Connect to a database
        Database database =
            client.getDatabase(
                System.getenv("API_ENDPOINT"),
                new DatabaseOptions(System.getenv("APPLICATION_TOKEN"), new DataAPIClientOptions()));

        // Create the table
        Table<ExampleRow> table = database.createTable(ExampleRow.class);
    }
}

Replace the following:

TABLE_NAME: The name for your table.
VECTOR_COLUMN_NAME: The name for your vector column.
TEXT_COLUMN_NAME: The name for the text column that will store the original text. Omit this column if you won’t store the original text in addition to the generated embeddings.
API_KEY_NAME: The name of the Voyage AI API key that you want to use. Must be the name of an existing Voyage AI API key in the Astra Portal. For more information, see Embedding provider authentication.

Alternatively, you can omit this parameter and instead provide the authentication key in the embeddingAuthProvider() method of CreateTableOptions when you instantiate a Table object with the commands to create a table or get a table. The client will send the x-embedding-api-key header with the specified key to any underlying HTTP request that requires vectorize authentication. Header authentication overrides the API_KEY_NAME parameter if you set both. If you use the header instead of specifying the API_KEY_NAME parameter, you must include the header in every command that uses vectorize, including writes and vector search. You can use this authentication method only if all affected columns use the same embedding provider.
MODEL_NAME: The model that you want to use to generate embeddings. The available models are: voyage-2, voyage-code-2, voyage-finance-2, voyage-large-2, voyage-large-2-instruct, voyage-law-2, voyage-multilingual-2.
MODEL_DIMENSIONS: The number of dimensions that you want the generated vectors to have. Your chosen embedding model must support the specified number of dimensions.

If you omit dimension, Astra can use a default dimension value. However, some models don’t have default dimensions. You can use the Data API to find supported embedding providers and their configuration parameters, including dimension ranges and default dimensions.

Azure OpenAI

For more detailed instructions, see Integrate Azure OpenAI as an embedding provider.

# VECTOR_COLUMN_NAME will store vector embeddings.
# The Azure OpenAI integration will automatically generate
# vector embeddings for any text inserted into this column.
# To store the original text in addition to the generated embeddings,
# you must create a separate column, such as TEXT_COLUMN_NAME.
# You should change the primary key definition to meet the needs of your data.
curl -sS -L -X POST "$API_ENDPOINT/api/json/v1/default_keyspace" \
  --header "Token: $APPLICATION_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "createTable": {
    "name": "TABLE_NAME",
    "definition": {
      "columns": {
        "VECTOR_COLUMN_NAME": {
          "type": "vector",
          "dimension": MODEL_DIMENSIONS,
          "service": {
            "provider": "azureOpenAI",
            "modelName": "MODEL_NAME",
            "authentication": {
              "providerKey": "API_KEY_NAME"
            },
            "parameters": {
              "resourceName": "RESOURCE_NAME",
              "deploymentId": "DEPLOYMENT_ID"
            }
          }
        },
        "TEXT_COLUMN_NAME": "text"
      },
      "primaryKey": "TEXT_COLUMN_NAME"
    }
  }
}'

Replace the following:

TABLE_NAME: The name for your table.
VECTOR_COLUMN_NAME: The name for your vector column.
TEXT_COLUMN_NAME: The name for the text column that will store the original text. Omit this column if you won’t store the original text in addition to the generated embeddings.
API_KEY_NAME: The name of the Azure OpenAI API key that you want to use. Must be the name of an existing Azure OpenAI API key in the Astra Portal. For more information, see Embedding provider authentication.

Alternatively, you can omit this parameter and instead provide the authentication key in an x-embedding-api-key header. Header authentication overrides the API_KEY_NAME parameter if you set both. If you use the header instead of specifying the API_KEY_NAME parameter, you must include the header in every command that uses vectorize, including writes and vector search.
MODEL_NAME: The model that you want to use to generate embeddings. The available models are: text-embedding-3-small, text-embedding-3-large, text-embedding-ada-002.

For Azure OpenAI, you must select the model that matches the one deployed to your DEPLOYMENT_ID in Azure.
MODEL_DIMENSIONS: The number of dimensions that you want the generated vectors to have. Your chosen embedding model must support the specified number of dimensions.

If you omit dimension, Astra can use a default dimension value. However, some models don’t have default dimensions. You can use the Data API to find supported embedding providers and their configuration parameters, including dimension ranges and default dimensions.
RESOURCE_NAME: The name of your Azure OpenAI Service resource, as defined in the resource’s Instance details. For more information, see the Azure OpenAI documentation.
DEPLOYMENT_ID: Your Azure OpenAI resource’s Deployment name. For more information, see the Azure OpenAI documentation.
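
As a sketch of the header alternative described for API_KEY_NAME, the following Java example builds the same createTable request with an x-embedding-api-key header and no authentication block. It uses only the JDK HTTP client. The class name, the method name, and the EMBEDDING_API_KEY environment variable are hypothetical, and the request is built but not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class EmbeddingHeaderExample {

    // Builds a Data API request that authenticates the embedding provider by header.
    public static HttpRequest buildRequest(
            String apiEndpoint,
            String keyspace,
            String applicationToken,
            String embeddingApiKey,
            String commandJson) {
        return HttpRequest.newBuilder(URI.create(apiEndpoint + "/api/json/v1/" + keyspace))
                .header("Token", applicationToken)
                .header("Content-Type", "application/json")
                // Used instead of the "authentication" block in the service definition.
                .header("x-embedding-api-key", embeddingApiKey)
                .POST(HttpRequest.BodyPublishers.ofString(commandJson))
                .build();
    }

    public static void main(String[] args) {
        String apiEndpoint = System.getenv("API_ENDPOINT");
        String applicationToken = System.getenv("APPLICATION_TOKEN");
        // Hypothetical variable name for the provider key.
        String embeddingApiKey = System.getenv("EMBEDDING_API_KEY");
        if (apiEndpoint == null || applicationToken == null || embeddingApiKey == null) {
            System.err.println("Set API_ENDPOINT, APPLICATION_TOKEN, and EMBEDDING_API_KEY.");
            return;
        }

        // Same placeholders as the curl example, without "authentication".
        String commandJson = String.join("\n",
                "{",
                "  \"createTable\": {",
                "    \"name\": \"TABLE_NAME\",",
                "    \"definition\": {",
                "      \"columns\": {",
                "        \"VECTOR_COLUMN_NAME\": {",
                "          \"type\": \"vector\",",
                "          \"dimension\": MODEL_DIMENSIONS,",
                "          \"service\": {",
                "            \"provider\": \"azureOpenAI\",",
                "            \"modelName\": \"MODEL_NAME\",",
                "            \"parameters\": {",
                "              \"resourceName\": \"RESOURCE_NAME\",",
                "              \"deploymentId\": \"DEPLOYMENT_ID\"",
                "            }",
                "          }",
                "        },",
                "        \"TEXT_COLUMN_NAME\": \"text\"",
                "      },",
                "      \"primaryKey\": \"TEXT_COLUMN_NAME\"",
                "    }",
                "  }",
                "}");

        HttpRequest request = buildRequest(
                apiEndpoint, "default_keyspace", applicationToken, embeddingApiKey, commandJson);
        // Print only the header names so that no secret values reach the console.
        System.out.println(request.headers().map().keySet());
    }
}
```

Because the key travels with each request instead of being stored with the table, a helper like buildRequest would also have to be used for later writes and vector searches against the column.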

Hugging Face - Dedicated

For more detailed instructions, see Integrate Hugging Face Dedicated as an embedding provider.

# VECTOR_COLUMN_NAME will store vector embeddings.
# The Hugging Face Dedicated integration will automatically generate
# vector embeddings for any text inserted into this column.
# To store the original text in addition to the generated embeddings,
# you must create a separate column, such as TEXT_COLUMN_NAME.
# You should change the primary key definition to meet the needs of your data.
curl -sS -L -X POST "$API_ENDPOINT/api/json/v1/default_keyspace" \
  --header "Token: $APPLICATION_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "createTable": {
    "name": "TABLE_NAME",
    "definition": {
      "columns": {
        "VECTOR_COLUMN_NAME": {
          "type": "vector",
          "dimension": MODEL_DIMENSIONS,
          "service": {
            "provider": "huggingfaceDedicated",
            "modelName": "endpoint-defined-model",
            "authentication": {
              "providerKey": "API_KEY_NAME"
            },
            "parameters": {
              "endpointName": "ENDPOINT_NAME",
              "regionName": "REGION_NAME",
              "cloudName": "CLOUD_NAME"
            }
          }
        },
        "TEXT_COLUMN_NAME": "text"
      },
      "primaryKey": "TEXT_COLUMN_NAME"
    }
  }
}'

Replace the following:

TABLE_NAME: The name for your table.
VECTOR_COLUMN_NAME: The name for your vector column.
TEXT_COLUMN_NAME: The name for the text column that will store the original text. Omit this column if you won’t store the original text in addition to the generated embeddings.
API_KEY_NAME: The name of the Hugging Face Dedicated user access token that you want to use. Must be the name of an existing Hugging Face Dedicated user access token in the Astra Portal. For more information, see Embedding provider authentication.

Alternatively, you can omit this parameter and instead provide the authentication key in an x-embedding-api-key header. Header authentication overrides the API_KEY_NAME parameter if you set both. If you use the header instead of specifying the API_KEY_NAME parameter, you must include the header in every command that uses vectorize, including writes and vector search.
MODEL_NAME: The model that you want to use to generate embeddings. The available models are: endpoint-defined-model.

For Hugging Face Dedicated, you must deploy the model as a text embeddings inference (TEI) container.

You must set MODEL_NAME to endpoint-defined-model because this integration uses the model specified in your dedicated endpoint configuration.
MODEL_DIMENSIONS: The number of dimensions that you want the generated vectors to have. Your chosen embedding model must support the specified number of dimensions.

If you omit dimension, Astra can use a default dimension value. However, some models don’t have default dimensions. You can use the Data API to find supported embedding providers and their configuration parameters, including dimension ranges and default dimensions.
ENDPOINT_NAME: The programmatically-generated name of your Hugging Face Dedicated endpoint. This is the first part of the endpoint URL. For example, if your endpoint URL is https://mtp1x7muf6qyn3yh.us-east-2.aws.endpoints.huggingface.cloud, the endpoint name is mtp1x7muf6qyn3yh.
REGION_NAME: The cloud provider region your Hugging Face Dedicated endpoint is deployed to. For example, us-east-2.
CLOUD_NAME: The cloud provider your Hugging Face Dedicated endpoint is deployed to. For example, aws.
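
The three endpoint values can be read from the endpoint URL itself. The following Java sketch splits a URL like the one in the ENDPOINT_NAME example into the endpointName, regionName, and cloudName parameters. It assumes that the host always follows the ENDPOINT_NAME.REGION_NAME.CLOUD_NAME.endpoints.huggingface.cloud layout of that example. The class and method names are hypothetical.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

public class HuggingFaceEndpointParams {

    // Assumed host layout: <endpoint>.<region>.<cloud>.endpoints.huggingface.cloud
    private static final String HOST_SUFFIX = ".endpoints.huggingface.cloud";

    // Returns the service parameters derived from a dedicated endpoint URL.
    public static Map<String, Object> fromUrl(String endpointUrl) {
        String host = URI.create(endpointUrl).getHost();
        if (host == null || !host.endsWith(HOST_SUFFIX)) {
            throw new IllegalArgumentException("Unexpected endpoint host: " + host);
        }

        // Drop the fixed suffix, then split the remaining labels.
        String prefix = host.substring(0, host.length() - HOST_SUFFIX.length());
        String[] labels = prefix.split("\\.");
        if (labels.length != 3) {
            throw new IllegalArgumentException(
                    "Expected <endpoint>.<region>.<cloud> before the suffix: " + host);
        }

        Map<String, Object> params = new LinkedHashMap<>();
        params.put("endpointName", labels[0]);
        params.put("regionName", labels[1]);
        params.put("cloudName", labels[2]);
        return params;
    }

    public static void main(String[] args) {
        Map<String, Object> params = fromUrl(
                "https://mtp1x7muf6qyn3yh.us-east-2.aws.endpoints.huggingface.cloud");
        params.forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

The map keys match the keys of the parameters object in the request above, so the values can be copied into the request body as they are.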

Hugging Face - Serverless

For more detailed instructions, see Integrate Hugging Face Serverless as an embedding provider.

# VECTOR_COLUMN_NAME will store vector embeddings.
# The Hugging Face Serverless integration will automatically generate
# vector embeddings for any text inserted into this column.
# To store the original text in addition to the generated embeddings,
# you must create a separate column, such as TEXT_COLUMN_NAME.
# You should change the primary key definition to meet the needs of your data.
curl -sS -L -X POST "$API_ENDPOINT/api/json/v1/default_keyspace" \
  --header "Token: $APPLICATION_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "createTable": {
    "name": "TABLE_NAME",
    "definition": {
      "columns": {
        "VECTOR_COLUMN_NAME": {
          "type": "vector",
          "dimension": MODEL_DIMENSIONS,
          "service": {
            "provider": "huggingface",
            "modelName": "MODEL_NAME",
            "authentication": {
              "providerKey": "API_KEY_NAME"
            }
          }
        },
        "TEXT_COLUMN_NAME": "text"
      },
      "primaryKey": "TEXT_COLUMN_NAME"
    }
  }
}'

Replace the following:

TABLE_NAME: The name for your table.
VECTOR_COLUMN_NAME: The name for your vector column.
TEXT_COLUMN_NAME: The name for the text column that will store the original text. Omit this column if you won’t store the original text in addition to the generated embeddings.
API_KEY_NAME: The name of the Hugging Face Serverless user access token that you want to use. Must be the name of an existing Hugging Face Serverless user access token in the Astra Portal. For more information, see Embedding provider authentication.

Alternatively, you can omit this parameter and instead provide the authentication key in an x-embedding-api-key header. Header authentication overrides the API_KEY_NAME parameter if you set both. If you use the header instead of specifying the API_KEY_NAME parameter, you must include the header in every command that uses vectorize, including writes and vector search.
MODEL_NAME: The model that you want to use to generate embeddings. The available models are: sentence-transformers/all-MiniLM-L6-v2, intfloat/multilingual-e5-large, intfloat/multilingual-e5-large-instruct, BAAI/bge-small-en-v1.5, BAAI/bge-base-en-v1.5, BAAI/bge-large-en-v1.5.
MODEL_DIMENSIONS: The number of dimensions that you want the generated vectors to have. Your chosen embedding model must support the specified number of dimensions.

If you omit dimension, Astra can use a default dimension value. However, some models don’t have default dimensions. You can use the Data API to find supported embedding providers and their configuration parameters, including dimension ranges and default dimensions.

Jina AI

For more detailed instructions, see Integrate Jina AI as an embedding provider.

# VECTOR_COLUMN_NAME will store vector embeddings.
# The Jina AI integration will automatically generate
# vector embeddings for any text inserted into this column.
# To store the original text in addition to the generated embeddings,
# you must create a separate column, such as TEXT_COLUMN_NAME.
# You should change the primary key definition to meet the needs of your data.
curl -sS -L -X POST "$API_ENDPOINT/api/json/v1/default_keyspace" \
  --header "Token: $APPLICATION_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "createTable": {
    "name": "TABLE_NAME",
    "definition": {
      "columns": {
        "VECTOR_COLUMN_NAME": {
          "type": "vector",
          "dimension": MODEL_DIMENSIONS,
          "service": {
            "provider": "jinaAI",
            "modelName": "MODEL_NAME",
            "authentication": {
              "providerKey": "API_KEY_NAME"
            }
          }
        },
        "TEXT_COLUMN_NAME": "text"
      },
      "primaryKey": "TEXT_COLUMN_NAME"
    }
  }
}'

Replace the following:

TABLE_NAME: The name for your table.
VECTOR_COLUMN_NAME: The name for your vector column.
TEXT_COLUMN_NAME: The name for the text column that will store the original text. Omit this column if you won’t store the original text in addition to the generated embeddings.
API_KEY_NAME: The name of the Jina AI API key that you want to use. Must be the name of an existing Jina AI API key in the Astra Portal. For more information, see Embedding provider authentication.

Alternatively, you can omit this parameter and instead provide the authentication key in an x-embedding-api-key header. Header authentication overrides the API_KEY_NAME parameter if you set both. If you use the header instead of specifying the API_KEY_NAME parameter, you must include the header in every command that uses vectorize, including writes and vector search.
MODEL_NAME: The model that you want to use to generate embeddings. The available models are: jina-embeddings-v2-base-en, jina-embeddings-v2-base-de, jina-embeddings-v2-base-es, jina-embeddings-v2-base-code, jina-embeddings-v2-base-zh.
MODEL_DIMENSIONS: The number of dimensions that you want the generated vectors to have. Your chosen embedding model must support the specified number of dimensions.

If you omit dimension, Astra can use a default dimension value. However, some models don’t have default dimensions. You can use the Data API to find supported embedding providers and their configuration parameters, including dimension ranges and default dimensions.

Mistral AI

For more detailed instructions, see Integrate Mistral AI as an embedding provider.

# VECTOR_COLUMN_NAME will store vector embeddings.
# The Mistral AI integration will automatically generate
# vector embeddings for any text inserted into this column.
# To store the original text in addition to the generated embeddings,
# you must create a separate column, such as TEXT_COLUMN_NAME.
# You should change the primary key definition to meet the needs of your data.
curl -sS -L -X POST "$API_ENDPOINT/api/json/v1/default_keyspace" \
  --header "Token: $APPLICATION_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "createTable": {
    "name": "TABLE_NAME",
    "definition": {
      "columns": {
        "VECTOR_COLUMN_NAME": {
          "type": "vector",
          "dimension": MODEL_DIMENSIONS,
          "service": {
            "provider": "mistral",
            "modelName": "MODEL_NAME",
            "authentication": {
              "providerKey": "API_KEY_NAME"
            }
          }
        },
        "TEXT_COLUMN_NAME": "text"
      },
      "primaryKey": "TEXT_COLUMN_NAME"
    }
  }
}'

Replace the following:

TABLE_NAME: The name for your table.
VECTOR_COLUMN_NAME: The name for your vector column.
TEXT_COLUMN_NAME: The name for the text column that will store the original text. Omit this column if you won’t store the original text in addition to the generated embeddings.
API_KEY_NAME: The name of the Mistral AI API key that you want to use. Must be the name of an existing Mistral AI API key in the Astra Portal. For more information, see Embedding provider authentication.

Alternatively, you can omit this parameter and instead provide the authentication key in an x-embedding-api-key header. Header authentication overrides the API_KEY_NAME parameter if you set both. If you use the header instead of specifying the API_KEY_NAME parameter, you must include the header in every command that uses vectorize, including writes and vector search.
MODEL_NAME: The model that you want to use to generate embeddings. The available models are: mistral-embed.
MODEL_DIMENSIONS: The number of dimensions that you want the generated vectors to have. Your chosen embedding model must support the specified number of dimensions.

If you omit the dimension parameter, Astra can use a default dimension value. However, some models don’t have default dimensions. You can use the Data API to find supported embedding providers and their configuration parameters, including dimension ranges and default dimensions.
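
The request body above can also be assembled programmatically. The following is a minimal sketch that uses only the Python standard library and leaves out the dimension key when no value is given, so that Astra can fall back to the model default if one exists. The helper name build_vectorize_table_body and its arguments are hypothetical and are not part of any client library.

```python
import json


def build_vectorize_table_body(
    table_name,
    vector_column,
    text_column,
    provider,
    model_name,
    provider_key=None,
    dimension=None,
):
    """Return a createTable body with one vectorize column and one text column."""
    service = {"provider": provider, "modelName": model_name}
    if provider_key is not None:
        # Omit this when you authenticate with the x-embedding-api-key header.
        service["authentication"] = {"providerKey": provider_key}

    vector_spec = {"type": "vector", "service": service}
    if dimension is not None:
        # Leave the key out entirely to let Astra use the model default, if any.
        vector_spec["dimension"] = dimension

    return {
        "createTable": {
            "name": table_name,
            "definition": {
                "columns": {vector_column: vector_spec, text_column: "text"},
                # Mirrors the example above; change the key to suit your data.
                "primaryKey": text_column,
            },
        }
    }


body = build_vectorize_table_body(
    "example_table",
    "embedding",
    "content",
    "mistral",
    "mistral-embed",
    provider_key="API_KEY_NAME",
)
print(json.dumps(body, indent=2))
```

The printed JSON can be passed to curl with --data. The column names embedding and content are placeholders for VECTOR_COLUMN_NAME and TEXT_COLUMN_NAME.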

NVIDIA

For more detailed instructions, see Integrate NVIDIA as an embedding provider. Your database must be in a supported region.

curl -sS -L -X POST "$API_ENDPOINT/api/json/v1/default_keyspace" \
  --header "Token: $APPLICATION_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "createTable": {
    "name": "TABLE_NAME",
    "definition": {
      "columns": {
        # This column will store vector embeddings.
        # The NVIDIA integration
        # will automatically generate vector embeddings
        # for any text inserted to this column.
        "VECTOR_COLUMN_NAME": {
          "type": "vector",
          "service": {
            "provider": "nvidia",
            "modelName": "nvidia/nv-embedqa-e5-v5"
          }
        },
        # If you want to store the original text
        # in addition to the generated embeddings
        # you must create a separate column.
        "TEXT_COLUMN_NAME": "text"
      },
      # You should change the primary key definition to meet the needs of your data.
      "primaryKey": "TEXT_COLUMN_NAME"
    }
  }
}'

OpenAI

For more detailed instructions, see Integrate OpenAI as an embedding provider.

curl -sS -L -X POST "$API_ENDPOINT/api/json/v1/default_keyspace" \
  --header "Token: $APPLICATION_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "createTable": {
    "name": "TABLE_NAME",
    "definition": {
      "columns": {
        # This column will store vector embeddings.
        # The OpenAI integration
        # will automatically generate vector embeddings
        # for any text inserted to this column.
        "VECTOR_COLUMN_NAME": {
          "type": "vector",
          "dimension": MODEL_DIMENSIONS,
          "service": {
            "provider": "openai",
            "modelName": "MODEL_NAME",
            "authentication": {
              "providerKey": "API_KEY_NAME"
            },
            "parameters": {
              "organizationId": "ORGANIZATION_ID",
              "projectId": "PROJECT_ID"
            }
          }
        },
        # If you want to store the original text
        # in addition to the generated embeddings
        # you must create a separate column.
        "TEXT_COLUMN_NAME": "text"
      },
      # You should change the primary key definition to meet the needs of your data.
      "primaryKey": "TEXT_COLUMN_NAME"
    }
  }
}'

Replace the following:

TABLE_NAME: The name for your table.
VECTOR_COLUMN_NAME: The name for your vector column.
TEXT_COLUMN_NAME: The name for the text column that will store the original text. Omit this column if you won’t store the original text in addition to the generated embeddings.
API_KEY_NAME: The name of the OpenAI API key that you want to use. Must be the name of an existing OpenAI API key in the Astra Portal. For more information, see Embedding provider authentication.

Alternatively, you can omit this parameter and instead provide the authentication key in an x-embedding-api-key header. Header authentication overrides the API_KEY_NAME parameter if you set both. If you use the header instead of specifying the API_KEY_NAME parameter, you must include the header in every command that uses vectorize, including writes and vector search.
MODEL_NAME: The model that you want to use to generate embeddings. The available models are: text-embedding-3-small, text-embedding-3-large, text-embedding-ada-002.
MODEL_DIMENSIONS: The number of dimensions that you want the generated vectors to have. Your chosen embedding model must support the specified number of dimensions.

If you omit the dimension parameter, Astra can use a default dimension value. However, some models don’t have default dimensions. You can use the Data API to find supported embedding providers and their configuration parameters, including dimension ranges and default dimensions.
ORGANIZATION_ID: Optional. The ID of the OpenAI organization that owns the API key. Only required if your OpenAI account belongs to multiple organizations or if you are using a legacy user API key to access projects. For more information about organization IDs, see the OpenAI API reference.
PROJECT_ID: Optional. The ID of the OpenAI project that owns the API key. This cannot use the default project. Only required if your OpenAI account belongs to multiple organizations or if you are using a legacy user API key to access projects. For more information about project IDs, see the OpenAI API reference.
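
The optional parts of the OpenAI configuration can be sketched as follows. This example assumes you build the request yourself with the Python standard library: it adds the x-embedding-api-key header only when a key is supplied, and it includes authentication, organizationId, and projectId only when they are set. The function names build_headers and build_openai_service are hypothetical.

```python
def build_headers(application_token, embedding_api_key=None):
    """Return request headers, adding header authentication only if requested."""
    headers = {
        "Token": application_token,
        "Content-Type": "application/json",
    }
    if embedding_api_key is not None:
        # Header authentication overrides providerKey if both are set.
        headers["x-embedding-api-key"] = embedding_api_key
    return headers


def build_openai_service(
    model_name, provider_key=None, organization_id=None, project_id=None
):
    """Return the "service" object for an OpenAI vectorize column."""
    service = {"provider": "openai", "modelName": model_name}
    if provider_key is not None:
        service["authentication"] = {"providerKey": provider_key}

    # Both parameters are optional, so include only the ones provided.
    parameters = {}
    if organization_id is not None:
        parameters["organizationId"] = organization_id
    if project_id is not None:
        parameters["projectId"] = project_id
    if parameters:
        service["parameters"] = parameters
    return service


# Header authentication: no providerKey in the body, key sent as a header.
headers = build_headers("APPLICATION_TOKEN", embedding_api_key="EMBEDDING_API_KEY")
service = build_openai_service("text-embedding-3-small")
print(headers)
print(service)
```

If you use header authentication, remember that the same header must accompany every later command that uses vectorize, not only the createTable request.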

Upstage

For more detailed instructions, see Integrate Upstage as an embedding provider.

curl -sS -L -X POST "$API_ENDPOINT/api/json/v1/default_keyspace" \
  --header "Token: $APPLICATION_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "createTable": {
    "name": "TABLE_NAME",
    "definition": {
      "columns": {
        # This column will store vector embeddings.
        # The Upstage integration
        # will automatically generate vector embeddings
        # for any text inserted to this column.
        "VECTOR_COLUMN_NAME": {
          "type": "vector",
          "dimension": MODEL_DIMENSIONS,
          "service": {
            "provider": "upstageAI",
            "modelName": "MODEL_NAME",
            "authentication": {
              "providerKey": "API_KEY_NAME"
            }
          }
        },
        # If you want to store the original text
        # in addition to the generated embeddings
        # you must create a separate column.
        "TEXT_COLUMN_NAME": "text"
      },
      # You should change the primary key definition to meet the needs of your data.
      "primaryKey": "TEXT_COLUMN_NAME"
    }
  }
}'

Replace the following:

TABLE_NAME: The name for your table.
VECTOR_COLUMN_NAME: The name for your vector column.
TEXT_COLUMN_NAME: The name for the text column that will store the original text. Omit this column if you won’t store the original text in addition to the generated embeddings.
API_KEY_NAME: The name of the Upstage API key that you want to use. Must be the name of an existing Upstage API key in the Astra Portal. For more information, see Embedding provider authentication.

Alternatively, you can omit this parameter and instead provide the authentication key in an x-embedding-api-key header. Header authentication overrides the API_KEY_NAME parameter if you set both. If you use the header instead of specifying the API_KEY_NAME parameter, you must include the header in every command that uses vectorize, including writes and vector search.
MODEL_NAME: The model that you want to use to generate embeddings. The available models are: solar-embedding-1-large.
MODEL_DIMENSIONS: The number of dimensions that you want the generated vectors to have. Your chosen embedding model must support the specified number of dimensions.

If you omit the dimension parameter, Astra can use a default dimension value. However, some models don’t have default dimensions. You can use the Data API to find supported embedding providers and their configuration parameters, including dimension ranges and default dimensions.

Voyage AI

For more detailed instructions, see Integrate Voyage AI as an embedding provider.

curl -sS -L -X POST "$API_ENDPOINT/api/json/v1/default_keyspace" \
  --header "Token: $APPLICATION_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "createTable": {
    "name": "TABLE_NAME",
    "definition": {
      "columns": {
        # This column will store vector embeddings.
        # The Voyage AI integration
        # will automatically generate vector embeddings
        # for any text inserted to this column.
        "VECTOR_COLUMN_NAME": {
          "type": "vector",
          "dimension": MODEL_DIMENSIONS,
          "service": {
            "provider": "voyageAI",
            "modelName": "MODEL_NAME",
            "authentication": {
              "providerKey": "API_KEY_NAME"
            }
          }
        },
        # If you want to store the original text
        # in addition to the generated embeddings
        # you must create a separate column.
        "TEXT_COLUMN_NAME": "text"
      },
      # You should change the primary key definition to meet the needs of your data.
      "primaryKey": "TEXT_COLUMN_NAME"
    }
  }
}'

Replace the following:

TABLE_NAME: The name for your table.
VECTOR_COLUMN_NAME: The name for your vector column.
TEXT_COLUMN_NAME: The name for the text column that will store the original text. Omit this column if you won’t store the original text in addition to the generated embeddings.
API_KEY_NAME: The name of the Voyage AI API key that you want to use. Must be the name of an existing Voyage AI API key in the Astra Portal. For more information, see Embedding provider authentication.

Alternatively, you can omit this parameter and instead provide the authentication key in an x-embedding-api-key header. Header authentication overrides the API_KEY_NAME parameter if you set both. If you use the header instead of specifying the API_KEY_NAME parameter, you must include the header in every command that uses vectorize, including writes and vector search.
MODEL_NAME: The model that you want to use to generate embeddings. The available models are: voyage-2, voyage-code-2, voyage-finance-2, voyage-large-2, voyage-large-2-instruct, voyage-law-2, voyage-multilingual-2.
MODEL_DIMENSIONS: The number of dimensions that you want the generated vectors to have. Your chosen embedding model must support the specified number of dimensions.

If you omit the dimension parameter, Astra can use a default dimension value. However, some models don’t have default dimensions. You can use the Data API to find supported embedding providers and their configuration parameters, including dimension ranges and default dimensions.

Create a table that uses a user-defined type (UDT)

In addition to the supported types, you can create a user-defined type to use in your table.

You can use a user-defined type as the type of a column or as the value type of a map, list, or set column. You can’t use a user-defined type as the key type of a map column, or as the type of a partition key or clustering key column.
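
These placement rules can be checked on the client before you send a definition. The following is a minimal sketch, assuming the dictionary form of a table definition (columns, primaryKey, and the userDefined type). The function names is_udt and find_udt_violations are hypothetical, and the check is illustrative rather than a copy of the server-side validation.

```python
def is_udt(type_spec):
    """Return True if a column or element type spec is a user-defined type."""
    return isinstance(type_spec, dict) and type_spec.get("type") == "userDefined"


def find_udt_violations(definition):
    """List places where a user-defined type is used but not allowed."""
    problems = []
    columns = definition["columns"]

    # Rule 1: a map column cannot use a user-defined type as its key type.
    for name, spec in columns.items():
        if isinstance(spec, dict) and spec.get("type") == "map":
            if is_udt(spec.get("keyType")):
                problems.append(f"{name}: map key type cannot be user-defined")

    # Rule 2: partition key and clustering key columns cannot be user-defined.
    primary_key = definition["primaryKey"]
    if isinstance(primary_key, str):
        key_columns = [primary_key]
    else:
        key_columns = list(primary_key.get("partitionBy", []))
        key_columns += list(primary_key.get("partitionSort", {}))
    for name in key_columns:
        if is_udt(columns.get(name)):
            problems.append(f"{name}: primary key column cannot be user-defined")

    return problems


person = {"type": "userDefined", "udtName": "person"}
definition = {
    "columns": {
        "id": {"type": "uuid"},
        "group_leader": person,
        "group_members": {"type": "set", "valueType": person},
        "group_roles": {"type": "map", "keyType": "text", "valueType": person},
    },
    "primaryKey": {"partitionBy": ["id"], "partitionSort": {}},
}
print(find_udt_violations(definition))
```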

The following examples demonstrate how to use a user-defined type called person as the type of the group_leader column, as the value type of the group_members set column, and as the value type of the group_roles map column.

Python
TypeScript
Java
curl

The Python client supports multiple ways to create a table. In all cases, you must define the table schema, and then pass the definition to the create_table method.

The following example uses untyped rows, but you can define a client-side type for your table to help statically catch errors. For examples, see Typing support.

CreateTableDefinition object
Fluent interface
Dictionary

You can define the table as a CreateTableDefinition object and then create the table from that object.

from astrapy import DataAPIClient
from astrapy.info import (
    CreateTableDefinition,
    ColumnType,
    TableScalarColumnTypeDescriptor,
    TablePrimaryKeyDescriptor,
    TableUDTColumnDescriptor,
    TableValuedColumnTypeDescriptor,
    TableValuedColumnType,
    TableKeyValuedColumnTypeDescriptor,
    TableKeyValuedColumnType,
)

# Get an existing database
client = DataAPIClient()
database = client.get_database("API_ENDPOINT", token="APPLICATION_TOKEN")

table_definition = CreateTableDefinition(
    # Define all of the columns in the table
    columns={
        "id": TableScalarColumnTypeDescriptor(column_type=ColumnType.UUID),
        "group_leader": TableUDTColumnDescriptor(udt_name="person"),
        "group_members": TableValuedColumnTypeDescriptor(
            column_type=TableValuedColumnType.SET,
            value_type=TableUDTColumnDescriptor(
                udt_name="person",
            ),
        ),
        "group_roles": TableKeyValuedColumnTypeDescriptor(
            column_type=TableKeyValuedColumnType.MAP,
            key_type=ColumnType.TEXT,
            value_type=TableUDTColumnDescriptor(
                udt_name="person",
            ),
        ),
    },
    primary_key=TablePrimaryKeyDescriptor(partition_by=["id"], partition_sort={}),
)

table = database.create_table(
    "example_table",
    definition=table_definition,
)

You can use a fluent interface to build the table definition and then create the table from the definition.

from astrapy import DataAPIClient
from astrapy.info import CreateTableDefinition, ColumnType, TableUDTColumnDescriptor

# Get an existing database
client = DataAPIClient()
database = client.get_database("API_ENDPOINT", token="APPLICATION_TOKEN")

table_definition = (
    CreateTableDefinition.builder()
    # Define all of the columns in the table
    .add_scalar_column("id", ColumnType.UUID)
    .add_userdefinedtype_column("group_leader", udt_name="person")
    .add_set_column(
        "group_members",
        value_type=TableUDTColumnDescriptor(
            udt_name="person",
        ),
    )
    .add_map_column(
        "group_roles",
        key_type=ColumnType.TEXT,
        value_type=TableUDTColumnDescriptor(
            udt_name="person",
        ),
    )
    # Define the primary key for the table.
    .add_partition_by(["id"])
    # Finally, build the table definition.
    .build()
)

table = database.create_table(
    "example_table",
    definition=table_definition,
)

You can define the table as a dictionary and then create the table from the dictionary.

from astrapy import DataAPIClient

# Get an existing database
client = DataAPIClient()
database = client.get_database("API_ENDPOINT", token="APPLICATION_TOKEN")

# Define the columns and primary key for the table
table_definition = {
    "columns": {
        "id": {"type": "uuid"},
        "group_leader": {
            "type": "userDefined",
            "udtName": "person",
        },
        "group_members": {
            "type": "set",
            "valueType": {
                "type": "userDefined",
                "udtName": "person",
            },
        },
        "group_roles": {
            "type": "map",
            "keyType": "text",
            "valueType": {
                "type": "userDefined",
                "udtName": "person",
            },
        },
    },
    "primaryKey": {
        "partitionBy": ["id"],
        "partitionSort": {},
    },
}

table = database.create_table(
    "example_table",
    definition=table_definition,
)

The TypeScript client supports multiple ways to create a table. The method you choose depends on your typing preferences and whether you modified the serialization/deserialization (ser/des) configuration.

For more information, see Collection and table typing.

Automatic type inference
Manually typed tables
Untyped tables

The TypeScript client can automatically infer the TypeScript-equivalent type of the table’s schema and primary key.

import {
  DataAPIClient,
  InferTablePrimaryKey,
  InferTableSchema,
  Table,
} from "@datastax/astra-db-ts";

// Get an existing database
const client = new DataAPIClient();
const database = client.db("API_ENDPOINT", {
  token: "APPLICATION_TOKEN",
});

const tableDefinition = Table.schema({
  // Define all of the columns in the table
  columns: {
    id: "uuid",
    group_leader: {
      type: "userDefined",
      udtName: "person",
    },
    group_members: {
      type: "set",
      valueType: {
        type: "userDefined",
        udtName: "person",
      },
    },
    group_roles: {
      type: "map",
      keyType: "text",
      valueType: {
        type: "userDefined",
        udtName: "person",
      },
    },
  },
  // Define the primary key for the table.
  primaryKey: {
    partitionBy: ["id"],
  },
});

// Infer the TypeScript-equivalent type of the table's schema and primary key
type TableSchema = InferTableSchema<typeof tableDefinition>;
type TablePrimaryKey = InferTablePrimaryKey<typeof tableDefinition>;

(async function () {
  // Provide the types and the definition
  const table = await database.createTable<TableSchema, TablePrimaryKey>(
    "example_table",
    { definition: tableDefinition },
  );
})();

You can manually define the type for your table’s schema and primary key. To create the table, provide the table definition and the types to the createTable method.

This may be necessary if you modify the table’s default ser/des configuration.

import { DataAPIClient, Table, UUID } from "@datastax/astra-db-ts";

// Get an existing database
const client = new DataAPIClient();
const database = client.db("API_ENDPOINT", {
  token: "APPLICATION_TOKEN",
});

const tableDefinition = Table.schema({
  // Define all of the columns in the table
  columns: {
    id: "uuid",
    group_leader: {
      type: "userDefined",
      udtName: "person",
    },
    group_members: {
      type: "set",
      valueType: {
        type: "userDefined",
        udtName: "person",
      },
    },
    group_roles: {
      type: "map",
      keyType: "text",
      valueType: {
        type: "userDefined",
        udtName: "person",
      },
    },
  },
  // Define the primary key for the table.
  primaryKey: {
    partitionBy: ["id"],
  },
});

// Manually define the type of the table's schema and primary key
type Person = { name: string; level: number };
type TableSchema = {
  id: UUID;
  group_leader: Person;
  group_members: Set<Person>;
  group_roles: Map<string, Person>;
};
type TablePrimaryKey = Pick<TableSchema, "id">;

(async function () {
  // Provide the types and the definition to create the table
  const table = await database.createTable<TableSchema, TablePrimaryKey>(
    "example_table",
    { definition: tableDefinition },
  );
})();

To create a table without any typing, pass SomeRow as the single generic type parameter to the createTable method. This types the table’s rows as Record<string, any>.

This is the most flexible but least type-safe option.

import { DataAPIClient, SomeRow, Table } from "@datastax/astra-db-ts";

// Get an existing database
const client = new DataAPIClient();
const database = client.db("API_ENDPOINT", {
  token: "APPLICATION_TOKEN",
});

const tableDefinition = Table.schema({
  // Define all of the columns in the table
  columns: {
    id: "uuid",
    group_leader: {
      type: "userDefined",
      udtName: "person",
    },
    group_members: {
      type: "set",
      valueType: {
        type: "userDefined",
        udtName: "person",
      },
    },
    group_roles: {
      type: "map",
      keyType: "text",
      valueType: {
        type: "userDefined",
        udtName: "person",
      },
    },
  },
  // Define the primary key for the table.
  primaryKey: {
    partitionBy: ["id"],
  },
});

(async function () {
  // Pass SomeRow and the definition to create an untyped table
  const table = await database.createTable<SomeRow>("example_table", {
    definition: tableDefinition,
  });
})();

The Java client supports multiple ways to create a table. In all cases, you must define the table schema.

Use a generic type
Define the row type

If you don’t specify the Class parameter when creating an instance of the generic class Table, the client defaults to Table<Row> as the type. In this case, the working object type T is Row.

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.databases.Database;
import com.datastax.astra.client.tables.Table;
import com.datastax.astra.client.tables.definition.TableDefinition;
import com.datastax.astra.client.tables.definition.columns.TableColumnTypes;
import com.datastax.astra.client.tables.definition.rows.Row;

public class Example {
  public static void main(String[] args) {
    // Get an existing database
    Database database = new DataAPIClient("APPLICATION_TOKEN").getDatabase("API_ENDPOINT");

    TableDefinition tableDefinition =
        new TableDefinition()
            // Define all of the columns in the table
            .addColumnUuid("id")
            .addColumnUserDefinedType("group_leader", "person")
            .addColumnSetUserDefinedType("group_members", "person")
            .addColumnMapUserDefinedType("group_roles", "person", TableColumnTypes.TEXT)
            // Define the primary key for the table.
            .addPartitionBy("id");
    Table<Row> table = database.createTable("example_table", tableDefinition);
  }
}

Instead of using the default type Row.class, you can define your own working object, which will be serialized as a Row.

This working object can be annotated when the field names do not exactly match the column names or when you want to fully describe your table to enable its creation solely from the entity definition.

The following example defines a Person class for the person user-defined type and a Group class for the table rows, and then uses the Group class to create the table.

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.databases.Database;
import com.datastax.astra.client.tables.Table;
import com.datastax.astra.client.tables.definition.columns.TableColumnTypes;
import com.datastax.astra.client.tables.definition.types.TableUserDefinedType;
import com.datastax.astra.client.tables.mapping.Column;
import com.datastax.astra.client.tables.mapping.EntityTable;
import com.datastax.astra.client.tables.mapping.PartitionBy;
import java.util.Map;
import java.util.Set;
import java.util.UUID;
import lombok.Data;

public class Example {
  // Define the user-defined type "person"
  @TableUserDefinedType("person")
  public class Person {
    @Column(name = "user_name", type = TableColumnTypes.TEXT)
    private String userName;

    @Column(name = "age", type = TableColumnTypes.INT)
    private Integer age;
  }

  // Define the table
  @EntityTable("example_table")
  @Data
  class Group {
    @PartitionBy(0)
    @Column(name = "id", type = TableColumnTypes.UUID)
    private UUID id;

    @Column(name = "group_leader", type = TableColumnTypes.USERDEFINED, udtName = "person")
    private Person groupLeader;

    @Column(
        name = "group_members",
        type = TableColumnTypes.SET,
        valueType = TableColumnTypes.USERDEFINED,
        udtName = "person")
    private Set<Person> groupMembers;

    @Column(
        name = "group_roles",
        type = TableColumnTypes.MAP,
        keyType = TableColumnTypes.TEXT,
        valueType = TableColumnTypes.USERDEFINED,
        udtName = "person")
    private Map<String, Person> groupRoles;
  }

  public static void main(String[] args) {
    Database database = new DataAPIClient("APPLICATION_TOKEN").getDatabase("API_ENDPOINT");

    Table<Group> table = database.createTable(Group.class);
  }
}

curl -sS -L -X POST "API_ENDPOINT/api/json/v1/KEYSPACE_NAME" \
  --header "Token: APPLICATION_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "createTable": {
    "name": "example_table",
    "definition": {
      "columns": {
        "id": {
          "type": "uuid"
        },
        "group_leader": {
          "type": "userDefined",
          "udtName": "person"
        },
        "group_members": {
          "type": "set",
          "valueType": {
            "type": "userDefined",
            "udtName": "person"
          }
        },
        "group_roles": {
          "type": "map",
          "keyType": "text",
          "valueType": {
            "type": "userDefined",
            "udtName": "person"
          }
        }
      },
      "primaryKey": "id"
    }
  }
}'
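
The curl example above gives primaryKey as a plain column name, while the Python dictionary example describes the same single-column key as an object with partitionBy and partitionSort. The following is a minimal sketch of converting the short form to the object form, assuming the two are equivalent for a single-column primary key. The helper name normalize_primary_key is hypothetical.

```python
def normalize_primary_key(primary_key):
    """Return the object form of a primary key given either form."""
    if isinstance(primary_key, str):
        # Short form: one partition column and no clustering columns.
        return {"partitionBy": [primary_key], "partitionSort": {}}
    # Object form: copy it and fill in an empty partitionSort if missing.
    return {
        "partitionBy": list(primary_key["partitionBy"]),
        "partitionSort": dict(primary_key.get("partitionSort", {})),
    }


print(normalize_primary_key("id"))
print(normalize_primary_key({"partitionBy": ["id"]}))
```

Normalizing to one form can make it easier to compare or reuse definitions that were written for different clients.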

Create a table and specify the keyspace

Python
TypeScript
Java
curl

The following example uses untyped rows, but you can define a client-side type for your table to help statically catch errors. For examples, see Typing support.

from astrapy import DataAPIClient
from astrapy.info import (
    CreateTableDefinition,
    ColumnType,
    TableKeyValuedColumnType,
    TableKeyValuedColumnTypeDescriptor,
    TableScalarColumnTypeDescriptor,
    TableValuedColumnTypeDescriptor,
    TableValuedColumnType,
    TablePrimaryKeyDescriptor,
)

# Get an existing database
client = DataAPIClient()
database = client.get_database("API_ENDPOINT", token="APPLICATION_TOKEN")

table_definition = CreateTableDefinition(
    # Define all of the columns in the table
    columns={
        "title": TableScalarColumnTypeDescriptor(column_type=ColumnType.TEXT),
        "number_of_pages": TableScalarColumnTypeDescriptor(column_type=ColumnType.INT),
        "rating": TableScalarColumnTypeDescriptor(column_type=ColumnType.FLOAT),
        "genres": TableValuedColumnTypeDescriptor(
            column_type=TableValuedColumnType.SET,
            value_type=ColumnType.TEXT,
        ),
        "metadata": TableKeyValuedColumnTypeDescriptor(
            column_type=TableKeyValuedColumnType.MAP,
            key_type=ColumnType.TEXT,
            value_type=ColumnType.TEXT,
        ),
        "is_checked_out": TableScalarColumnTypeDescriptor(
            column_type=ColumnType.BOOLEAN
        ),
        "due_date": TableScalarColumnTypeDescriptor(column_type=ColumnType.DATE),
    },
    # Define the primary key for the table.
    # In this case, the table uses a single-column primary key.
    primary_key=TablePrimaryKeyDescriptor(partition_by=["title"], partition_sort={}),
)

table = database.create_table(
    "example_table", definition=table_definition, keyspace="KEYSPACE_NAME"
)

import { DataAPIClient, SomeRow, Table } from "@datastax/astra-db-ts";

// Get an existing database
const client = new DataAPIClient();
const database = client.db("API_ENDPOINT", {
  token: "APPLICATION_TOKEN",
});

const tableDefinition = Table.schema({
  // Define all of the columns in the table
  columns: {
    title: "text",
    number_of_pages: "int",
    rating: "float",
    genres: { type: "set", valueType: "text" },
    metadata: {
      type: "map",
      keyType: "text",
      valueType: "text",
    },
    is_checked_out: "boolean",
    due_date: "date",
  },
  // Define the primary key for the table.
  // In this case, the table uses a single-column primary key.
  primaryKey: {
    partitionBy: ["title"],
  },
});

(async function () {
  // Pass SomeRow, the definition, and the keyspace to create the table
  const table = await database.createTable<SomeRow>("example_table", {
    definition: tableDefinition,
    keyspace: "KEYSPACE_NAME",
  });
})();

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.databases.Database;
import com.datastax.astra.client.tables.Table;
import com.datastax.astra.client.tables.definition.TableDefinition;
import com.datastax.astra.client.tables.definition.columns.TableColumnTypes;
import com.datastax.astra.client.tables.definition.rows.Row;

public class Example {
  public static void main(String[] args) {
    // Get an existing database
    Database database =
        new DataAPIClient("APPLICATION_TOKEN")
            .getDatabase("API_ENDPOINT", "KEYSPACE_NAME");

    TableDefinition tableDefinition =
        new TableDefinition()
            // Define all of the columns in the table
            .addColumnText("title")
            .addColumnInt("number_of_pages")
            .addColumn("rating", TableColumnTypes.FLOAT)
            .addColumnSet("genres", TableColumnTypes.TEXT)
            .addColumnMap("metadata", TableColumnTypes.TEXT, TableColumnTypes.TEXT)
            .addColumnBoolean("is_checked_out")
            .addColumn("due_date", TableColumnTypes.DATE)
            // Define the primary key for the table.
            // In this case, the table uses a single-column primary key.
            .addPartitionBy("title");

    Table<Row> table = database.createTable("example_table", tableDefinition);
  }
}

This option has no literal equivalent in HTTP. Instead, you specify the keyspace in the request path: API_ENDPOINT/api/json/v1/KEYSPACE_NAME.
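
If you build HTTP requests from a script, the keyspace is the last path segment of the URL. The following is a minimal sketch using plain string handling; the function name keyspace_url is hypothetical, and the endpoint shown is a placeholder rather than a real database.

```python
def keyspace_url(api_endpoint, keyspace):
    """Return the Data API URL that targets one keyspace."""
    # Strip a trailing slash so the path is not doubled.
    return f"{api_endpoint.rstrip('/')}/api/json/v1/{keyspace}"


url = keyspace_url("https://example.invalid/", "KEYSPACE_NAME")
print(url)
```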

Client reference

Python
TypeScript
Java
curl

For more information, see the client reference.

Client reference documentation is not applicable for HTTP.
