Insert a document

Inserts a single document into a collection.

Documents are stored in collections. Each document represents a single row or record of data in an Astra DB Serverless database. For more information, see About collections with the Data API.

If the collection is vector-enabled, pregenerated vector embeddings can be included by using the reserved $vector field. If the collection has vectorize enabled, vector embeddings can be automatically generated from text specified in the reserved $vectorize field. You can later use the $vector or $vectorize field to perform a vector search or hybrid search.

If the collection has lexical enabled, use the reserved $lexical field to store a string to index for the lexical search component of hybrid search.

Alternatively, you can use the $hybrid shorthand to populate the $vectorize and $lexical fields.

Ready to write code? See the examples for this method to get started. If you are new to the Data API, check out the quickstart.

Result

Python
TypeScript
Java
curl

Inserts the specified document and returns a CollectionInsertOneResult object that includes the ID of the inserted document and details about the success of the operation.

The ID value depends on the ID type. For more information, see Document IDs.

Example response:

CollectionInsertOneResult(inserted_id='92b3c4f4-db44-4440-b4c4-f4db54e440b8', raw_results=...)

Inserts the specified document and returns a promise that resolves to a CollectionInsertOneResult<Schema> object that includes the ID of the inserted document.

The ID value depends on the ID type. For more information, see Document IDs.

Example response:

{ insertedId: '92b3c4f4-db44-4440-b4c4-f4db54e440b8' }

Inserts the specified document and returns a wrapper (CollectionInsertOneResult) that includes the ID of the inserted document.

The ID value depends on the ID type. For more information, see Document IDs.

Inserts the specified document and returns a JSON object that includes the ID of the inserted document.

The ID value depends on the ID type. For more information, see Document IDs.

Example response:

{
  "status": {
    "insertedIds": [
      "3f557bef-fd53-47ea-957b-effd53c7eaec"
    ]
  }
}
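As a rough illustration, the JSON response shown above could be read in Python as follows. The helper name `first_inserted_id` is hypothetical, and the sketch assumes only the `status.insertedIds` shape shown in the example response.

```python
import json

# Sample response body, copied from the example response above.
RESPONSE_BODY = """
{
  "status": {
    "insertedIds": [
      "3f557bef-fd53-47ea-957b-effd53c7eaec"
    ]
  }
}
"""


def first_inserted_id(body: str):
    """Return the first inserted ID from an insertOne response body.

    Hypothetical helper: it assumes the `status.insertedIds` shape shown above.
    """
    payload = json.loads(body)
    inserted_ids = payload.get("status", {}).get("insertedIds", [])
    if not inserted_ids:
        raise ValueError("response does not contain any inserted IDs")
    return inserted_ids[0]


print(first_inserted_id(RESPONSE_BODY))
```

The returned value is whatever the ID type of the collection produces, so the helper does not assume that the ID is a string.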

Parameters

Python
TypeScript
Java
curl

Use the insert_one method, which belongs to the astrapy.Collection class.

Method signature

insert_one(
  document: Dict[str, Any],
  *,
  general_method_timeout_ms: int,
  request_timeout_ms: int,
  timeout_ms: int,
) -> CollectionInsertOneResult

document

Dict[str, Any]

A dictionary describing the document to insert.

A document can contain user-defined and reserved fields.

User-defined field names can be any sequence of Unicode characters, with the following exceptions:

Field names cannot start with $.
Field names cannot be exactly *.
If a field name includes & or ., you must use & to escape these characters when reading or updating the field. You cannot use & to escape any other characters.

Dot notation, which is used to reference nested fields, should not be escaped.

The total path length cannot exceed 1000 characters, including dot notation and escaping. For example, a field named price.usd nested under a field named costs has a path length of 16 characters after escaping (costs.price&.usd).
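The escaping rule and the path length limit can be sketched in a few lines of Python. The helper names `escape_field_name` and `build_path` are hypothetical and are not part of the client. The sketch assumes that a literal `&` is escaped as `&&` and a literal `.` as `&.`, as described above.

```python
MAX_PATH_LENGTH = 1000  # limit stated above, counted after escaping


def escape_field_name(name: str) -> str:
    """Escape `&` and `.` inside a single field name (hypothetical helper)."""
    # Escape `&` first so that the `&` added in front of each dot is not escaped again.
    return name.replace("&", "&&").replace(".", "&.")


def build_path(*field_names: str) -> str:
    """Join field names with unescaped dot notation and check the total length."""
    path = ".".join(escape_field_name(name) for name in field_names)
    if len(path) > MAX_PATH_LENGTH:
        raise ValueError(f"path is {len(path)} characters; the limit is {MAX_PATH_LENGTH}")
    return path


path = build_path("costs", "price.usd")
print(path, len(path))
```

For the example above, this produces the path `costs.price&.usd`, which is 16 characters long.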

Reserved fields are tied to specific functionality. Include the following reserved fields in your documents, if applicable:

_id: An optional unique identifier for the document. If _id is omitted, it is created automatically based on the collection’s ID type. For more information, see Document IDs.
$vector: An optional array of numbers representing a vector embedding for vector search. The $vector field is only supported for vector-enabled collections. A document cannot contain both a $vector and a $vectorize field. For more information, see $vector in collections.
$vectorize: An optional string from which to generate vector embeddings for vector search. The $vectorize field is only supported for collections that have an embedding provider integration. A document cannot contain both a $vector and a $vectorize field. For more information, see $vectorize in collections.
$lexical: An optional string to make the document searchable for the lexical search component of hybrid search. The $lexical field is only supported for collections that have lexical search enabled. For more information, see $lexical in collections.
$hybrid: An optional string that populates both $vectorize and $lexical. The $hybrid shorthand is only supported for collections that have vectorize and lexical search enabled. If a document uses $hybrid, it cannot contain a root-level $vectorize or $lexical field. For more information, see $hybrid in collections.
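The combination rules for reserved fields can be checked on the client before sending a document. The following is a minimal sketch, not part of the client library; the function name `reserved_field_conflicts` is hypothetical. It only checks the root-level combinations described above, not whether the collection supports each field.

```python
from typing import Any, Dict, List


def reserved_field_conflicts(document: Dict[str, Any]) -> List[str]:
    """List violations of the reserved-field combination rules (hypothetical helper)."""
    problems: List[str] = []
    # A document cannot contain both $vector and $vectorize.
    if "$vector" in document and "$vectorize" in document:
        problems.append("$vector and $vectorize cannot both be present")
    # $hybrid populates $vectorize and $lexical, so neither can appear next to it.
    if "$hybrid" in document:
        for field in ("$vectorize", "$lexical"):
            if field in document:
                problems.append(f"$hybrid cannot be combined with root-level {field}")
    return problems


print(reserved_field_conflicts({"name": "Jane Doe", "$hybrid": "text", "$lexical": "text"}))
```

An empty list means that none of these specific rules are violated. The server can still reject the document for other reasons.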

general_method_timeout_ms

int

Optional. The maximum time, in milliseconds, that the client should wait for the underlying HTTP request.

Default: The default value for the collection. This default is 30 seconds unless you specified a different default when you initialized the Collection or DataAPIClient object. For more information, see Timeout options.

request_timeout_ms

int

Optional. An alias for general_method_timeout_ms. Since this method issues a single HTTP request, general_method_timeout_ms and request_timeout_ms are equivalent.

timeout_ms

int

Optional. An alias for general_method_timeout_ms.
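As a sketch of how the timeout parameter might be passed, the hypothetical wrapper below forwards a per-call timeout to `insert_one` and returns the `inserted_id` from the result. It assumes `collection` is an object with the `insert_one` signature shown above; the wrapper name `insert_with_timeout` is illustrative only.

```python
from typing import Any, Dict


def insert_with_timeout(collection: Any, document: Dict[str, Any], timeout_ms: int) -> Any:
    """Insert one document with a per-call timeout and return its ID (hypothetical helper)."""
    # general_method_timeout_ms is keyword-only in the signature above.
    result = collection.insert_one(document, general_method_timeout_ms=timeout_ms)
    return result.inserted_id
```

For example, `insert_with_timeout(collection, {"name": "Jane Doe"}, 5000)` would wait at most 5 seconds for the request, overriding the collection default for that call only.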

Use the insertOne method, which belongs to the Collection class.

Method signature

async insertOne(
  document: MaybeId<Schema>,
  options?: {
    timeout?: number | TimeoutDescriptor,
  },
): Promise<CollectionInsertOneResult<Schema>>

document

MaybeId<Schema>

An object describing the document to insert.

A document can contain user-defined and reserved fields.

User-defined field names can be any sequence of Unicode characters, with the following exceptions:

Field names cannot start with $.
Field names cannot be exactly *.
If a field name includes & or ., you must use & to escape these characters when reading or updating the field. You cannot use & to escape any other characters.

Dot notation, which is used to reference nested fields, should not be escaped.

The total path length cannot exceed 1000 characters, including dot notation and escaping. For example, a field named price.usd nested under a field named costs has a path length of 16 characters after escaping (costs.price&.usd).

Reserved fields are tied to specific functionality. Include the following reserved fields in your documents, if applicable:

_id: An optional unique identifier for the document. If _id is omitted, it is created automatically based on the collection’s ID type. For more information, see Document IDs.
$vector: An optional array of numbers representing a vector embedding for vector search. The $vector field is only supported for vector-enabled collections. A document cannot contain both a $vector and a $vectorize field. For more information, see $vector in collections.
$vectorize: An optional string from which to generate vector embeddings for vector search. The $vectorize field is only supported for collections that have an embedding provider integration. A document cannot contain both a $vector and a $vectorize field. For more information, see $vectorize in collections.
$lexical: An optional string to make the document searchable for the lexical search component of hybrid search. The $lexical field is only supported for collections that have lexical search enabled. For more information, see $lexical in collections.
$hybrid: An optional string that populates both $vectorize and $lexical. The $hybrid shorthand is only supported for collections that have vectorize and lexical search enabled. If a document uses $hybrid, it cannot contain a root-level $vectorize or $lexical field. For more information, see $hybrid in collections.

options

CollectionInsertOneOptions

Optional. The options for this operation. See Properties of options for more details.

Properties of options

timeout

number | TimeoutDescriptor

Optional. The timeout(s) to apply to this method. You can specify requestTimeoutMs and generalMethodTimeoutMs. Since this method issues a single HTTP request, these timeouts are equivalent.

The TimeoutDescriptor object can contain these properties:

requestTimeoutMs (number): The maximum time, in milliseconds, that the client should wait for each underlying HTTP request. Default: The default value for the collection. This default is 10 seconds unless you specified a different default when you initialized the Collection or DataAPIClient object.
generalMethodTimeoutMs (number): The maximum time, in milliseconds, that the whole operation can take. Since this method issues a single HTTP request, generalMethodTimeoutMs and requestTimeoutMs are equivalent. If you specify both, the minimum of the two is used. Default: The default value for the collection. This default is 30 seconds unless you specified a different default when you initialized the Collection or DataAPIClient object.

If you specify a number instead of a TimeoutDescriptor object, that number is applied to both requestTimeoutMs and generalMethodTimeoutMs.

Use the insertOne method, which belongs to the com.datastax.astra.client.Collection class.

Method signature

CollectionInsertOneResult insertOne(
  T document
)

CollectionInsertOneResult insertOne(
  T document,
  CollectionInsertOneOptions options
)

document

T

An object describing the document to insert.

A document can contain user-defined and reserved fields.

User-defined field names can be any sequence of Unicode characters, with the following exceptions:

Field names cannot start with $.
Field names cannot be exactly *.
If a field name includes & or ., you must use & to escape these characters when reading or updating the field. You cannot use & to escape any other characters.

Dot notation, which is used to reference nested fields, should not be escaped.

The total path length cannot exceed 1000 characters, including dot notation and escaping. For example, a field named price.usd nested under a field named costs has a path length of 16 characters after escaping (costs.price&.usd).

Reserved fields are tied to specific functionality. Include the following reserved fields in your documents, if applicable:

_id: An optional unique identifier for the document. If _id is omitted, it is created automatically based on the collection’s ID type. For more information, see Document IDs.
$vector: An optional array of numbers representing a vector embedding for vector search. The $vector field is only supported for vector-enabled collections. A document cannot contain both a $vector and a $vectorize field. For more information, see $vector in collections.
$vectorize: An optional string from which to generate vector embeddings for vector search. The $vectorize field is only supported for collections that have an embedding provider integration. A document cannot contain both a $vector and a $vectorize field. For more information, see $vectorize in collections.
$lexical: An optional string to make the document searchable for the lexical search component of hybrid search. The $lexical field is only supported for collections that have lexical search enabled. For more information, see $lexical in collections.
$hybrid: An optional string that populates both $vectorize and $lexical. The $hybrid shorthand is only supported for collections that have vectorize and lexical search enabled. If a document uses $hybrid, it cannot contain a root-level $vectorize or $lexical field. For more information, see $hybrid in collections.

options

CollectionInsertOneOptions

Optional. The options to apply to the insert operation.

Use the insertOne command.

Command signature

curl -sS -L -X POST "API_ENDPOINT/api/json/v1/KEYSPACE_NAME/COLLECTION_NAME" \
--header "Token: APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
  "insertOne": {
    "document": DOCUMENT_JSON_OBJECT
  }
}'

document

object

A JSON object describing the document to insert.

A document can contain user-defined and reserved fields.

User-defined field names can be any sequence of Unicode characters, with the following exceptions:

Field names cannot start with $.
Field names cannot be exactly *.
If a field name includes & or ., you must use & to escape these characters when reading or updating the field. You cannot use & to escape any other characters.

Dot notation, which is used to reference nested fields, should not be escaped.

The total path length cannot exceed 1000 characters, including dot notation and escaping. For example, a field named price.usd nested under a field named costs has a path length of 16 characters after escaping (costs.price&.usd).

Reserved fields are tied to specific functionality. Include the following reserved fields in your documents, if applicable:

_id: An optional unique identifier for the document. If _id is omitted, it is created automatically based on the collection’s ID type. For more information, see Document IDs.
$vector: An optional array of numbers representing a vector embedding for vector search. The $vector field is only supported for vector-enabled collections. A document cannot contain both a $vector and a $vectorize field. For more information, see $vector in collections.
$vectorize: An optional string from which to generate vector embeddings for vector search. The $vectorize field is only supported for collections that have an embedding provider integration. A document cannot contain both a $vector and a $vectorize field. For more information, see $vectorize in collections.
$lexical: An optional string to make the document searchable for the lexical search component of hybrid search. The $lexical field is only supported for collections that have lexical search enabled. For more information, see $lexical in collections.
$hybrid: An optional string that populates both $vectorize and $lexical. The $hybrid shorthand is only supported for collections that have vectorize and lexical search enabled. If a document uses $hybrid, it cannot contain a root-level $vectorize or $lexical field. For more information, see $hybrid in collections.

Examples

The following examples demonstrate how to insert a document into a collection.

Insert a document

Python
TypeScript
Java
curl

from astrapy import DataAPIClient

# Get an existing collection
client = DataAPIClient("APPLICATION_TOKEN")
database = client.get_database("API_ENDPOINT")
collection = database.get_collection("COLLECTION_NAME")

# Insert a document into the collection
result = collection.insert_one({"name": "Jane Doe", "age": 42})

import { DataAPIClient } from "@datastax/astra-db-ts";

// Get an existing collection
const client = new DataAPIClient("APPLICATION_TOKEN");
const database = client.db("API_ENDPOINT");
const collection = database.collection("COLLECTION_NAME");

// Insert a document into the collection
(async function () {
  const result = await collection.insertOne({
    name: "Jane Doe",
    age: 42,
  });
})();

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.collections.Collection;
import com.datastax.astra.client.collections.commands.results.CollectionInsertOneResult;
import com.datastax.astra.client.collections.definition.documents.Document;

public class Example {

  public static void main(String[] args) {
    // Get an existing collection
    Collection<Document> collection =
        new DataAPIClient("APPLICATION_TOKEN")
            .getDatabase("API_ENDPOINT")
            .getCollection("COLLECTION_NAME");

    // Insert a document into the collection
    Document document = new Document().append("name", "Jane Doe").append("age", 42);
    CollectionInsertOneResult result = collection.insertOne(document);
    System.out.println(result.getInsertedId());
  }
}

curl -sS -L -X POST "API_ENDPOINT/api/json/v1/KEYSPACE_NAME/COLLECTION_NAME" \
  --header "Token: APPLICATION_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "insertOne": {
    "document": {
      "purchase_type": "Online",
      "customer": {
        "name": "Jim A.",
        "phone": "123-456-1111",
        "age": 51,
        "credit_score": 782,
        "address": {
          "street": "1234 Broadway",
          "city": "New York",
          "state": "NY"
        }
      },
      "purchase_date": { "$date": 1690045891 },
      "seller": {
        "name": "Jon B.",
        "location": "Manhattan NYC"
      },
      "items": [
        {
          "car": "BMW 330i Sedan",
          "color": "Silver"
        },
        "Extended warranty - 5 years"
      ],
      "amount": 47601,
      "status": "active",
      "preferred_customer": true
    }
  }
}'

Insert a document with vector embeddings

Use the reserved $vector field to insert a document with pregenerated vector embeddings.

You can later use this field to perform a vector search.

All embeddings in the collection should use the same provider, model, and dimensions. Mismatched embeddings can cause inaccurate vector searches.

The $vector field is only supported for vector-enabled collections. For more information, see Create a collection that can store vector embeddings and $vector in collections.
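Because mismatched embeddings can degrade search results, it may help to check a vector before inserting it. The sketch below is a hypothetical client-side check. The name `check_vector` and the `expected_dimension` argument are assumptions, and the caller must supply the dimension that the collection was created with.

```python
import math
from typing import Sequence


def check_vector(vector: Sequence[float], expected_dimension: int) -> None:
    """Raise ValueError if the vector has the wrong length or non-finite values."""
    if len(vector) != expected_dimension:
        raise ValueError(
            f"vector has {len(vector)} dimensions; expected {expected_dimension}"
        )
    if not all(math.isfinite(value) for value in vector):
        raise ValueError("vector contains NaN or infinite values")


# Passes silently for a well-formed three-dimensional vector.
check_vector([0.08, 0.68, 0.30], expected_dimension=3)
```

This check cannot tell whether two vectors came from the same provider or model. It only catches length and value problems.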

Python
TypeScript
Java
curl

from astrapy import DataAPIClient


# Get an existing collection
client = DataAPIClient("APPLICATION_TOKEN")
database = client.get_database("API_ENDPOINT")
collection = database.get_collection("COLLECTION_NAME")

# Insert a document into the collection
result = collection.insert_one(
    {
        "name": "Jane Doe",
        "$vector": [0.08, 0.68, 0.30],
    },
)

import { DataAPIClient } from "@datastax/astra-db-ts";

// Get an existing collection
const client = new DataAPIClient("APPLICATION_TOKEN");
const database = client.db("API_ENDPOINT");
const collection = database.collection("COLLECTION_NAME");

// Insert a document into the collection
(async function () {
  const result = await collection.insertOne({
    name: "Jane Doe",
    $vector: [0.08, 0.68, 0.3],
  });
})();

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.collections.Collection;
import com.datastax.astra.client.collections.commands.results.CollectionInsertOneResult;
import com.datastax.astra.client.collections.definition.documents.Document;
import com.datastax.astra.client.core.vector.DataAPIVector;

public class Example {

  public static void main(String[] args) {
    // Get an existing collection
    Collection<Document> collection =
        new DataAPIClient("APPLICATION_TOKEN")
            .getDatabase("API_ENDPOINT")
            .getCollection("COLLECTION_NAME");

    // Insert a document into the collection
    Document document =
        new Document()
            .append("name", "Jane Doe")
            .append("$vector", new DataAPIVector(new float[] {.08f, .68f, .30f}));
    CollectionInsertOneResult result = collection.insertOne(document);
    System.out.println(result.getInsertedId());
  }
}

You can provide the vector embeddings as an array of floats, or you can use $binary to provide the vector embeddings as a Base64-encoded string. $binary can be more performant.

Array of floats
$binary

curl -sS -L -X POST "API_ENDPOINT/api/json/v1/KEYSPACE_NAME/COLLECTION_NAME" \
  --header "Token: APPLICATION_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "insertOne": {
    "document": {
      "name": "Jane Doe",
      "$vector": [0.12, 0.52, 0.32]
    }
  }
}'

curl -sS -L -X POST "API_ENDPOINT/api/json/v1/KEYSPACE_NAME/COLLECTION_NAME" \
  --header "Token: APPLICATION_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "insertOne": {
    "document": {
      "name": "Jane Doe",
      "$vector": {"$binary": "PfXCjz8FHrg+o9cK"}
    }
  }
}'
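The `$binary` string in the example above is consistent with the three floats packed as big-endian 32-bit values and then Base64-encoded. A minimal sketch under that assumption follows; the function name `encode_vector_as_binary` is hypothetical.

```python
import base64
import struct
from typing import Dict, Sequence


def encode_vector_as_binary(vector: Sequence[float]) -> Dict[str, str]:
    """Pack floats as big-endian float32 and wrap the Base64 text in a $binary object."""
    # ">" selects big-endian byte order; "f" is a 32-bit float.
    packed = struct.pack(f">{len(vector)}f", *vector)
    return {"$binary": base64.b64encode(packed).decode("ascii")}


print(encode_vector_as_binary([0.12, 0.52, 0.32]))
# prints {'$binary': 'PfXCjz8FHrg+o9cK'}
```

The printed value matches the `$binary` string used in the curl example, which corresponds to the vector `[0.12, 0.52, 0.32]`.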

Insert a document and generate vector embeddings

Use the reserved $vectorize field to generate a vector embedding automatically. The value of $vectorize can be any string.

You can later use this field to perform a vector search.

The $vectorize field is only supported for collections that have vectorize enabled. For more information, see Create a collection that can automatically generate vector embeddings and $vectorize in collections.

Python
TypeScript
Java
curl

from astrapy import DataAPIClient


# Get an existing collection
client = DataAPIClient("APPLICATION_TOKEN")
database = client.get_database("API_ENDPOINT")
collection = database.get_collection("COLLECTION_NAME")

# Insert a document into the collection
result = collection.insert_one(
    {
        "name": "Jane Doe",
        "$vectorize": "Text to vectorize",
    },
)

import { DataAPIClient } from "@datastax/astra-db-ts";

// Get an existing collection
const client = new DataAPIClient("APPLICATION_TOKEN");
const database = client.db("API_ENDPOINT");
const collection = database.collection("COLLECTION_NAME");

// Insert a document into the collection
(async function () {
  const result = await collection.insertOne({
    name: "Jane Doe",
    $vectorize: "Text to vectorize",
  });
})();

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.collections.Collection;
import com.datastax.astra.client.collections.commands.results.CollectionInsertOneResult;
import com.datastax.astra.client.collections.definition.documents.Document;

public class Example {

  public static void main(String[] args) {
    // Get an existing collection
    Collection<Document> collection =
        new DataAPIClient("APPLICATION_TOKEN")
            .getDatabase("API_ENDPOINT")
            .getCollection("COLLECTION_NAME");

    // Insert a document into the collection
    Document document =
        new Document().append("name", "Jane Doe").append("$vectorize", "Text to vectorize");
    CollectionInsertOneResult result = collection.insertOne(document);
    System.out.println(result.getInsertedId());
  }
}

curl -sS -L -X POST "API_ENDPOINT/api/json/v1/KEYSPACE_NAME/COLLECTION_NAME" \
  --header "Token: APPLICATION_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "insertOne": {
    "document": {
      "name": "Jane Doe",
      "$vectorize": "Text to vectorize"
    }
  }
}'

Insert a document for retrieval with hybrid search

Hybrid search, lexical search, and reranking are currently in public preview. Development is ongoing, and the features and functionality are subject to change. Astra DB Serverless, and the use of such, is subject to the DataStax Preview Terms.

If you plan to use hybrid search to find this document, the document must have both the $lexical field and the $vector field populated.
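The requirement above can be expressed as a small check to run before inserting. This is a minimal sketch, not part of any client library: `is_hybrid_ready` is a hypothetical helper name, and it only inspects the reserved field names on a plain dictionary without contacting the database. It treats $vectorize as satisfying the vector requirement, on the assumption that the collection generates the $vector value from it.

```python
from typing import Any, Mapping


def is_hybrid_ready(document: Mapping[str, Any]) -> bool:
    """Return True if the document populates the fields hybrid search needs.

    Hypothetical helper for illustration only; it does not call the Data API.
    """
    # $hybrid is shorthand that populates both $vectorize and $lexical.
    if document.get("$hybrid"):
        return True

    has_lexical = bool(document.get("$lexical"))
    # $vector holds a pregenerated embedding; $vectorize asks the
    # collection to generate one from the given text (assumption: the
    # collection has vectorize enabled).
    has_vector = bool(document.get("$vector")) or bool(document.get("$vectorize"))
    return has_lexical and has_vector


candidates = [
    {"name": "Jane Doe", "$vector": [0.08, 0.68, 0.30], "$lexical": "An athlete"},
    {"name": "Jane Doe", "$vectorize": "An athlete"},
    {"name": "Jane Doe", "$hybrid": "An athlete"},
]

for candidate in candidates:
    print(candidate["name"], is_hybrid_ready(candidate))
```

A check like this only catches missing fields on the client side. Whether the collection actually has lexical and vector support enabled is still decided by the server.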

Python
TypeScript
Java
curl

Example specifying the $vector and $lexical fields:

from astrapy import DataAPIClient


# Get an existing collection
client = DataAPIClient("APPLICATION_TOKEN")
database = client.get_database("API_ENDPOINT")
collection = database.get_collection("COLLECTION_NAME")

# Insert a document
collection.insert_one(
    {
        "name": "Jane Doe",
        "$vector": [0.08, 0.68, 0.30],
        "$lexical": "An athlete who loves biking, hiking, running, and swimming in the outdoors",
    },
)

Example specifying the $vectorize and $lexical fields:

from astrapy import DataAPIClient


# Get an existing collection
client = DataAPIClient("APPLICATION_TOKEN")
database = client.get_database("API_ENDPOINT")
collection = database.get_collection("COLLECTION_NAME")

# Insert a document
collection.insert_one(
    {
        "name": "Jane Doe",
        "$vectorize": "An athlete who loves biking, hiking, running, and swimming in the outdoors",
        "$lexical": "She shares her love of triathlons by coaching kids after school.",
    },
)

Example using the $hybrid shorthand, which populates the $lexical and $vectorize fields:

from astrapy import DataAPIClient


# Get an existing collection
client = DataAPIClient("APPLICATION_TOKEN")
database = client.get_database("API_ENDPOINT")
collection = database.get_collection("COLLECTION_NAME")

# Insert a document
collection.insert_one(
    {
        "name": "Jane Doe",
        "$hybrid": "An athlete who loves biking, hiking, running, and swimming in the outdoors",
    },
)

Example specifying the $vector and $lexical fields:

import { DataAPIClient } from "@datastax/astra-db-ts";

// Get an existing collection
const client = new DataAPIClient("APPLICATION_TOKEN");
const database = client.db("API_ENDPOINT");
const collection = database.collection("COLLECTION_NAME");

// Insert a document into the collection
(async function () {
  const result = await collection.insertOne({
    name: "Jane Doe",
    $vector: [0.08, 0.68, 0.3],
    $lexical:
      "An athlete who loves biking, hiking, running, and swimming in the outdoors",
  });
})();

Example specifying the $vectorize and $lexical fields:

import { DataAPIClient } from "@datastax/astra-db-ts";

// Get an existing collection
const client = new DataAPIClient("APPLICATION_TOKEN");
const database = client.db("API_ENDPOINT");
const collection = database.collection("COLLECTION_NAME");

// Insert a document into the collection
(async function () {
  const result = await collection.insertOne({
    name: "Jane Doe",
    $vectorize:
      "An athlete who loves biking, hiking, running, and swimming in the outdoors",
    $lexical:
      "She shares her love of triathlons by coaching kids after school.",
  });
})();

Example using the $hybrid shorthand, which populates the $lexical and $vectorize fields:

import { DataAPIClient } from "@datastax/astra-db-ts";

// Get an existing collection
const client = new DataAPIClient("APPLICATION_TOKEN");
const database = client.db("API_ENDPOINT");
const collection = database.collection("COLLECTION_NAME");

// Insert a document into the collection
(async function () {
  const result = await collection.insertOne({
    name: "Jane Doe",
    $hybrid:
      "An athlete who loves biking, hiking, running, and swimming in the outdoors",
  });
})();

Example specifying the $vector and $lexical fields:

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.collections.Collection;
import com.datastax.astra.client.collections.definition.documents.Document;
import com.datastax.astra.client.core.vector.DataAPIVector;

public class Example {

  public static void main(String[] args) {
    // Get an existing collection
    Collection<Document> collection =
        new DataAPIClient("APPLICATION_TOKEN")
            .getDatabase("API_ENDPOINT")
            .getCollection("COLLECTION_NAME");

    Document document =
        new Document()
            .append("name", "John Doe")
            .append("$vector", new DataAPIVector(new float[] {0.45f, 0.32f, 0.41f}))
            .append(
                "$lexical",
                "An athlete who loves biking, hiking, running, and swimming in the outdoors.");

    collection.insertOne(document);
  }
}

Example specifying the $vectorize and $lexical fields:

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.collections.Collection;
import com.datastax.astra.client.collections.definition.documents.Document;

public class Example {

  public static void main(String[] args) {
    // Get an existing collection
    Collection<Document> collection =
        new DataAPIClient("APPLICATION_TOKEN")
            .getDatabase("API_ENDPOINT")
            .getCollection("COLLECTION_NAME");

    Document document =
        new Document()
            .append("name", "Mary Day")
            .append(
                "$vectorize",
                "An athlete who loves biking, hiking, running, and swimming in the outdoors")
            .append("$lexical", "She shares her love of triathlons by coaching kids after school.");

    collection.insertOne(document);
  }
}

Example using the $hybrid shorthand, which populates the $lexical and $vectorize fields:

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.collections.Collection;
import com.datastax.astra.client.collections.definition.documents.Document;

public class Example {

  public static void main(String[] args) {
    // Get an existing collection
    Collection<Document> collection =
        new DataAPIClient("APPLICATION_TOKEN")
            .getDatabase("API_ENDPOINT")
            .getCollection("COLLECTION_NAME");

    Document document =
        new Document()
            .append("name", "Bobby")
            .append(
                "$hybrid",
                "An athlete who loves biking, hiking, running, and swimming in the outdoors");

    collection.insertOne(document);
  }
}

Example specifying the $vector and $lexical fields:

curl -sS -L -X POST "API_ENDPOINT/api/json/v1/KEYSPACE_NAME/COLLECTION_NAME" \
  --header "Token: APPLICATION_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "insertOne": {
    "document": {
      "name": "Jane Doe",
      "$vector": [0.08, 0.68, 0.30],
      "$lexical": "She shares her love of triathlons by coaching kids after school."
    }
  }
}'

Example specifying the $vectorize and $lexical fields:

curl -sS -L -X POST "API_ENDPOINT/api/json/v1/KEYSPACE_NAME/COLLECTION_NAME" \
  --header "Token: APPLICATION_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "insertOne": {
    "document": {
      "name": "Jane Doe",
      "$vectorize": "An athlete who loves biking, hiking, running, and swimming in the outdoors",
      "$lexical": "She shares her love of triathlons by coaching kids after school."
    }
  }
}'

Example using the $hybrid shorthand, which populates the $lexical and $vectorize fields:

curl -sS -L -X POST "API_ENDPOINT/api/json/v1/KEYSPACE_NAME/COLLECTION_NAME" \
  --header "Token: APPLICATION_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "insertOne": {
    "document": {
      "name": "Jane Doe",
      "$hybrid": "An athlete who loves biking, hiking, running, and swimming in the outdoors"
    }
  }
}'
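When a request body is written by hand, it is easy to produce JSON that a strict parser rejects. As a hedged sketch, the body can instead be built with Python's standard json module. `build_insert_one_body` is a hypothetical helper name. The command shape mirrors the curl examples in this section, and nothing here sends a request.

```python
import json


def build_insert_one_body(document: dict) -> str:
    """Serialize an insertOne command as a JSON string.

    Hypothetical helper; the shape follows the curl examples.
    """
    return json.dumps({"insertOne": {"document": document}})


body = build_insert_one_body(
    {
        "name": "Jane Doe",
        "$vector": [0.08, 0.68, 0.30],
        "$lexical": "She shares her love of triathlons by coaching kids after school.",
    }
)

# Parsing the result back confirms that the body is valid JSON.
parsed = json.loads(body)

# Strict JSON requires a digit before the decimal point, so a
# hand-written value such as .08 is rejected by Python's parser.
try:
    json.loads('{"$vector": [.08, .68, .30]}')
    leading_dot_accepted = True
except json.JSONDecodeError:
    leading_dot_accepted = False

print(body)
print("leading-dot numbers accepted:", leading_dot_accepted)
```

The resulting string can be passed to curl with --data or sent by any HTTP client.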

Insert a document and specify the ID

Python
TypeScript
Java
curl

from astrapy import DataAPIClient


# Get an existing collection
client = DataAPIClient("APPLICATION_TOKEN")
database = client.get_database("API_ENDPOINT")
collection = database.get_collection("COLLECTION_NAME")

# Insert a document into the collection
result = collection.insert_one(
    {
        "_id": 1,
        "name": "Jane Doe",
    },
)

The TypeScript client provides the UUID and ObjectId classes to use and generate identifiers. These are not the same as those exported from the uuid or bson libraries.

To generate new identifiers, you can use UUID.v1(), UUID.v4(), UUID.v6(), UUID.v7(), or new ObjectId(). You can also construct a UUID from its string representation, use the uuid and oid shorthand methods, or specify a value directly.

All UUID methods return an instance of the same class, which exposes a version property.

Example using UUID.v7():

import { DataAPIClient, UUID } from "@datastax/astra-db-ts";

// Get an existing collection
const client = new DataAPIClient("APPLICATION_TOKEN");
const database = client.db("API_ENDPOINT");
const collection = database.collection("COLLECTION_NAME");

// Insert a document into the collection
(async function () {
  const result = await collection.insertOne({
    _id: UUID.v7(),
    name: "Jane Doe",
  });
})();

Example using new ObjectId() with input:

import { DataAPIClient, ObjectId } from "@datastax/astra-db-ts";

// Get an existing collection
const client = new DataAPIClient("APPLICATION_TOKEN");
const database = client.db("API_ENDPOINT");
const collection = database.collection("COLLECTION_NAME");

// Insert a document into the collection
(async function () {
  const result = await collection.insertOne({
    _id: new ObjectId("65fd9b52d7fabba03349d013"),
    name: "Jane Doe",
  });
})();

Example specifying the ID as a string:

import { DataAPIClient } from "@datastax/astra-db-ts";

// Get an existing collection
const client = new DataAPIClient("APPLICATION_TOKEN");
const database = client.db("API_ENDPOINT");
const collection = database.collection("COLLECTION_NAME");

// Insert a document into the collection
(async function () {
  const result = await collection.insertOne({
    _id: "1",
    name: "Jane Doe",
  });
})();

The Java client defines dedicated UUIDv6, UUIDv7, and ObjectId classes. UUIDs from the Java UUID class follow the UUID v4 standard. The ObjectId class comes from the BSON package.

When a unique identifier is retrieved from the server, it is converted to the appropriate class, based on the class definition in the defaultId option for the collection.

To generate new identifiers, you can use methods like new UUIDv6(), new UUIDv7(), or new ObjectId().

Example using new UUIDv7(), if the collection has UUIDv7 set as its default ID:

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.collections.Collection;
import com.datastax.astra.client.collections.commands.results.CollectionInsertOneResult;
import com.datastax.astra.client.collections.definition.documents.Document;
import com.datastax.astra.client.collections.definition.documents.types.UUIDv7;

public class Example {

  public static void main(String[] args) {
    // Get an existing collection
    Collection<Document> collection =
        new DataAPIClient("APPLICATION_TOKEN")
            .getDatabase("API_ENDPOINT")
            .getCollection("COLLECTION_NAME");

    // Insert a document into the collection
    Document document =
        new Document().id(new UUIDv7()).append("name", "Jane Doe").append("age", 42);
    CollectionInsertOneResult result = collection.insertOne(document);
    System.out.println(result.getInsertedId());
  }
}

Example specifying a string:

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.collections.Collection;
import com.datastax.astra.client.collections.commands.results.CollectionInsertOneResult;
import com.datastax.astra.client.collections.definition.documents.Document;

public class Example {

  public static void main(String[] args) {
    // Get an existing collection
    Collection<Document> collection =
        new DataAPIClient("APPLICATION_TOKEN")
            .getDatabase("API_ENDPOINT")
            .getCollection("COLLECTION_NAME");

    // Insert a document into the collection
    Document document = new Document().id("1").append("name", "Jane Doe").append("age", 42);
    CollectionInsertOneResult result = collection.insertOne(document);
    System.out.println(result.getInsertedId());
  }
}

You can specify the _id field directly, or you can use the objectId, uuid, uuidv6, or uuidv7 types.

Example specifying a string:

curl -sS -L -X POST "API_ENDPOINT/api/json/v1/KEYSPACE_NAME/COLLECTION_NAME" \
  --header "Token: APPLICATION_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "insertOne": {
    "document": {
      "name": "Jane Doe",
      "_id": "1"
    }
  }
}'

Example using the objectId type:

curl -sS -L -X POST "API_ENDPOINT/api/json/v1/KEYSPACE_NAME/COLLECTION_NAME" \
  --header "Token: APPLICATION_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "insertOne": {
    "document": {
      "name": "Jane Doe",
      "_id": { "$objectId": "6672e1cbd7fabb4e5493916f" }
    }
  }
}'
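The typed _id values shown above can also be produced programmatically. The following is a sketch using only the Python standard library. `wrap_object_id` and `wrap_uuid` are hypothetical helper names, and the "$uuid" wrapper key is an assumption made by analogy with the "$objectId" wrapper in the example above.

```python
import json
import re
import uuid


def wrap_object_id(hex_string: str) -> dict:
    """Wrap a 24-character hexadecimal string as a typed ObjectId value."""
    # An ObjectId is 12 bytes, written as 24 hexadecimal characters.
    if not re.fullmatch(r"[0-9a-fA-F]{24}", hex_string):
        raise ValueError("expected 24 hexadecimal characters")
    return {"$objectId": hex_string}


def wrap_uuid(value: uuid.UUID) -> dict:
    """Wrap a UUID as a typed value.

    Assumption: the wrapper key is "$uuid", by analogy with "$objectId".
    """
    return {"$uuid": str(value)}


# Build request bodies with the same shape as the curl examples.
object_id_body = json.dumps(
    {
        "insertOne": {
            "document": {
                "name": "Jane Doe",
                "_id": wrap_object_id("6672e1cbd7fabb4e5493916f"),
            }
        }
    }
)

uuid_body = json.dumps(
    {
        "insertOne": {
            "document": {
                "name": "Jane Doe",
                "_id": wrap_uuid(uuid.uuid4()),
            }
        }
    }
)

print(object_id_body)
print(uuid_body)
```

Validating the ObjectId string locally surfaces a malformed ID before the request is sent.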

Client reference

Python
TypeScript
Java
curl

For more information, see the client reference.

Client reference documentation is not applicable for HTTP.
