Insert documents
Inserts multiple documents into a collection.
Documents are stored in collections. Each document represents a single row or record of data in Hyper-Converged Database (HCD) databases. For more information, see About collections with the Data API.
If the collection is vector-enabled, pregenerated vector embeddings can be included by using the reserved $vector field for each document.
You can later use the $vector field to perform a vector search.
Ready to write code? See the examples for this method to get started. If you are new to the Data API, check out the quickstart.
Result
-
Python
-
TypeScript
-
Java
-
curl
Inserts the specified documents and returns a CollectionInsertManyResult object that includes the IDs of the inserted documents and details about the operation.
The ID value depends on the ID type. For more information, see Document IDs.
Example response:
CollectionInsertManyResult(inserted_ids=[
"3f557bef-fd53-47ea-957b-effd53c7eaec",
101,
"132ffr343"
], raw_results=...)
Inserts the specified documents and returns a promise that resolves to a CollectionInsertManyResult<Schema> object that includes the IDs of the inserted documents and the number of inserted documents.
The ID value depends on the ID type. For more information, see Document IDs.
Example response:
{
insertedCount: 3,
insertedIds: [
'92b3c4f4-db44-4440-b4c4-f4db54e440b8',
101,
'132ffr343',
]
}
Inserts the specified documents and returns a wrapper (CollectionInsertManyResult) that includes the IDs of the inserted documents.
The ID value depends on the ID type. For more information, see Document IDs.
Inserts the specified documents and returns a JSON object that includes the IDs of the inserted documents.
The ID value depends on the ID type. For more information, see Document IDs.
Example response:
{
"status": {
"insertedIds": [
"3f557bef-fd53-47ea-957b-effd53c7eaec",
101,
"132ffr343"
]
}
}
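As a minimal sketch of consuming this response, the following Python snippet parses the example body shown above and reads the inserted IDs. It works on a hard-coded string rather than a live HTTP response, and the variable names are illustrative only.

```python
import json

# Example response body copied from above (not fetched from a live server).
response_body = """
{
  "status": {
    "insertedIds": [
      "3f557bef-fd53-47ea-957b-effd53c7eaec",
      101,
      "132ffr343"
    ]
  }
}
"""

payload = json.loads(response_body)

# Fall back to an empty list if the expected keys are absent.
inserted_ids = payload.get("status", {}).get("insertedIds", [])

for inserted_id in inserted_ids:
    # IDs can be strings or numbers, depending on the ID type.
    print(type(inserted_id).__name__, inserted_id)
```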
Parameters
-
Python
-
TypeScript
-
Java
-
curl
Use the insert_many method, which belongs to the astrapy.Collection class.
Method signature
insert_many(
documents: Iterable[Dict[str, Any]],
*,
ordered: bool,
chunk_size: int,
concurrency: int,
general_method_timeout_ms: int,
request_timeout_ms: int,
timeout_ms: int,
) -> CollectionInsertManyResult
| Name | Type | Summary |
|---|---|---|
| documents | Iterable[Dict[str, Any]] | An iterable of dictionaries, with each dictionary describing a document to insert. A document can contain user-defined and reserved fields. User-defined field names can be any sequence of Unicode characters, with some exceptions. Reserved fields are tied to specific functionality. Include reserved fields in your documents, if applicable. |
| ordered | bool | Optional. Whether the insertions must be processed sequentially. |
| chunk_size | int | Optional. The number of documents to include in a single API request. DataStax recommends leaving this parameter unspecified to use the system default. |
| concurrency | int | Optional. The maximum number of concurrent requests to the API at a given time. |
| general_method_timeout_ms | int | Optional. The maximum time, in milliseconds, that the whole operation, which might involve multiple HTTP requests, can take. Default: The default value for the collection. This default is 30 seconds unless you specified a different default at initialization. This parameter is aliased as timeout_ms. |
| request_timeout_ms | int | Optional. The maximum time, in milliseconds, that the client should wait for each underlying HTTP request. Default: The default value for the collection. This default is 10 seconds unless you specified a different default at initialization. |
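To make the relationship between chunk_size and the number of API requests concrete, here is a rough sketch that splits a list of documents into chunks. The split_into_chunks helper is hypothetical and written only for illustration. It is not how the client batches requests internally.

```python
from typing import Any, Dict, Iterable, Iterator, List


def split_into_chunks(
    documents: Iterable[Dict[str, Any]], chunk_size: int
) -> Iterator[List[Dict[str, Any]]]:
    """Yield consecutive lists of at most chunk_size documents (hypothetical helper)."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    chunk: List[Dict[str, Any]] = []
    for document in documents:
        chunk.append(document)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:
        # The last chunk can be smaller than chunk_size.
        yield chunk


documents = [{"n": i} for i in range(5)]
chunks = list(split_into_chunks(documents, chunk_size=2))

# Each chunk would correspond to one API request.
print([len(chunk) for chunk in chunks])
```

With five documents and a chunk size of two, the sketch produces three chunks, so an insertion of that shape would need three requests.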
Use the insertMany method, which belongs to the Collection class.
Method signature
async insertMany(
documents: MaybeId<Schema>[],
options?: {
ordered?: boolean,
concurrency?: number,
chunkSize?: number,
timeout?: number | TimeoutDescriptor,
},
): Promise<CollectionInsertManyResult<Schema>>
| Name | Type | Summary |
|---|---|---|
| documents | MaybeId<Schema>[] | An array of documents to insert. A document can contain user-defined and reserved fields. User-defined field names can be any sequence of Unicode characters, with some exceptions. Reserved fields are tied to specific functionality. Include reserved fields in your documents, if applicable. |
| options | | Optional. The options for this operation. See the properties of options in the following table. |

Properties of options:

| Name | Type | Summary |
|---|---|---|
| ordered | boolean | Optional. Whether the insertions must be processed sequentially. |
| concurrency | number | Optional. The maximum number of concurrent requests to the API at a given time. |
| chunkSize | number | Optional. The number of documents to include in a single API request. DataStax recommends leaving this parameter unspecified to use the system default. |
| timeout | number \| TimeoutDescriptor | Optional. The timeout(s) to apply to this method. |
Use the insertMany method, which belongs to the com.datastax.astra.client.Collection class.
Method signature
CollectionInsertManyResult insertMany(
List<? extends T> documents
)
CollectionInsertManyResult insertMany(
List<? extends T> documents,
CollectionInsertManyOptions options
)
| Name | Type | Summary |
|---|---|---|
| documents | List<? extends T> | A list of objects describing the documents to insert. A document can contain user-defined and reserved fields. User-defined field names can be any sequence of Unicode characters, with some exceptions. Reserved fields are tied to specific functionality. Include reserved fields in your documents, if applicable. |
| options | CollectionInsertManyOptions | Optional. The options for this operation. See the methods of the CollectionInsertManyOptions class in the following table. |

Methods of the CollectionInsertManyOptions class:

| Name | Type | Summary |
|---|---|---|
| ordered | boolean | Optional. Whether the insertions must be processed sequentially. |
| concurrency | int | Optional. The maximum number of concurrent requests to the API at a given time. |
| chunkSize | int | Optional. The number of documents to include in a single API request. DataStax recommends leaving this parameter unspecified to use the system default. |
| timeout | | Optional. The maximum time, in milliseconds, that the client should wait for each underlying HTTP request. Default: The default value for the collection. This default is 30 seconds unless you specified a different default at initialization. |
Use the insertMany command.
Command signature
curl -sS -L -X POST "API_ENDPOINT/v1/KEYSPACE_NAME/COLLECTION_NAME" \
--header "Token: APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"insertMany": {
"documents": DOCUMENTS_JSON_ARRAY,
"options": {
"ordered": BOOLEAN
}
}
}'
| Name | Type | Summary |
|---|---|---|
| documents | array | An array of JSON objects describing the documents to insert. A document can contain user-defined and reserved fields. User-defined field names can be any sequence of Unicode characters, with some exceptions. Reserved fields are tied to specific functionality. Include reserved fields in your documents, if applicable. |
| options | object | Optional. The options for this operation. See the properties of options in the following table. |

Properties of options:

| Name | Type | Summary |
|---|---|---|
| ordered | boolean | Optional. Whether the insertions must be processed sequentially. |
Examples
The following examples demonstrate how to insert multiple documents into a collection.
Insert documents
The documents can have different structures.
-
Python
-
TypeScript
-
Java
-
curl
from astrapy import DataAPIClient
from astrapy.authentication import UsernamePasswordTokenProvider
from astrapy.constants import Environment
# Get an existing collection
client = DataAPIClient(environment=Environment.HCD)
database = client.get_database(
"API_ENDPOINT",
token=UsernamePasswordTokenProvider("USERNAME", "PASSWORD"),
)
collection = database.get_collection(
"COLLECTION_NAME", keyspace="KEYSPACE_NAME"
)
# Insert documents into the collection
result = collection.insert_many(
[
{
"name": "Jane Doe",
"age": 42,
},
{
"nickname": "Bobby",
"color": "blue",
"foods": ["carrots", "chocolate"],
},
]
)
import {
DataAPIClient,
CollectionInsertManyError,
UsernamePasswordTokenProvider,
} from "@datastax/astra-db-ts";
// Get an existing collection
const client = new DataAPIClient({ environment: "hcd" });
const database = client.db("API_ENDPOINT", {
token: new UsernamePasswordTokenProvider("USERNAME", "PASSWORD"),
});
const collection = database.collection("COLLECTION_NAME", {
keyspace: "KEYSPACE_NAME",
});
// Insert documents into the collection
(async function () {
try {
const result = await collection.insertMany([
{
name: "Jane Doe",
age: 42,
},
{
nickname: "Bobby",
color: "blue",
foods: ["carrots", "chocolate"],
},
]);
} catch (error) {
if (error instanceof CollectionInsertManyError) {
console.log(error.insertedIds());
}
}
})();
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.DataAPIClients;
import com.datastax.astra.client.collections.Collection;
import com.datastax.astra.client.collections.commands.results.CollectionInsertManyResult;
import com.datastax.astra.client.collections.definition.documents.Document;
import com.datastax.astra.client.databases.Database;
import java.util.Arrays;
import java.util.List;
public class Example {
public static void main(String[] args) {
// Get an existing collection
DataAPIClient client = DataAPIClients.clientHCD("USERNAME", "PASSWORD");
Database database = client.getDatabase("API_ENDPOINT", "KEYSPACE_NAME");
Collection<Document> collection = database.getCollection("COLLECTION_NAME");
// Insert documents into the collection
Document document1 = new Document().append("name", "Jane Doe").append("age", 42);
Document document2 =
new Document()
.append("nickname", "Bobby")
.append("color", "blue")
.append("foods", Arrays.asList("carrots", "chocolate"));
CollectionInsertManyResult result = collection.insertMany(List.of(document1, document2));
System.out.println("IDs inserted: " + result.getInsertedIds());
}
}
curl -sS -L -X POST "API_ENDPOINT/v1/KEYSPACE_NAME/COLLECTION_NAME" \
--header "Token: APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"insertMany": {
"documents": [
{
"name": "Jane Doe",
"age": 42
},
{
"nickname": "Bobby",
"color": "blue",
"foods": ["carrots", "chocolate"]
}
]
}
}'
Insert documents with vector embeddings
Use the reserved $vector field to insert documents with pregenerated vector embeddings.
All embeddings in the collection should use the same provider, model, and dimensions. Mismatched embeddings can cause inaccurate vector searches.
The $vector field is only supported for vector-enabled collections.
For more information, see Create a collection that can store vector embeddings and $vector in collections.
You may also insert a mix of documents with and without the $vector field.
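Because mismatched dimensions can cause inaccurate vector searches, you might want to check a batch before you insert it. The following is a minimal sketch of such a check, assuming that every $vector value is a plain list of floats. The check_vector_dimensions helper is hypothetical and is not part of any client. Documents without a $vector field are skipped.

```python
from typing import Any, Dict, List, Optional


def check_vector_dimensions(documents: List[Dict[str, Any]]) -> Optional[int]:
    """Return the dimension shared by all $vector fields (hypothetical helper).

    Returns None if no document has a $vector field.
    Raises ValueError if two documents have vectors of different lengths.
    """
    dimension: Optional[int] = None
    for index, document in enumerate(documents):
        vector = document.get("$vector")
        if vector is None:
            # Documents without embeddings are allowed in the same batch.
            continue
        if dimension is None:
            dimension = len(vector)
        elif len(vector) != dimension:
            raise ValueError(
                f"Document {index} has {len(vector)} dimensions, expected {dimension}"
            )
    return dimension


documents = [
    {"name": "Jane Doe", "age": 42, "$vector": [0.12, -0.46, 0.35, 0.52, -0.32]},
    {"nickname": "Bobby"},
    {"nickname": "Sam", "$vector": [-0.78, -0.59, 0.42, 0.96, -0.14]},
]
print(check_vector_dimensions(documents))
```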
-
Python
-
TypeScript
-
Java
-
curl
from astrapy import DataAPIClient
from astrapy.authentication import UsernamePasswordTokenProvider
from astrapy.constants import Environment
from astrapy.data_types import DataAPIVector
# Get an existing collection
client = DataAPIClient(environment=Environment.HCD)
database = client.get_database(
"API_ENDPOINT",
token=UsernamePasswordTokenProvider("USERNAME", "PASSWORD"),
)
collection = database.get_collection(
"COLLECTION_NAME", keyspace="KEYSPACE_NAME"
)
# Insert documents into the collection
# The following also demonstrates use of both plain lists and DataAPIVector
result = collection.insert_many(
[
{"name": "Jane Doe", "age": 42, "$vector": [0.12, -0.46, 0.35, 0.52, -0.32]},
{
"nickname": "Bobby",
"$vector": DataAPIVector([0.12, -0.46, 0.35, 0.52, -0.32]),
},
]
)
import {
DataAPIClient,
CollectionInsertManyError,
UsernamePasswordTokenProvider,
} from "@datastax/astra-db-ts";
// Get an existing collection
const client = new DataAPIClient({ environment: "hcd" });
const database = client.db("API_ENDPOINT", {
token: new UsernamePasswordTokenProvider("USERNAME", "PASSWORD"),
});
const collection = database.collection("COLLECTION_NAME", {
keyspace: "KEYSPACE_NAME",
});
// Insert documents into the collection
(async function () {
try {
const result = await collection.insertMany([
{
name: "Jane Doe",
age: 42,
$vector: [0.12, -0.46, 0.35, 0.52, -0.32],
},
{
nickname: "Bobby",
$vector: [0.12, -0.46, 0.35, 0.52, -0.32],
},
]);
} catch (error) {
if (error instanceof CollectionInsertManyError) {
console.log(error.insertedIds());
}
}
})();
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.DataAPIClients;
import com.datastax.astra.client.collections.Collection;
import com.datastax.astra.client.collections.commands.results.CollectionInsertManyResult;
import com.datastax.astra.client.collections.definition.documents.Document;
import com.datastax.astra.client.core.vector.DataAPIVector;
import com.datastax.astra.client.databases.Database;
import java.util.List;
public class Example {
public static void main(String[] args) {
// Get an existing collection
DataAPIClient client = DataAPIClients.clientHCD("USERNAME", "PASSWORD");
Database database = client.getDatabase("API_ENDPOINT", "KEYSPACE_NAME");
Collection<Document> collection = database.getCollection("COLLECTION_NAME");
// Insert documents into the collection
Document document1 =
new Document()
.append("name", "Jane Doe")
.append("age", 42)
.append(
"$vector", new DataAPIVector(new float[] {0.12f, -0.46f, 0.35f, 0.52f, -0.32f}));
Document document2 =
new Document()
.append("nickname", "Bobby")
.append(
"$vector", new DataAPIVector(new float[] {-0.78f, -0.59f, 0.42f, 0.96f, -0.14f}));
CollectionInsertManyResult result = collection.insertMany(List.of(document1, document2));
System.out.println("IDs inserted: " + result.getInsertedIds());
}
}
You can provide the vector embeddings as an array of floats, or you can use $binary to provide the vector embeddings as a Base64-encoded string.
$binary can be more performant.
Vector binary encodings specification
A d-dimensional vector is a list of d floating-point numbers that can be binary encoded.
To prepare for encoding, the list must be transformed into a sequence of bytes where each float is represented as four bytes in big-endian format.
Then, the byte sequence is Base64-encoded, with = padding, if needed.
For example, here are some vectors and their resulting Base64 encoded strings:
[0.1, -0.2, 0.3] = "PczMzb5MzM0+mZma"
[0.1, 0.2] = "PczMzT5MzM0="
[10, 10.5, 100, -91.19] = "QSAAAEEoAABCyAAAwrZhSA=="
Once encoded, you use $binary to pass the Base64 string to the Data API:
{ "$binary": "BASE64_STRING" }
You can use a script to encode your vectors, for example:
import base64
import struct
input_vector = [0.1, -0.2, 0.3]
d = len(input_vector)
pack_format = ">" + "f" * d
binary_encode = base64.b64encode(struct.pack(pack_format, *input_vector)).decode()
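To confirm that an encoded string matches the specification, it can help to decode it again. The following sketch wraps the encoding step in a function and adds the inverse operation. The encode_vector and decode_vector names are illustrative helpers, not part of the Data API or any client.

```python
import base64
import struct
from typing import List


def encode_vector(vector: List[float]) -> str:
    """Pack each float as four big-endian bytes, then Base64-encode the result."""
    packed = struct.pack(">" + "f" * len(vector), *vector)
    return base64.b64encode(packed).decode()


def decode_vector(encoded: str) -> List[float]:
    """Reverse encode_vector: Base64-decode, then unpack big-endian 32-bit floats."""
    raw = base64.b64decode(encoded)
    if len(raw) % 4 != 0:
        raise ValueError("Byte length must be a multiple of 4")
    return list(struct.unpack(">" + "f" * (len(raw) // 4), raw))


encoded = encode_vector([0.1, -0.2, 0.3])
print(encoded)  # PczMzb5MzM0+mZma

# Decoded values are 32-bit floats, so they only approximate the inputs.
decoded = decode_vector(encoded)
print(decoded)
```

The round trip is lossy at the precision of 32-bit floats, so compare decoded values with a tolerance rather than with exact equality.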
-
Array of floats
-
$binary
curl -sS -L -X POST "API_ENDPOINT/v1/KEYSPACE_NAME/COLLECTION_NAME" \
--header "Token: APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"insertMany": {
"documents": [
{
"name": "Jane Doe",
"age": 42,
"$vector": [0.12, -0.46, 0.35, 0.52, -0.32]
},
{
"nickname": "Bobby",
"$vector": [0.12, -0.46, 0.35, 0.52, -0.32]
}
]
}
}'
curl -sS -L -X POST "API_ENDPOINT/v1/KEYSPACE_NAME/COLLECTION_NAME" \
--header "Token: APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"insertMany": {
"documents": [
{
"name": "Jane Doe",
"age": 42,
"$vector": {"$binary": "PfXCjz8FHrg+o9cK"}
},
{
"nickname": "Bobby",
"$vector": {"$binary": "PpmZmj8ZmZo/AAAA"}
}
]
}
}'
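If you build the request body in a script instead of writing it by hand, one possible approach is to convert each list-valued $vector field to the $binary form before serializing. The following sketch assumes plain Python lists as input. The to_binary_vector and build_insert_many_body helpers are hypothetical names used only for this illustration.

```python
import base64
import json
import struct
from typing import Any, Dict, List


def to_binary_vector(vector: List[float]) -> Dict[str, str]:
    """Encode a list of floats as {"$binary": BASE64_STRING} (hypothetical helper)."""
    packed = struct.pack(f">{len(vector)}f", *vector)
    return {"$binary": base64.b64encode(packed).decode("ascii")}


def build_insert_many_body(documents: List[Dict[str, Any]]) -> str:
    """Return a JSON insertMany body with list vectors converted to $binary."""
    prepared = []
    for document in documents:
        prepared_document = dict(document)  # Shallow copy; the input is not modified.
        if isinstance(prepared_document.get("$vector"), list):
            prepared_document["$vector"] = to_binary_vector(prepared_document["$vector"])
        prepared.append(prepared_document)
    return json.dumps({"insertMany": {"documents": prepared}})


body = build_insert_many_body(
    [
        {"name": "Jane Doe", "age": 42, "$vector": [0.12, 0.52, 0.32]},
        {"nickname": "Bobby"},
    ]
)
print(body)
```

The resulting string can be passed as the --data value of the curl command.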
Insert documents and specify the IDs
-
Python
-
TypeScript
-
Java
-
curl
from astrapy import DataAPIClient
from astrapy.authentication import UsernamePasswordTokenProvider
from astrapy.constants import Environment
# Get an existing collection
client = DataAPIClient(environment=Environment.HCD)
database = client.get_database(
"API_ENDPOINT",
token=UsernamePasswordTokenProvider("USERNAME", "PASSWORD"),
)
collection = database.get_collection(
"COLLECTION_NAME", keyspace="KEYSPACE_NAME"
)
# Insert documents into the collection
result = collection.insert_many(
[
{
"name": "Jane Doe",
"_id": 1,
},
{
"nickname": "Bobby",
"_id": "b_023",
},
]
)
The TypeScript client provides the UUID and ObjectId classes to use and generate identifiers.
These are not the same as those exported from the uuid or bson libraries.
To generate new identifiers, you can use UUID.v1(), UUID.v4(), UUID.v6(), UUID.v7(), or new ObjectId(), or you can use the uuid and oid shorthand methods.
These methods accept a string representation of the IDs.
All UUID methods return an instance of the same class, which exposes a version property.
import {
DataAPIClient,
CollectionInsertManyError,
UUID,
ObjectId,
uuid,
oid,
UsernamePasswordTokenProvider,
} from "@datastax/astra-db-ts";
// Get an existing collection
const client = new DataAPIClient({ environment: "hcd" });
const database = client.db("API_ENDPOINT", {
token: new UsernamePasswordTokenProvider("USERNAME", "PASSWORD"),
});
const collection = database.collection("COLLECTION_NAME", {
keyspace: "KEYSPACE_NAME",
});
// Insert documents into the collection
(async function () {
try {
const result = await collection.insertMany([
{
name: "Melissa",
_id: new ObjectId(),
},
{
name: "Jess",
_id: new ObjectId("65fd9b52d7fabba03349d013"),
},
{
name: "Adam",
_id: UUID.v4(),
},
{
name: "Beth",
_id: new UUID("016b1cac-14ce-660e-8974-026c927b9b91"),
},
{
name: "Cathy",
_id: uuid("bb3def0c-2ff2-43e1-b346-6cf0e5e36f10"),
},
{
name: "Debra",
_id: oid("67ea409a5e6499dabe0831bc"),
},
{
name: "Jane",
_id: 1,
},
{
nickname: "Bobby",
_id: "23",
},
]);
} catch (error) {
if (error instanceof CollectionInsertManyError) {
console.log(error.insertedIds());
}
}
})();
The Java client defines dedicated UUIDv6, UUIDv7, and ObjectId classes.
UUIDs from the Java UUID class are implemented according to the UUID v4 standard.
ObjectId classes are extracted from the BSON package.
When a unique identifier is retrieved from the server, it is converted to the appropriate class, based on the class definition in the defaultId option for the collection.
To generate new identifiers, you can use constructors like new UUIDv6(), new UUIDv7(), or new ObjectId().
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.DataAPIClients;
import com.datastax.astra.client.collections.Collection;
import com.datastax.astra.client.collections.commands.results.CollectionInsertManyResult;
import com.datastax.astra.client.collections.definition.documents.Document;
import com.datastax.astra.client.collections.definition.documents.types.ObjectId;
import com.datastax.astra.client.collections.definition.documents.types.UUIDv7;
import com.datastax.astra.client.databases.Database;
import java.util.List;
import java.util.UUID;
public class Example {
public static void main(String[] args) {
// Get an existing collection
DataAPIClient client = DataAPIClients.clientHCD("USERNAME", "PASSWORD");
Database database = client.getDatabase("API_ENDPOINT", "KEYSPACE_NAME");
Collection<Document> collection = database.getCollection("COLLECTION_NAME");
// Insert documents into the collection
Document document1 =
new Document()
.append("_id", new ObjectId("6672e1cbd7fabb4e5493916f"))
.append("name", "Melissa");
Document document2 = new Document().append("_id", new UUIDv7()).append("name", "Jess");
Document document3 = new Document().append("_id", UUID.randomUUID()).append("name", "Sam");
Document document4 = new Document().append("_id", "1").append("name", "Jane");
Document document5 = new Document().append("_id", "23").append("nickname", "Bobby");
CollectionInsertManyResult result =
collection.insertMany(List.of(document1, document2, document3, document4, document5));
System.out.println("IDs inserted: " + result.getInsertedIds());
}
}
You can specify the _id field directly, or you can use the objectId, uuid, uuidv6, or uuidv7 types.
curl -sS -L -X POST "API_ENDPOINT/v1/KEYSPACE_NAME/COLLECTION_NAME" \
--header "Token: APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"insertMany": {
"documents": [
{
"nickname": "Melissa",
"_id": { "$objectId": "6672e1cbd7fabb4e5493916f" }
},
{
"nickname": "Jess",
"_id": { "$uuid": "1ef2e42c-1fdb-6ad6-aae4-e84679831739" }
},
{
"name": "Jane",
"_id": 1
},
{
"nickname": "Bobby",
"_id": "23"
}
]
}
}'
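When you generate the request body in a script, you can wrap a UUID in the $uuid form shown above. The following sketch uses Python's standard uuid module. The with_uuid_id helper is a hypothetical name for illustration, and the sketch generates version 4 UUIDs only.

```python
import json
import uuid
from typing import Any, Dict, Optional


def with_uuid_id(document: Dict[str, Any], value: Optional[uuid.UUID] = None) -> Dict[str, Any]:
    """Return a copy of the document with _id set to {"$uuid": ...} (hypothetical helper).

    If no value is given, a random version 4 UUID is generated.
    """
    identifier = value if value is not None else uuid.uuid4()
    return {**document, "_id": {"$uuid": str(identifier)}}


# Reuse a known UUID, as in the curl example above.
fixed = with_uuid_id(
    {"nickname": "Jess"}, uuid.UUID("1ef2e42c-1fdb-6ad6-aae4-e84679831739")
)

# Let the helper generate a new random UUID.
generated = with_uuid_id({"name": "Jane"})

print(json.dumps({"insertMany": {"documents": [fixed, generated]}}, indent=2))
```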
Insert documents and specify insertion behavior
-
Python
-
TypeScript
-
Java
-
curl
from astrapy import DataAPIClient
from astrapy.authentication import UsernamePasswordTokenProvider
from astrapy.constants import Environment
# Get an existing collection
client = DataAPIClient(environment=Environment.HCD)
database = client.get_database(
"API_ENDPOINT",
token=UsernamePasswordTokenProvider("USERNAME", "PASSWORD"),
)
collection = database.get_collection(
"COLLECTION_NAME", keyspace="KEYSPACE_NAME"
)
# Insert documents into the collection
result = collection.insert_many(
[
{
"name": "Jane Doe",
"age": 42,
},
{
"nickname": "Bobby",
"color": "blue",
"foods": ["carrots", "chocolate"],
},
],
chunk_size=2,
concurrency=2,
ordered=False,
general_method_timeout_ms=1000,
)
import {
DataAPIClient,
CollectionInsertManyError,
UsernamePasswordTokenProvider,
} from "@datastax/astra-db-ts";
// Get an existing collection
const client = new DataAPIClient({ environment: "hcd" });
const database = client.db("API_ENDPOINT", {
token: new UsernamePasswordTokenProvider("USERNAME", "PASSWORD"),
});
const collection = database.collection("COLLECTION_NAME", {
keyspace: "KEYSPACE_NAME",
});
// Insert documents into the collection
(async function () {
try {
const result = await collection.insertMany(
[
{
name: "Jane Doe",
age: 42,
},
{
nickname: "Bobby",
color: "blue",
foods: ["carrots", "chocolate"],
},
],
{
chunkSize: 2,
concurrency: 2,
ordered: false,
},
);
} catch (error) {
if (error instanceof CollectionInsertManyError) {
console.log(error.insertedIds());
}
}
})();
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.DataAPIClients;
import com.datastax.astra.client.collections.Collection;
import com.datastax.astra.client.collections.commands.options.CollectionInsertManyOptions;
import com.datastax.astra.client.collections.commands.results.CollectionInsertManyResult;
import com.datastax.astra.client.collections.definition.documents.Document;
import com.datastax.astra.client.databases.Database;
import java.util.Arrays;
import java.util.List;
public class Example {
public static void main(String[] args) {
// Get an existing collection
DataAPIClient client = DataAPIClients.clientHCD("USERNAME", "PASSWORD");
Database database = client.getDatabase("API_ENDPOINT", "KEYSPACE_NAME");
Collection<Document> collection = database.getCollection("COLLECTION_NAME");
// Define the insertion options
CollectionInsertManyOptions options =
new CollectionInsertManyOptions().chunkSize(20).concurrency(3).ordered(false).timeout(1000);
// Insert documents into the collection
Document document1 = new Document().append("name", "Jane Doe").append("age", 42);
Document document2 =
new Document()
.append("nickname", "Bobby")
.append("color", "blue")
.append("foods", Arrays.asList("carrots", "chocolate"));
CollectionInsertManyResult result =
collection.insertMany(List.of(document1, document2), options);
System.out.println("IDs inserted: " + result.getInsertedIds());
}
}
curl -sS -L -X POST "API_ENDPOINT/v1/KEYSPACE_NAME/COLLECTION_NAME" \
--header "Token: APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"insertMany": {
"documents": [
{
"name": "Jane Doe",
"age": 42
},
{
"nickname": "Bobby",
"color": "blue",
"foods": ["carrots", "chocolate"]
}
],
"options": {
"ordered": false
}
}
}'
Client reference
-
Python
-
TypeScript
-
Java
-
curl
For more information, see the client reference.
For more information, see the client reference.
For more information, see the client reference.
Client reference documentation is not applicable for HTTP.