Documents reference
Documents represent a single row or record of data in Astra DB Serverless databases.
You use the Collection class to work with documents through the Data API.
For instructions to get a Collection object, see the Collections reference.
Prerequisites
- Review the prerequisites and other information in the API reference overview.
- If you haven’t done so already, create a Serverless (Vector) database.
- If you use a Data API client, instantiate a DataAPIClient object and connect to your database.
Insert a single document
Insert a single document into a collection.
When you create a collection, you decide if the collection can store structured vector data. For vector-enabled collections, you also decide how to provide embeddings. You can either configure the collection to automatically generate embeddings with vectorize or provide embeddings when you load data (also known as bring your own embeddings). You must decide this when you create the collection.
When working with documents in the Astra Portal or Data API, there are two reserved fields for vector data:
- The $vector parameter is a reserved field that stores vector arrays.
  - If the collection requires that you bring your own embeddings, you can include this parameter when you load data.
  - If the collection uses vectorize, you don’t include $vector when you load data. Instead, Astra DB populates the $vector field with the automatically generated embeddings.
  Regardless of the embedding generation method, when you find, update, replace, or delete documents, you can use $vector to fetch documents by vector search. You can also use projections to include $vector in responses.
- The $vectorize parameter is a reserved field that generates embeddings automatically based on a given text string.
  - If the collection requires that you bring your own embeddings, you cannot use this parameter.
  - If the collection uses vectorize, you must include this parameter when you load data. The value of $vectorize is the text string from which you want to generate a document’s embedding. Astra DB stores the resulting vector array in $vector.
  When you find, update, replace, or delete documents in a collection that uses vectorize, you can use $vectorize to fetch documents by vector search with vectorize. You can also use projections to include $vectorize in responses.
If you load a document that doesn’t need an embedding, then you can omit $vector and $vectorize.
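The reserved-field rules above can be sketched as a small pre-insert check. This is only an illustration of the rules as stated here, not part of any client library: the helper name check_vector_fields and the uses_vectorize flag are hypothetical, and the Data API applies its own validation.

```python
from typing import Any, Dict


def check_vector_fields(document: Dict[str, Any], uses_vectorize: bool) -> None:
    """Raise ValueError if the document breaks the reserved-field rules above.

    Hypothetical helper for illustration; the server applies its own validation.
    """
    has_vector = "$vector" in document
    has_vectorize = "$vectorize" in document

    if uses_vectorize and has_vector:
        # Vectorize collections: Astra DB populates $vector itself.
        raise ValueError("omit $vector when the collection uses vectorize")
    if not uses_vectorize and has_vectorize:
        # Bring-your-own-embeddings collections cannot use $vectorize.
        raise ValueError("$vectorize requires a collection that uses vectorize")
    if has_vectorize and not isinstance(document["$vectorize"], str):
        raise ValueError("$vectorize must be a text string")


# A document that needs no embedding passes in either mode.
check_vector_fields({"name": "Jane Doe"}, uses_vectorize=True)
check_vector_fields({"name": "Jane Doe", "$vector": [0.08, 0.68, 0.30]}, uses_vectorize=False)
```

Running a check like this before an insert only surfaces mistakes earlier; it does not replace the error handling you need around the insert call itself.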
-
Python
-
TypeScript
-
Java
-
curl
For more information, see the API reference.
insert_result = collection.insert_one({"name": "Jane Doe"})
Insert a document with an associated vector:
insert_result = collection.insert_one(
{
"name": "Jane Doe",
"$vector": [.08, .68, .30],
},
)
Insert a document and generate a vector automatically:
insert_result = collection.insert_one(
{
"name": "Jane Doe",
"$vectorize": "Text to vectorize",
},
)
Returns:
InsertOneResult
- An object representing the response from the database after the insert operation. It includes information about the success of the operation and details of the inserted documents.
Example response
InsertOneResult(inserted_id='92b4c4f4-db44-4440-b4c4-f4db44e440b8', raw_results=...)
Parameters:
Name | Type | Summary |
---|---|---|
document | | The dictionary expressing the document to insert. The |
max_time_ms | | A timeout, in milliseconds, for the underlying HTTP request. If not passed, the collection-level setting is used instead. |
Example:
from astrapy import DataAPIClient
client = DataAPIClient("TOKEN")
database = client.get_database("API_ENDPOINT")
collection = database.my_collection
# Insert a document with a specific ID
response1 = collection.insert_one(
{
"_id": 101,
"name": "John Doe",
"$vector": [.12, .52, .32],
},
)
# Insert a document without specifying an ID
# so that _id is generated automatically
response2 = collection.insert_one(
{
"name": "Jane Doe",
"$vector": [.08, .68, .30],
},
)
For more information, see the API reference.
const result = await collection.insertOne({ name: 'Jane Doe' });
Insert a document with an associated vector:
const result = await collection.insertOne({
name: 'Jane Doe',
$vector: [.08, .68, .30],
});
Insert a document and generate a vector automatically:
const result = await collection.insertOne({
name: 'Jane Doe',
$vectorize: 'Text to vectorize',
});
Parameters:
Name | Type | Summary |
---|---|---|
document |
The document to insert. If the document does not have an |
|
options? |
The options for this operation. |
Options (InsertOneOptions
):
Name | Type | Summary |
---|---|---|
|
The maximum time in milliseconds that the client should wait for the operation to complete. |
Returns:
Promise<InsertOneResult<Schema>>
- A promise that resolves to the inserted ID.
Example:
import { DataAPIClient } from '@datastax/astra-db-ts';
// Reference an untyped collection
const client = new DataAPIClient('TOKEN');
const db = client.db('ENDPOINT', { namespace: 'NAMESPACE' });
const collection = db.collection('COLLECTION');
(async function () {
// Insert a document with a specific ID
await collection.insertOne({ _id: '1', name: 'John Doe' });
// Insert a document with an autogenerated ID
await collection.insertOne({ name: 'Jane Doe' });
// Insert a document with a vector
await collection.insertOne({ name: 'Jane Doe', $vector: [.12, .52, .32] });
})();
Operations on documents are performed at the Collection level.
For more information, see the API reference.
Collection is a generic class with the default type of Document.
You can specify your own type, and the object is serialized by Jackson.
Most methods have synchronous and asynchronous flavors, where the asynchronous version is suffixed by Async and returns a CompletableFuture:
InsertOneResult insertOne(DOC document);
InsertOneResult insertOne(DOC document, float[] embeddings);
// Equivalent in asynchronous
CompletableFuture<InsertOneResult> insertOneAsync(DOC document);
CompletableFuture<InsertOneResult> insertOneAsync(DOC document, float[] embeddings);
Returns:
InsertOneResult
- Wrapper with the inserted document ID.
Parameters:
Name | Type | Summary |
---|---|---|
|
|
Object representing the document to insert.
The |
|
|
A vector of embeddings (a list of numbers appropriate for the collection) for the document. Passing this parameter is equivalent to providing the vector in the |
Example:
package com.datastax.astra.client.collection;
import com.datastax.astra.client.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.client.model.InsertOneOptions;
import com.datastax.astra.client.model.InsertOneResult;
import com.fasterxml.jackson.annotation.JsonProperty;
import lombok.AllArgsConstructor;
import lombok.Data;
public class InsertOne {
@Data @AllArgsConstructor
public static class Product {
@JsonProperty("_id")
private String id;
private String name;
}
public static void main(String[] args) {
// Given an existing collection
Collection<Document> collectionDoc = new DataAPIClient("TOKEN")
.getDatabase("API_ENDPOINT")
.getCollection("COLLECTION_NAME");
// Insert a document
Document doc1 = new Document("1").append("name", "joe");
InsertOneResult res1 = collectionDoc.insertOne(doc1);
System.out.println(res1.getInsertedId()); // should be "1"
// Insert a document with embeddings
Document doc2 = new Document("2").append("name", "joe");
collectionDoc.insertOne(doc2, new float[] {.1f, .2f});
// Given an existing collection
Collection<Product> collectionProduct = new DataAPIClient("TOKEN")
.getDatabase("API_ENDPOINT")
.getCollection("COLLECTION2_NAME", Product.class);
// Insert a document with custom bean
collectionProduct.insertOne(new Product("1", "joe"));
collectionProduct.insertOne(new Product("2", "joe"), new float[] {.1f, .2f});
}
}
Insert a document with a predefined vector:
curl -sS --location -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_NAMESPACE/ASTRA_DB_COLLECTION" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"insertOne": {
"document": {
"$vector": [0.25, 0.25, 0.25, 0.25, 0.25],
"key1": "value1",
"key2": "value2"
}
}
}' | jq
Insert one and generate a vector automatically:
curl -sS --location -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_NAMESPACE/ASTRA_DB_COLLECTION" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"insertOne": {
"document": {
"$vectorize": "Text to use to generate a vector",
"key1": "value1",
"key2": "value2"
}
}
}' | jq
Parameters:
Name | Type | Summary |
---|---|---|
|
command |
Data API command to insert one document in a collection. |
|
object |
Contains the details of the record to add. With the exception of reserved fields (
|
|
reserved, multi-type |
An optional identifier for the document. If omitted, the server automatically generates a document ID. You can include identifiers in other fields as well. For more information, see Work with document IDs and The defaultId option. |
|
reserved array |
An optional reserved property used to store an array of numbers representing a vector embedding. Serverless (Vector) databases have specialized handling for vector data, including optimized query performance for similarity search.
|
|
reserved string |
An optional reserved property used to store a string that you want to use to automatically generate an embedding with vectorize.
|
Response
A successful response contains the _id of the inserted document:
{
"status": {
"insertedIds": [
"12"
]
}
}
The insertedIds content depends on the ID type and how it was generated, for example:
- "insertedIds": [{"$objectId": "6672e1cbd7fabb4e5493916f"}]
- "insertedIds": [{"$uuid": "1ef2e42c-1fdb-6ad6-aae4-e84679831739"}]
For more information, see Work with document IDs.
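As a rough sketch of how a caller might handle these shapes, the following Python snippet unwraps the insertedIds array from a raw JSON response. It uses only the standard library. The function name decode_inserted_ids is hypothetical, and keeping $objectId values as plain strings is a simplification chosen here to avoid a bson dependency.

```python
import json
import uuid
from typing import Any, List


def decode_inserted_ids(response_body: str) -> List[Any]:
    """Return the insertedIds from a response, unwrapping $uuid and $objectId."""
    status = json.loads(response_body).get("status", {})
    decoded: List[Any] = []
    for raw_id in status.get("insertedIds", []):
        if isinstance(raw_id, dict) and "$uuid" in raw_id:
            decoded.append(uuid.UUID(raw_id["$uuid"]))
        elif isinstance(raw_id, dict) and "$objectId" in raw_id:
            # Kept as a hex string here; a real client may map this to an ObjectId class.
            decoded.append(raw_id["$objectId"])
        else:
            # Plain identifiers such as "12" pass through unchanged.
            decoded.append(raw_id)
    return decoded


body = '{"status": {"insertedIds": ["12", {"$uuid": "1ef2e42c-1fdb-6ad6-aae4-e84679831739"}]}}'
print(decode_inserted_ids(body))
```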
Examples:
Example with $vector
curl -sS --location -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_NAMESPACE/ASTRA_DB_COLLECTION" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"insertOne": {
"document": {
"purchase_type": "Online",
"$vector": [0.25, 0.25, 0.25, 0.25, 0.25],
"customer": {
"name": "Jim A.",
"phone": "123-456-1111",
"age": 51,
"credit_score": 782,
"address": {
"address_line": "1234 Broadway",
"city": "New York",
"state": "NY"
}
},
"purchase_date": { "$date": 1690045891 },
"seller": {
"name": "Jon B.",
"location": "Manhattan NYC"
},
"items": [
{
"car": "BMW 330i Sedan",
"color": "Silver"
},
"Extended warranty - 5 years"
],
"amount": 47601,
"status": "active",
"preferred_customer": true
}
}
}' | jq
Example with $vectorize
curl --location 'REPLACE ME/api/json/v1/default_keyspace/REPLACE ME' \
--header 'Token: REPLACE ME' \
--header 'Content-Type: application/json' \
--header 'x-embedding-api-key;' \
--data '{
"insertOne": {
"document": {
"_id": "1",
"purchase_type": "Online",
"$vectorize": "Purchase of a silver BMW sedan in New York.",
"customer": {
"name": "Jim A.",
"phone": "123-456-1111",
"age": 51,
"credit_score": 782,
"address": {
"address_line": "1234 Broadway",
"city": "New York",
"state": "NY"
}
},
"purchase_date": { "$date": 1690045891 },
"seller": {
"name": "Jon B.",
"location": "Manhattan NYC"
},
"items": [
{
"car": "BMW 330i Sedan",
"color": "Silver"
},
"Extended warranty - 5 years"
],
"amount": 47601,
"status": "active",
"preferred_customer": true
}
}
}'
Work with dates
-
Python
-
TypeScript
-
Java
-
curl
Date and datetime objects are instances of the Python standard library datetime.datetime and datetime.date classes that you can use anywhere in documents.
The following example uses dates in insert, update, and find commands.
Read operations from a collection always return the datetime class, regardless of whether the original command used date or datetime.
import datetime
from astrapy import DataAPIClient
from astrapy.ids import ObjectId, uuid8, UUID
client = DataAPIClient("TOKEN")
database = client.get_database("API_ENDPOINT")
collection = database.my_collection
# Insert documents containing date and datetime values:
collection.insert_one({"when": datetime.datetime.now()})
collection.insert_one({"date_of_birth": datetime.date(2000, 1, 1)})
collection.insert_one({"registered_at": datetime.date(1999, 11, 14)})
# Update a document, using a date in the filter:
collection.update_one(
{"registered_at": datetime.date(1999, 11, 14)},
{"$set": {"message": "happy Sunday!"}},
)
# Update a document, setting "last_reviewed" to the current date:
collection.update_one(
{"date_of_birth": {"$exists": True}},
{"$currentDate": {"last_reviewed": True}},
)
# Find documents by inequality on a date value:
print(
collection.find_one(
{"date_of_birth": {"$lt": datetime.date(2001, 1, 1)}},
projection={"_id": False},
)
)
# will print:
# {'date_of_birth': datetime.datetime(2000, 1, 1, 0, 0), 'last_reviewed': datetime.datetime(...now...)}
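Because reads return datetime values even for fields written as date, code that compares a value read back with the original date may need to normalize first. The following is a minimal standard-library sketch. The helper name as_date is hypothetical, and the sample value mirrors the midnight datetime shown in the output above.

```python
import datetime


def as_date(value):
    """Return a datetime.date for either a date or a datetime value."""
    # datetime.datetime is a subclass of datetime.date, so check for it first.
    if isinstance(value, datetime.datetime):
        return value.date()
    return value


returned = datetime.datetime(2000, 1, 1, 0, 0)  # shape of a value read back from a collection
original = datetime.date(2000, 1, 1)            # value that was inserted

print(as_date(returned) == original)  # True
```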
You can use standard JS Date objects anywhere in documents to represent dates and times.
Read operations also return Date objects for document fields stored using { $date: number }.
The following example uses dates in insert, update, and find commands:
import { DataAPIClient } from '@datastax/astra-db-ts';
// Reference an untyped collection
const client = new DataAPIClient('TOKEN');
const db = client.db('ENDPOINT', { namespace: 'NAMESPACE' });
(async function () {
// Create an untyped collection
const collection = await db.createCollection('dates_test', { checkExists: false });
// Insert documents with some dates
await collection.insertOne({ dateOfBirth: new Date(1394104654000) });
await collection.insertOne({ dateOfBirth: new Date('1863-05-28') });
// Update a document with a date and setting lastModified to now
await collection.updateOne(
{
dateOfBirth: new Date('1863-05-28'),
},
{
$set: { message: 'Happy Birthday!' },
$currentDate: { lastModified: true },
},
);
// Will print around new Date()
const found = await collection.findOne({ dateOfBirth: { $lt: new Date('1900-01-01') } });
console.log(found?.lastModified);
})();
The Data API uses the ejson standard to represent time-related objects.
The Java client provides custom serializers for three types of objects: java.util.Date, java.util.Calendar, and java.time.Instant.
You can use these objects in documents as well as in filter clauses and update clauses.
The following example uses dates in insert, update, and find commands:
package com.datastax.astra.client.collection;
import com.datastax.astra.client.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.client.model.FindOneOptions;
import com.datastax.astra.client.model.Projections;
import java.time.Instant;
import java.util.Calendar;
import java.util.Date;
import static com.datastax.astra.client.model.Filters.eq;
import static com.datastax.astra.client.model.Filters.lt;
import static com.datastax.astra.client.model.Updates.set;
public class WorkingWithDates {
public static void main(String[] args) {
// Given an existing collection
Collection<Document> collection = new DataAPIClient("TOKEN")
.getDatabase("API_ENDPOINT")
.getCollection("COLLECTION_NAME");
Calendar c = Calendar.getInstance();
collection.insertOne(new Document().append("registered_at", c));
collection.insertOne(new Document().append("date_of_birth", new Date()));
collection.insertOne(new Document().append("just_a_date", Instant.now()));
collection.updateOne(
eq("registered_at", c), // filter clause
set("message", "happy Sunday!")); // update clause
collection.findOne(
lt("date_of_birth", new Date(System.currentTimeMillis() - 1000 * 1000)),
new FindOneOptions().projection(Projections.exclude("_id")));
}
}
You can use $date to represent dates as Unix timestamps in the JSON payload of a Data API command:
"date_of_birth": { "$date": 1690045891 }
The following example includes a date in an insertOne command:
curl -sS --location -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_NAMESPACE/ASTRA_DB_COLLECTION" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"insertOne": {
"document": {
"$vector": [0.25, 0.25, 0.25, 0.25, 0.25],
"date_of_birth": { "$date": 1690045891 }
}
}
}' | jq
The following example uses the date to find and update a document with the updateOne command:
curl -sS --location -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_NAMESPACE/ASTRA_DB_COLLECTION" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"updateOne": {
"filter": {
"date_of_birth": { "$date": 1690045891 }
},
"update": { "$set": { "message": "Happy birthday!" } }
}
}' | jq
The following example uses the $currentDate update operator to set a property to the current date:
curl -sS --location -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_NAMESPACE/ASTRA_DB_COLLECTION" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"findOneAndUpdate": {
"filter": { "_id": "doc1" },
"update": {
"$currentDate": {
"createdAt": true
}
}
}
}' | jq
Work with document IDs
Documents in a collection are always identified by an ID that is unique within the collection.
There are multiple types of document identifiers, such as string, integer, or datetime; however, the uuid and ObjectId types are recommended.
The Data API supports uuid identifiers up to version 8 and ObjectId identifiers as provided by the bson library.
When you create a collection, you can set a default ID type that specifies how the Data API generates an _id for any document that doesn’t have an explicit _id field when you insert it into the collection.
However, if you provide an explicit _id value, such as "_id": "12", then the server uses this value instead of generating an ID.
Regardless of the defaultId setting, the Data API honors document identifiers of any type, anywhere in a document, that you explicitly provide at any time:
- You can include identifiers anywhere in a document, not only in the _id field.
- You can include different types of identifiers in different parts of the same document.
- You can define identifiers at any time, such as when inserting or updating a document.
- You can use any of a document’s identifiers for filter clauses and update/replace operations, just like any other data type.
-
Python
-
TypeScript
-
Java
-
curl
AstraPy recognizes uuid versions 1 and 3 through 8, as provided by the uuid and uuid6 Python libraries.
AstraPy also recognizes the ObjectId from the bson package.
For convenience, these utilities are exposed in AstraPy directly:
from astrapy.ids import (
ObjectId,
uuid1,
uuid3,
uuid4,
uuid5,
uuid6,
uuid7,
uuid8,
UUID,
)
You can generate new identifiers with statements such as new_id = uuid8() or new_obj_id = ObjectId():
from astrapy import DataAPIClient
from astrapy.ids import ObjectId, uuid8, UUID
client = DataAPIClient("TOKEN")
database = client.get_database("API_ENDPOINT")
collection = database.my_collection
collection.insert_one({"_id": uuid8(), "tag": "new_id_v_8"})
collection.insert_one(
{"_id": UUID("018e77bc-648d-8795-a0e2-1cad0fdd53f5"), "tag": "id_v_8"}
)
collection.insert_one({"_id": ObjectId(), "tag": "new_obj_id"})
collection.insert_one(
{"_id": ObjectId("6601fb0f83ffc5f51ba22b88"), "tag": "obj_id"}
)
collection.find_one_and_update(
{"_id": ObjectId("6601fb0f83ffc5f51ba22b88")},
{"$set": {"item_inventory_id": UUID("1eeeaf80-e333-6613-b42f-f739b95106e6")}},
)
All uuid versions are instances of the UUID class, which exposes a version property, if you need to access it.
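To illustrate the version property, the following sketch parses two identifiers from the example above using the standard library uuid.UUID class. It assumes that the UUID class exposed by astrapy.ids behaves the same way for this property. That is an assumption, because this page says only that the utilities come from the uuid and uuid6 libraries.

```python
import uuid

# Identifiers taken from the example above.
id_v8 = uuid.UUID("018e77bc-648d-8795-a0e2-1cad0fdd53f5")
id_v6 = uuid.UUID("1eeeaf80-e333-6613-b42f-f739b95106e6")

# The version is encoded in the first digit of the third group (8795 and 6613).
print(id_v8.version)  # 8
print(id_v6.version)  # 6
```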
To use and generate identifiers, astra-db-ts provides the UUID and ObjectId classes.
These are not the same as those exported from the bson or uuid libraries.
Instead, these are custom classes that you must import from the astra-db-ts package:
import { UUID, ObjectId } from '@datastax/astra-db-ts';
To generate new identifiers, you can use UUID.v4(), UUID.v7(), or new ObjectId():
import { DataAPIClient, UUID, ObjectId } from '@datastax/astra-db-ts';
// Schema for the collection
interface Person {
_id: UUID | ObjectId;
name: string;
friendId?: UUID;
}
// Reference the DB instance
const client = new DataAPIClient('TOKEN');
const db = client.db('ENDPOINT', { namespace: 'NAMESPACE' });
(async function () {
// Create the collection
const collection = await db.createCollection<Person>('people', { checkExists: false });
// Insert documents w/ various IDs
await collection.insertOne({ name: 'John', _id: UUID.v4() });
await collection.insertOne({ name: 'Jane', _id: new UUID('016b1cac-14ce-660e-8974-026c927b9b91') });
await collection.insertOne({ name: 'Dan', _id: new ObjectId()});
await collection.insertOne({ name: 'Tim', _id: new ObjectId('65fd9b52d7fabba03349d013') });
// Update a document with a UUID in a non-_id field
await collection.updateOne(
{ name: 'John' },
{ $set: { friendId: new UUID('016b1cac-14ce-660e-8974-026c927b9b91') } },
);
// Find a document by a UUID in a non-_id field
const john = await collection.findOne({ name: 'John' });
const jane = await collection.findOne({ _id: john!.friendId });
// Prints 'Jane 016b1cac-14ce-660e-8974-026c927b9b91 6'
console.log(jane?.name, jane?._id.toString(), (<UUID>jane?._id).version);
})();
All UUID methods return an instance of the same class, which exposes a version property, if you need to access it.
UUIDs can also be constructed from a string representation of the IDs, if you want to use custom generation.
The Java client defines dedicated classes to support different implementations of UUID, particularly v6 and v7.
When a unique identifier is retrieved from the server, it is returned as a uuid, and then it is converted to the appropriate UUID class, based on the class definition in the defaultId option.
ObjectId classes are extracted from the BSON package, and they represent the ObjectId type.
UUIDs from the Java UUID class are implemented in the UUID v4 standard.
To generate new identifiers, you can use methods like new UUIDv6(), new UUIDv7(), or new ObjectId():
package com.datastax.astra.client.collection;
import com.datastax.astra.client.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.client.model.ObjectId;
import com.datastax.astra.client.model.UUIDv6;
import com.datastax.astra.client.model.UUIDv7;
import java.time.Instant;
import java.util.UUID;
import static com.datastax.astra.client.model.Filters.eq;
import static com.datastax.astra.client.model.Updates.set;
public class WorkingWithDocumentIds {
public static void main(String[] args) {
// Given an existing collection
Collection<Document> collection = new DataAPIClient("TOKEN")
.getDatabase("API_ENDPOINT")
.getCollection("COLLECTION_NAME");
// IDs can be different JSON scalar types
// ('defaultId' options NOT set for collection)
new Document().id("abc");
new Document().id(123);
new Document().id(Instant.now());
// Working with UUIDv4
new Document().id(UUID.randomUUID());
// Working with UUIDv6
collection.insertOne(new Document().id(new UUIDv6()).append("tag", "new_id_v_6"));
UUID uuidv4 = UUID.fromString("018e77bc-648d-8795-a0e2-1cad0fdd53f5");
collection.insertOne(new Document().id(new UUIDv6(uuidv4)).append("tag", "id_v_8"));
// Working with UUIDv7
collection.insertOne(new Document().id(new UUIDv7()).append("tag", "new_id_v_7"));
// Working with ObjectIds
collection.insertOne(new Document().id(new ObjectId()).append("tag", "obj_id"));
collection.insertOne(new Document().id(new ObjectId("6601fb0f83ffc5f51ba22b88")).append("tag", "obj_id"));
collection.findOneAndUpdate(
eq((new ObjectId("6601fb0f83ffc5f51ba22b88"))),
set("item_inventory_id", UUID.fromString("1eeeaf80-e333-6613-b42f-f739b95106e6")));
}
}
When you insert a document, you can omit _id to automatically generate an ID or you can manually specify an _id, such as "_id": "12".
The following example inserts two documents with manually-defined _id values.
One document uses the objectId type, and the other uses the uuid type.
"insertMany": {
"documents": [
{
"_id": { "$objectId": "6672e1cbd7fabb4e5493916f" },
"$vector": [0.1, 0.15, 0.3, 0.12, 0.05],
"key": "value",
"amount": 53990
},
{
"_id": { "$uuid": "1ef2e42c-1fdb-6ad6-aae4-e84679831739" },
"$vector": [0.15, 0.1, 0.1, 0.35, 0.55],
"key": "value",
"amount": 4600
}
]
}
When you add or update a document, you can include additional identifiers in any document property, other than _id, just as you would any other data type.
Insert many documents
Insert multiple documents into a collection.
When you create a collection, you decide if the collection can store structured vector data. For vector-enabled collections, you also decide how to provide embeddings. You can either configure the collection to automatically generate embeddings with vectorize or provide embeddings when you load data (also known as bring your own embeddings). You must decide this when you create the collection.
When working with documents in the Astra Portal or Data API, there are two reserved fields for vector data:
- The $vector parameter is a reserved field that stores vector arrays.
  - If the collection requires that you bring your own embeddings, you can include this parameter when you load data.
  - If the collection uses vectorize, you don’t include $vector when you load data. Instead, Astra DB populates the $vector field with the automatically generated embeddings.
  Regardless of the embedding generation method, when you find, update, replace, or delete documents, you can use $vector to fetch documents by vector search. You can also use projections to include $vector in responses.
- The $vectorize parameter is a reserved field that generates embeddings automatically based on a given text string.
  - If the collection requires that you bring your own embeddings, you cannot use this parameter.
  - If the collection uses vectorize, you must include this parameter when you load data. The value of $vectorize is the text string from which you want to generate a document’s embedding. Astra DB stores the resulting vector array in $vector.
  When you find, update, replace, or delete documents in a collection that uses vectorize, you can use $vectorize to fetch documents by vector search with vectorize. You can also use projections to include $vectorize in responses.
If you load a document that doesn’t need an embedding, then you can omit $vector and $vectorize.
-
Python
-
TypeScript
-
Java
-
curl
For more information, see the API reference.
Insert documents with vector embeddings:
response = collection.insert_many(
[
{
"_id": 101,
"name": "John Doe",
"$vector": [.12, .52, .32],
},
{
# ID is generated automatically
"name": "Jane Doe",
"$vector": [.08, .68, .30],
},
],
)
Insert multiple documents and generate vectors automatically:
response = collection.insert_many(
[
{
"name": "John Doe",
"$vectorize": "Text to vectorize for John Doe",
},
{
"name": "Jane Doe",
"$vectorize": "Text to vectorize for Jane Doe",
},
],
)
Returns:
InsertManyResult
- An object representing the response from the database after the insert operation. It includes information about the success of the operation and details of the inserted documents.
Example response
InsertManyResult(inserted_ids=[101, '81077d86-05dc-43ca-877d-8605dce3ca4d'], raw_results=...)
Parameters:
Name | Type | Summary |
---|---|---|
documents | | An iterable of dictionaries, each a document to insert. Documents may specify their |
ordered | | If False (default), the insertions can occur in arbitrary order and possibly concurrently. If True, they are processed sequentially. If you don’t need ordered inserts, DataStax recommends setting this parameter to False for faster performance. |
chunk_size | | How many documents to include in a single API request. The default is 50, and the maximum is 100. |
concurrency | | Maximum number of concurrent requests to the API at a given time. It cannot be more than one for ordered insertions. |
max_time_ms | | A timeout, in milliseconds, for the operation. If not passed, the collection-level setting is used instead. If you are inserting many documents, this method requires multiple HTTP requests, so you may need to increase the timeout for the method to complete successfully. |
Example:
from astrapy import DataAPIClient
client = DataAPIClient("TOKEN")
database = client.get_database("API_ENDPOINT")
collection = database.my_collection
collection.insert_many([{"a": 10}, {"a": 5}, {"b": [True, False, False]}])
collection.insert_many(
[{"seq": i} for i in range(50)],
concurrency=5,
)
collection.insert_many(
[
{"tag": "a", "$vector": [1, 2]},
{"tag": "b", "$vector": [3, 4]},
]
)
For more information, see the API reference.
Insert multiple documents with vectors:
const result = await collection.insertMany([
{
_id: '1',
name: 'John Doe',
$vector: [.12, .52, .32],
},
{
name: 'Jane Doe',
$vector: [.08, .68, .30],
},
], {
ordered: true,
});
Insert multiple documents and generate vectors automatically:
const result = await collection.insertMany([
{
name: 'John Doe',
$vectorize: 'Text to vectorize for John Doe',
},
{
name: 'Jane Doe',
$vectorize: 'Text to vectorize for Jane Doe',
},
], {
ordered: true,
});
Parameters:
Name | Type | Summary |
---|---|---|
documents |
The documents to insert. If any document does not have an |
|
options? |
The options for this operation. |
Options (InsertManyOptions
):
Name | Type | Summary | ||
---|---|---|---|---|
|
You may set the
|
|||
|
You can set the |
|||
|
Control how many documents are sent with each network request. The default is 50, and the maximum is 100. |
|||
|
The maximum time in milliseconds that the client should wait for the operation to complete. |
Returns:
Promise<InsertManyResult<Schema>>
- A promise that resolves to the inserted IDs.
Example:
import { DataAPIClient, InsertManyError } from '@datastax/astra-db-ts';
// Reference an untyped collection
const client = new DataAPIClient('TOKEN');
const db = client.db('ENDPOINT', { namespace: 'NAMESPACE' });
const collection = db.collection('COLLECTION');
(async function () {
try {
// Insert many documents
await collection.insertMany([
{ _id: '1', name: 'John Doe' },
{ name: 'Jane Doe' }, // Will autogen ID
], { ordered: true });
// Insert many with vectors
await collection.insertMany([
{ name: 'John Doe', $vector: [.12, .52, .32] },
{ name: 'Jane Doe', $vector: [.32, .52, .12] },
]);
} catch (e) {
if (e instanceof InsertManyError) {
console.log(e.partialResult);
}
}
})();
Operations on documents are performed at the Collection level.
Collection is a generic class with the default type of Document.
You can specify your own type, and the object is serialized by Jackson.
For more information, see the API reference.
Most methods have synchronous and asynchronous flavors, where the asynchronous version is suffixed by Async and returns a CompletableFuture:
// Synchronous
InsertManyResult insertMany(List<? extends DOC> documents);
InsertManyResult insertMany(List<? extends DOC> documents, InsertManyOptions options);
// Asynchronous
CompletableFuture<InsertManyResult> insertManyAsync(List<? extends DOC> docList);
CompletableFuture<InsertManyResult> insertManyAsync(List<? extends DOC> docList, InsertManyOptions options);
Returns:
InsertManyResult
- Wrapper with the list of inserted document IDs.
Parameters:
Name | Type | Summary |
---|---|---|
|
|
A list of documents to insert.
Documents may specify their |
|
Set the different options for the insert operation. The options are The java operation As a best practice, try to always provide
The default value of If not provided the default values are |
Example:
package com.datastax.astra.client.collection;
import com.datastax.astra.client.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.client.model.InsertManyOptions;
import com.datastax.astra.client.model.InsertManyResult;
import com.datastax.astra.client.model.InsertOneResult;
import com.fasterxml.jackson.annotation.JsonProperty;
import lombok.AllArgsConstructor;
import lombok.Data;
import java.util.List;
public class InsertMany {
@Data @AllArgsConstructor
public static class Product {
@JsonProperty("_id")
private String id;
private String name;
}
public static void main(String[] args) {
// Given an existing collection
Collection<Document> collectionDoc = new DataAPIClient("TOKEN")
.getDatabase("API_ENDPOINT")
.getCollection("COLLECTION_NAME");
// Insert two documents
Document doc1 = new Document("1").append("name", "joe");
Document doc2 = new Document("2").append("name", "joe");
InsertManyResult res1 = collectionDoc.insertMany(List.of(doc1, doc2));
System.out.println("Identifiers inserted: " + res1.getInsertedIds());
// Given an existing collection
Collection<Product> collectionProduct = new DataAPIClient("TOKEN")
.getDatabase("API_ENDPOINT")
.getCollection("COLLECTION2_NAME", Product.class);
// Insert documents with custom options
InsertManyOptions options = new InsertManyOptions()
.chunkSize(20) // number of documents sent per request
.concurrency(1) // number of requests processed in parallel
.ordered(false) // unordered inserts allow parallel processing
.timeout(1000); // timeout in milliseconds
InsertManyResult res2 = collectionProduct.insertMany(
List.of(new Product("1", "joe"),
new Product("2", "joe")),
options);
}
}
With insertMany
, you provide an array of document objects.
The document objects have the same format as insertOne
.
The Data API accepts up to 100 documents per insertMany
request.
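Because a single request is capped at 100 documents, larger loads have to be split into several requests. The following is a minimal sketch of that batching in plain Python. The helper name `insert_many_payloads` and the sample documents are hypothetical, and the sketch only builds the JSON request bodies; it does not send them.

```python
import json
from typing import Any, Dict, Iterator, List

MAX_DOCS_PER_REQUEST = 100  # per-request limit stated in the text above


def insert_many_payloads(
    documents: List[Dict[str, Any]],
    ordered: bool = False,
    batch_size: int = MAX_DOCS_PER_REQUEST,
) -> Iterator[str]:
    """Yield one insertMany JSON body per batch of at most `batch_size` documents."""
    if not 1 <= batch_size <= MAX_DOCS_PER_REQUEST:
        raise ValueError("batch_size must be between 1 and 100")
    for start in range(0, len(documents), batch_size):
        batch = documents[start:start + batch_size]
        yield json.dumps(
            {"insertMany": {"documents": batch, "options": {"ordered": ordered}}}
        )


# Hypothetical sample data: 250 small documents
docs = [{"_id": str(i), "name": f"item {i}"} for i in range(250)]
bodies = list(insert_many_payloads(docs))
batch_sizes = [len(json.loads(body)["insertMany"]["documents"]) for body in bodies]
print(batch_sizes)  # [100, 100, 50]
```

Serializing with `json.dumps` also guarantees that each body is syntactically valid JSON, which is easy to get wrong when editing a long request by hand.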
Insert multiple documents with vectors:
curl -sS --location -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_NAMESPACE/ASTRA_DB_COLLECTION" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"insertMany": {
"documents": [
{
"$vector": [0.1, 0.15, 0.3, 0.12, 0.05],
"key1": "value1",
"key2": "value2"
},
{
"$vector": [0.25, 0.25, 0.25, 0.25, 0.25],
"key1": "value3",
"key2": "value4"
},
{
"$vector": [0.21, 0.22, 0.33, 0.44, 0.53],
"key1": "value3",
"key2": "value4"
}
],
"options": {
"ordered": false
}
}
}' | jq
Insert multiple documents and generate vectors automatically:
curl -sS --location -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_NAMESPACE/ASTRA_DB_COLLECTION" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"insertMany": {
"documents": [
{
"$vectorize": "text to vectorize for first document",
"key1": "value1",
"key2": "value2"
},
{
"$vectorize": "text to vectorize for second document",
"key1": "value3",
"key2": "value4"
},
{
"$vectorize": "text to vectorize for third document",
"key1": "value3",
"key2": "value4"
}
],
"options": {
"ordered": false
}
}
}' | jq
Parameters:
Name | Type | Summary |
---|---|---|
insertMany |
command |
Data API command to insert multiple documents. You can insert up to 100 documents at a time. |
|
array of objects |
Contains the details of the records to add. It is an array of objects where each object represents a document. With the exception of reserved fields (
|
|
reserved multi-type |
An optional identifier for a document. If omitted, the server automatically generates a document ID. You can include identifiers in other fields as well. For more information, see Work with document IDs and The defaultId option. |
|
reserved array |
An optional reserved property used to store an array of numbers representing a vector embedding for a document. Serverless (Vector) databases have specialized handling for vector data, including optimized query performance for similarity search.
|
|
reserved string |
An optional reserved property used to store a string that you want to use to automatically generate an embedding for a document.
|
|
boolean |
If false, insertions occur in an arbitrary order with possible concurrency.
If true, insertions occur sequentially.
If you don’t need ordered inserts, DataStax recommends |
Response
A successful response contains the _id
of each inserted document:
{
"status": {
"insertedIds": [
"4",
"7",
"10"
]
}
}
The insertedIds
content depends on the ID type and how it was generated, for example:
-
"insertedIds": [{"$objectId": "6672e1cbd7fabb4e5493916f"}]
-
"insertedIds": [{"$uuid": "1ef2e42c-1fdb-6ad6-aae4-e84679831739"}]
For more information, see Work with document IDs.
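Since `insertedIds` can mix plain values with `$objectId` and `$uuid` wrappers, client code that reads the raw JSON may want to unwrap them. The following is a minimal sketch, assuming only the three shapes shown above. The function name `normalize_inserted_ids` is hypothetical, and keeping object IDs as hex strings is a simplification chosen for this example.

```python
import uuid
from typing import Any, List


def normalize_inserted_ids(inserted_ids: List[Any]) -> List[Any]:
    """Unwrap {"$uuid": ...} and {"$objectId": ...}; leave other IDs unchanged."""
    normalized = []
    for raw in inserted_ids:
        if isinstance(raw, dict) and "$uuid" in raw:
            normalized.append(uuid.UUID(raw["$uuid"]))
        elif isinstance(raw, dict) and "$objectId" in raw:
            # Kept as the 24-character hex string (a simplification)
            normalized.append(raw["$objectId"])
        else:
            normalized.append(raw)
    return normalized


# Sample status object assembled from the response shapes shown above
status = {
    "insertedIds": [
        "4",
        {"$objectId": "6672e1cbd7fabb4e5493916f"},
        {"$uuid": "1ef2e42c-1fdb-6ad6-aae4-e84679831739"},
    ]
}
ids = normalize_inserted_ids(status["insertedIds"])
```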
Example
The following insertMany
request adds three documents to a collection:
curl -sS --location -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_NAMESPACE/ASTRA_DB_COLLECTION" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"insertMany": {
"documents": [
{
"purchase_type": "Online",
"$vector": [0.1, 0.15, 0.3, 0.12, 0.05],
"customer": {
"name": "Jack B.",
"phone": "123-456-2222",
"age": 34,
"credit_score": 700,
"address": {
"address_line": "888 Broadway",
"city": "New York",
"state": "NY"
}
},
"purchase_date": { "$date": 1690391491 },
"seller": {
"name": "Tammy S.",
"location": "Staten Island NYC"
},
"items": [
{
"car": "Tesla Model 3",
"color": "White"
},
"Extended warranty - 10 years",
"Service - 5 years"
],
"amount": 53990,
"status": "active"
},
{
"purchase_type": "Online",
"$vector": [0.15, 0.1, 0.1, 0.35, 0.55],
"customer": {
"name": "Jill D.",
"phone": "123-456-3333",
"age": 30,
"credit_score": 742,
"address": {
"address_line": "12345 Broadway",
"city": "New York",
"state": "NY"
}
},
"purchase_date": { "$date": 1690564291 },
"seller": {
"name": "Jasmine S.",
"location": "Brooklyn NYC"
},
"items": "Extended warranty - 10 years",
"amount": 4600,
"status": "active"
},
{
"purchase_type": "In Person",
"$vector": [0.21, 0.22, 0.33, 0.44, 0.53],
"customer": {
"name": "Rachel I.",
"phone": null,
"age": 62,
"credit_score": 786,
"address": {
"address_line": "1234 Park Ave",
"city": "New York",
"state": "NY"
}
},
"purchase_date": { "$date": 1706202691 },
"seller": {
"name": "Jon B.",
"location": "Manhattan NYC"
},
"items": [
{
"car": "BMW M440i Gran Coupe",
"color": "Silver"
},
"Extended warranty - 5 years",
"Gap Insurance - 5 years"
],
"amount": 65250,
"status": "active"
}
],
"options": {
"ordered": false
}
}
}' | jq
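The same request can be assembled from Python with only the standard library. This is a sketch rather than an official client: the endpoint, namespace, collection, and token values are hypothetical placeholders, and the function `build_insert_many_request` is introduced here for illustration. It builds the request object but does not send it.

```python
import json
import urllib.request

# Hypothetical placeholder values; substitute your own
API_ENDPOINT = "https://example.invalid"
NAMESPACE = "default_keyspace"
COLLECTION = "purchases"
TOKEN = "APPLICATION_TOKEN"


def build_insert_many_request(documents, ordered=False):
    """Build (but do not send) a POST request equivalent to the curl command."""
    body = {"insertMany": {"documents": documents, "options": {"ordered": ordered}}}
    return urllib.request.Request(
        url=f"{API_ENDPOINT}/api/json/v1/{NAMESPACE}/{COLLECTION}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Token": TOKEN, "Content-Type": "application/json"},
        method="POST",
    )


request = build_insert_many_request(
    [{"purchase_type": "Online", "$vector": [0.1, 0.15, 0.3, 0.12, 0.05]}]
)
# To send it, pass `request` to urllib.request.urlopen and parse the JSON reply.
```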
Find a document
Retrieve a single document from a collection using various filter and query options.
Sort and filter operations can use only indexed fields. If you apply selective indexing when you create a collection, you can’t reference non-indexed fields in sort or filter queries.
- Python
- TypeScript
- Java
- curl
For more information, see the API reference.
Retrieve a single document from a collection by its _id
:
document = collection.find_one({"_id": 101})
Retrieve a single document from a collection by any property, as long as the property is covered by the collection’s indexing configuration:
document = collection.find_one({"location": "warehouse_C"})
Retrieve a single document from a collection by an arbitrary filtering clause:
document = collection.find_one({"tag": {"$exists": True}})
Retrieve the document that is most similar to a given vector:
result = collection.find_one({}, sort={"$vector": [.12, .52, .32]})
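To build intuition for what "most similar" means, the following sketch ranks two in-memory documents against a query vector. It assumes the collection's similarity metric is cosine, which may not match your collection's configuration. It also runs entirely on the client, whereas the database performs an approximate nearest-neighbor search on the server and may scale the reported score differently. The helper names are hypothetical.

```python
import math
from typing import Any, Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(documents: List[Dict[str, Any]], query: Sequence[float]) -> Dict[str, Any]:
    """Return the document whose $vector has the highest cosine similarity to query."""
    return max(documents, key=lambda doc: cosine_similarity(doc["$vector"], query))


# Sample documents reusing the vectors from the insert examples
documents = [
    {"name": "John Doe", "$vector": [0.12, 0.52, 0.32]},
    {"name": "Jane Doe", "$vector": [0.32, 0.52, 0.12]},
]
best = most_similar(documents, [0.12, 0.52, 0.32])
print(best["name"])  # John Doe
```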
Retrieve the most similar document by running a vector search with vectorize:
result = collection.find_one({}, sort={"$vectorize": "Text to vectorize"})
Use a projection to specify the fields returned from each document.
A projection is required if you want to return certain reserved fields, like $vector
and $vectorize
, that are excluded by default.
result = collection.find_one({"_id": 101}, projection={"name": True})
Returns:
Union[Dict[str, Any], None]
- Either the found document as a dictionary or None
if no matching document is found.
Example response
{'_id': 101, 'name': 'John Doe', '$vector': [0.12, 0.52, 0.32]}
Parameters:
Name | Type | Summary |
---|---|---|
filter |
|
A predicate expressed as a dictionary according to the Data API filter syntax.
For example: |
projection |
|
Select a subset of fields to include in the response for each returned document. If empty or unset, the default projection is used. The default projection doesn’t always include all document fields. For more information and examples, see Example values for projection operations. |
include_similarity |
|
If true, the response includes a |
sort |
|
Use this dictionary parameter to perform a vector similarity search or set the order in which documents are returned.
For similarity searches, this parameter can use either |
max_time_ms |
|
A timeout, in milliseconds, for the underlying HTTP request. This method uses the collection-level timeout by default. |
Example:
from astrapy import DataAPIClient
import astrapy
client = DataAPIClient("TOKEN")
database = client.get_database("API_ENDPOINT")
collection = database.my_collection
collection.find_one()
# prints: {'_id': '68d1e515-...', 'seq': 37}
collection.find_one({"seq": 10})
# prints: {'_id': 'd560e217-...', 'seq': 10}
collection.find_one({"seq": 1011})
# (returns None for no matches)
collection.find_one(projection={"seq": False})
# prints: {'_id': '68d1e515-...'}
collection.find_one(
{},
sort={"seq": astrapy.constants.SortDocuments.DESCENDING},
)
# prints: {'_id': '97e85f81-...', 'seq': 69}
collection.find_one(sort={"$vector": [1, 0]}, projection={"*": True})
# prints: {'_id': '...', 'tag': 'D', '$vector': [4.0, 1.0]}
For more information, see the API reference.
Retrieve a single document from a collection by its _id
:
const doc = await collection.findOne({ _id: '101' });
Retrieve a single document from a collection by any property, as long as the property is covered by the collection’s indexing configuration:
const doc = await collection.findOne({ location: 'warehouse_C' });
Retrieve a single document from a collection by an arbitrary filtering clause:
const doc = await collection.findOne({ tag: { $exists: true } });
Retrieve the document that is most similar to a given vector:
const doc = await collection.findOne({}, { sort: { $vector: [.12, .52, .32] } });
Retrieve the most similar document by running a vector search with vectorize:
const doc = await collection.findOne({}, { sort: { $vectorize: 'Text to vectorize' } });
Use a projection to specify the fields returned from each document.
A projection is required if you want to return certain reserved fields, like $vector
and $vectorize
, that are excluded by default.
const doc = await collection.findOne({ _id: '101' }, { projection: { name: 1 } });
Parameters:
Name | Type | Summary |
---|---|---|
filter |
A filter to select the document to find. For a list of available operators, see Data API operators. For additional examples, see Find documents using filtering options. |
|
options? |
The options for this operation. |
Options (FindOneOptions
):
Name | Type | Summary |
---|---|---|
Specifies which fields to include or exclude in the returned documents. If empty or unset, the default projection is used. The default projection doesn’t always include all document fields. For more information and examples, see Example values for projection operations. When specifying a projection, make sure that you handle the return type carefully. Consider type-casting. |
||
|
If true, the response includes a |
|
Perform a vector similarity search or set the order in which documents are returned.
For similarity searches, |
||
|
The maximum time in milliseconds that the client should wait for the operation to complete. |
Returns:
Promise<FoundDoc<Schema> | null>
- A promise that resolves
to the found document (including $similarity
if applicable), or null
if no matching document is found.
Example:
import { DataAPIClient } from '@datastax/astra-db-ts';
// Reference an untyped collection
const client = new DataAPIClient('TOKEN');
const db = client.db('ENDPOINT', { namespace: 'NAMESPACE' });
const collection = db.collection('COLLECTION');
(async function () {
// Insert some documents
await collection.insertMany([
{ name: 'John', age: 30, $vector: [1, 1, 1, 1, 1] },
{ name: 'Jane', age: 25, },
{ name: 'Dave', age: 40, },
]);
// Unpredictably prints one of their names
const unpredictable = await collection.findOne({});
console.log(unpredictable?.name);
// Failed find by name (null)
const failed = await collection.findOne({ name: 'Carrie' });
console.log(failed);
// Find by $gt age (Dave)
const dave = await collection.findOne({ age: { $gt: 30 } });
console.log(dave?.name);
// Find by sorting by age (Jane)
const jane = await collection.findOne({}, { sort: { age: 1 } });
console.log(jane?.name);
// Find by vector similarity (John, 1)
const john = await collection.findOne({}, { sort: { $vector: [1, 1, 1, 1, 1] }, includeSimilarity: true });
console.log(john?.name, john?.$similarity);
})();
Operations on documents are performed at the Collection
level.
Collection is a generic class with the default type of Document
.
You can specify your own type, and the object is serialized by Jackson.
For more information, see the API reference.
Most methods have synchronous and asynchronous flavors, where the asynchronous version is suffixed by Async
and returns a CompletableFuture
:
// Synchronous
Optional<T> findOne(Filter filter);
Optional<T> findOne(Filter filter, FindOneOptions options);
Optional<T> findById(Object id); // build the filter for you
// Asynchronous
CompletableFuture<Optional<T>> findOneAsync(Filter filter);
CompletableFuture<Optional<T>> findOneAsync(Filter filter, FindOneOptions options);
CompletableFuture<Optional<T>> findByIdAsync(Object id); // builds the filter for you
You can retrieve documents in various ways, for example:
-
Retrieve a single document from a collection by its
_id
. -
Retrieve a single document from a collection by any property, as long as the property is covered by the collection’s indexing configuration.
-
Retrieve a single document from a collection by an arbitrary filtering clause.
-
Retrieve the document that is most similar to a given vector.
-
Retrieve the most similar document by running a vector search with vectorize.
Additionally, you can use a projection to specify the fields returned from each document.
A projection is required if you want to return certain reserved fields, like $vector
and $vectorize
, that are excluded by default.
In the underlying HTTP request, a filter
is a JSON object containing filter and sort parameters, for example:
{
"findOne": {
"filter": {
"$and": [
{ "field2": { "$gt": 10 } },
{ "field3": { "$lt": 20 } },
{ "field4": { "$eq": "value" } }
]
},
"projection": {
"_id": 0,
"field": 1,
"field2": 1,
"field3": 1
},
"sort": {
"$vector": [0.25, 0.25, 0.25,0.25, 0.25]
},
"options": {
"includeSimilarity": true
}
}
}
You can define the preceding JSON object in Java as follows:
collection.findOne(
Filters.and(
Filters.gt("field2", 10),
Filters.lt("field3", 20),
Filters.eq("field4", "value")
),
new FindOneOptions()
.projection(Projections.include("field", "field2", "field3"))
.projection(Projections.exclude("_id"))
.vector(new float[] {0.25f, 0.25f, 0.25f,0.25f, 0.25f})
.includeSimilarity()
);
// The same query, written with static imports
collection.findOne(
and(
gt("field2", 10),
lt("field3", 20),
eq("field4", "value")
),
vector(new float[] {0.25f, 0.25f, 0.25f,0.25f, 0.25f})
.projection(Projections.include("field", "field2", "field3"))
.projection(Projections.exclude("_id"))
.includeSimilarity()
);
Parameters:
Name | Type | Summary |
---|---|---|
|
|
Criteria list to filter the document. The filter is a JSON object that can contain any valid Data API filter expression. For a list of available operators, see Data API operators. For additional examples, see Find documents using filtering options. |
|
Set the different options for the
|
Returns:
Optional<T>
- The document that matches the filter, or Optional.empty()
if no document is found.
Example:
package com.datastax.astra.client.collection;
import com.datastax.astra.client.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.DataAPIOptions;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.client.model.Filter;
import com.datastax.astra.client.model.Filters;
import com.datastax.astra.client.model.FindOneOptions;
import java.util.Optional;
import static com.datastax.astra.client.model.Filters.and;
import static com.datastax.astra.client.model.Filters.eq;
import static com.datastax.astra.client.model.Filters.gt;
import static com.datastax.astra.client.model.Filters.lt;
import static com.datastax.astra.client.model.Projections.exclude;
import static com.datastax.astra.client.model.Projections.include;
public class FindOne {
public static void main(String[] args) {
// Given an existing collection
Collection<Document> collection = new DataAPIClient("TOKEN")
.getDatabase("API_ENDPOINT")
.getCollection("COLLECTION_NAME");
// Complete FindOne
Filter filter = Filters.and(
Filters.gt("field2", 10),
lt("field3", 20),
Filters.eq("field4", "value"));
FindOneOptions options = new FindOneOptions()
.projection(include("field", "field2", "field3"))
.projection(exclude("_id"))
.sort(new float[] {0.25f, 0.25f, 0.25f,0.25f, 0.25f})
.includeSimilarity();
Optional<Document> result = collection.findOne(filter, options);
// The same query, written with static imports
collection.findOne(and(
gt("field2", 10),
lt("field3", 20),
eq("field4", "value")),
new FindOneOptions().sort(new float[] {0.25f, 0.25f, 0.25f,0.25f, 0.25f})
.projection(include("field", "field2", "field3"))
.projection(exclude("_id"))
.includeSimilarity()
);
// find one with a vectorize
collection.findOne(and(
gt("field2", 10),
lt("field3", 20),
eq("field4", "value")),
new FindOneOptions().sort("Life is too short to be living somebody else's dream.")
.projection(include("field", "field2", "field3"))
.projection(exclude("_id"))
.includeSimilarity()
);
collection.insertOne(new Document()
.append("field", "value")
.append("field2", 15)
.append("field3", 15)
.vectorize("Life is too short to be living somebody else's dream."));
}
}
Use the findOne
command to retrieve a document.
Retrieve a single document from a collection by its _id
:
curl -sS --location -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_NAMESPACE/ASTRA_DB_COLLECTION" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"findOne": {
"filter": { "_id": "018e65c9-df45-7913-89f8-175f28bd7f74" }
}
}' | jq
Retrieve a single document from a collection by any property, as long as the property is covered by the collection’s indexing configuration:
"findOne": {
"filter": { "purchase_date": { "$date": 1690045891 } }
}
Retrieve a single document from a collection by an arbitrary filtering clause:
"findOne": {
"filter": { "preferred_customer": { "$exists": true } }
}
Retrieve the document that is most similar to a given vector:
"findOne": {
"sort": { "$vector": [0.15, 0.1, 0.1, 0.35, 0.55] }
}
Retrieve the most similar document by running a vector search with vectorize:
"findOne": {
"sort": { "$vectorize": "I'd like some talking shoes" }
}
Use a projection to specify the fields returned from each document.
A projection is required if you want to return certain reserved fields, like $vector
and $vectorize
, that are excluded by default.
"findOne": {
"sort": { "$vector": [0.15, 0.1, 0.1, 0.35, 0.55] },
"projection": { "$vector": 1 }
}
Parameters:
Name | Type | Summary |
---|---|---|
|
command |
The Data API command to retrieve a document in a collection based on one or more of |
|
object |
An object that defines filter criteria using the Data API filter syntax.
For example: |
|
object |
Perform a vector similarity search or set the order in which documents are returned.
For similarity searches, this parameter can use either |
|
object |
Select a subset of fields to include in the response for each returned document. If empty or unset, the default projection is used. The default projection doesn’t always include all document fields. For more information and examples, see Example values for projection operations. |
|
boolean |
If true, the response includes a
|
Returns:
A successful response includes a data
object that contains a document
object representing the document matching the given query.
The returned document
fields depend on the findOne
parameters, namely the projection
and options
.
"data": {
"document": {
"_id": "14"
}
}
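When you call the HTTP endpoint directly, your code has to pull the document out of the `data` object itself. The following is a minimal sketch of that step. The function name `extract_document` is hypothetical, and treating a missing or `null` `document` as "no match" is an assumption made for this example.

```python
import json
from typing import Any, Dict, Optional


def extract_document(response_text: str) -> Optional[Dict[str, Any]]:
    """Return data.document from a findOne response, or None if absent or null."""
    payload = json.loads(response_text)
    return (payload.get("data") or {}).get("document")


# Sample response bodies (the second shape is assumed, not taken from the text)
found = extract_document('{"data": {"document": {"_id": "14"}}}')
missing = extract_document('{"data": {"document": null}}')
```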
Example
This request retrieves a document from a collection by its _id
with the default projection
and options
:
curl -sS --location -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_NAMESPACE/ASTRA_DB_COLLECTION" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"findOne": {
"filter": { "_id": "14" }
}
}' | jq
The response contains the document’s _id
and all regular fields.
The default projection excludes $vector
and $vectorize
.
{
"data": {
"document": {
"_id": "14",
"amount": 110400,
"customer": {
"address": {
"address_line": "1414 14th Pl",
"city": "Brooklyn",
"state": "NY"
},
"age": 44,
"credit_score": 702,
"name": "Kris S.",
"phone": "123-456-1144"
},
"items": [
{
"car": "Tesla Model X",
"color": "White"
}
],
"purchase_date": {
"$date": 1698513091
},
"purchase_type": "In Person",
"seller": {
"location": "Brooklyn NYC",
"name": "Jasmine S."
}
}
}
}
Find documents using filtering options
Whereas findOne fetches a single document that matches a query, find
fetches multiple documents that match a query.
Sort and filter operations can use only indexed fields. If you apply selective indexing when you create a collection, you can’t reference non-indexed fields in sort or filter queries.
- Python
- TypeScript
- Java
- curl
For more information, see the API reference.
Find documents matching a property, as long as the property is covered by the collection’s indexing configuration:
doc_iterator = collection.find({"category": "house_appliance"}, limit=10)
Find documents matching a filter operator:
doc_iterator = collection.find({"tag": {"$exists": True}}, limit=10)
Iterate over the documents most similar to a given vector:
doc_iterator = collection.find(
{},
sort={"$vector": [0.55, -0.40, 0.08]},
limit=5,
)
Iterate over similar documents by running a vector search with vectorize:
doc_iterator = collection.find(
{},
sort={"$vectorize": "Text to vectorize"},
limit=5,
)
Use a projection to specify the fields returned from each document.
A projection is required if you want to return certain reserved fields, like $vector
and $vectorize
, that are excluded by default.
result = collection.find({"category": "house_appliance"}, limit=10, projection={"name": True})
Returns:
Cursor
- A cursor for iterating over documents.
AstraPy cursors are compatible with for
loops, and they provide a few additional features.
However, for vector ANN search (with $vector
or $vectorize
), the response is a single page of up to 1000 documents, unless you set a lower limit
.
If you need to materialize a list of all results, you can use A cursor, while it is consumed, transitions between |
Example response
Cursor("some_collection", new, retrieved so far: 0)
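A cursor fetches documents lazily, one page at a time, rather than loading every match up front. The following sketch models that behavior in plain Python. It is not AstraPy's implementation: `iterate_pages`, `fake_fetch`, the page size of 20, and the use of a string offset as the page state are all hypothetical choices made for illustration.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional, Tuple

Page = Tuple[List[Dict[str, Any]], Optional[str]]  # (documents, next page state)


def iterate_pages(
    fetch_page: Callable[[Optional[str]], Page],
    limit: Optional[int] = None,
) -> Iterator[Dict[str, Any]]:
    """Yield documents lazily, fetching another page only while more are needed."""
    page_state: Optional[str] = None
    yielded = 0
    while limit is None or yielded < limit:
        documents, page_state = fetch_page(page_state)
        for document in documents:
            if limit is not None and yielded >= limit:
                return
            yield document
            yielded += 1
        if page_state is None:  # no further pages
            return


# Fake backend: 45 documents served in pages of 20 (hypothetical page size)
DATA = [{"seq": i} for i in range(45)]
calls: List[Optional[str]] = []


def fake_fetch(page_state: Optional[str]) -> Page:
    calls.append(page_state)  # record each simulated request
    start = int(page_state or 0)
    end = start + 20
    next_state = str(end) if end < len(DATA) else None
    return DATA[start:end], next_state


first_25 = list(iterate_pages(fake_fetch, limit=25))
print(len(first_25), calls)  # 25 [None, '20']
```

With `limit=25`, only two of the three available pages are requested, which is why setting a limit matters when a filter can match many documents.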
Parameters:
Name | Type | Summary |
---|---|---|
filter |
|
A predicate expressed as a dictionary according to the Data API filter syntax.
For example: |
projection |
|
Select a subset of fields to include in the response for each returned document. If empty or unset, the default projection is used. The default projection doesn’t always include all document fields. For more information and examples, see Example values for projection operations. |
skip |
|
Specify a number of documents to bypass (skip) before returning documents.
The first You can use this parameter only in conjunction with an explicit |
limit |
|
Limit the total number of documents returned.
Once |
include_similarity |
|
If true, the response includes a |
include_sort_vector |
|
If true, the response includes the You can’t use |
sort |
|
Use this dictionary parameter to perform a vector similarity search or set the order in which documents are returned.
For similarity searches, this parameter can use either |
max_time_ms |
|
A timeout, in milliseconds, for each underlying HTTP request used to fetch documents as you iterate over the cursor. This method uses the collection-level timeout by default. |
Example:
from astrapy import DataAPIClient
import astrapy
client = DataAPIClient("TOKEN")
database = client.get_database("API_ENDPOINT")
collection = database.COLLECTION
# Find all documents in the collection
# Not advisable if a very high number of matches is anticipated
for document in collection.find({}):
print(document)
# Find all documents in the collection with a specific field value
for document in collection.find({"a": 123}):
print(document)
# Find all documents in the collection matching a compound filter expression
matches = list(collection.find({
"$and": [
{"f1": 1},
{"f2": 2},
]
}))
# Same as the preceding example, but using the implicit AND operator
matches = list(collection.find({
"f1": 1,
"f2": 2,
}))
# Use the "less than" operator in the filter expression
matches2 = list(collection.find({
"$and": [
{"name": "John"},
{"price": {"$lt": 100}},
]
}))
# Run a $vectorize search, get back the query vector along with the documents
results_ite = collection.find(
{},
projection={"*": 1},
limit=3,
include_sort_vector=True,
sort={"$vectorize": "Query text"},
)
query = results_ite.get_sort_vector()
for doc in results_ite:
print(f"{doc['$vectorize']}: {doc['$vector'][:2]}... VS. {query[:2]}...")
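The example above notes that a filter with several top-level fields is an implicit AND. As a small illustration of that equivalence, this sketch rewrites the implicit form into the explicit `$and` form. The helper `to_explicit_and` is hypothetical and only manipulates dictionaries; it does not contact the database.

```python
from typing import Any, Dict


def to_explicit_and(filter_doc: Dict[str, Any]) -> Dict[str, Any]:
    """Rewrite {"f1": 1, "f2": 2} as {"$and": [{"f1": 1}, {"f2": 2}]}."""
    if len(filter_doc) <= 1:
        return dict(filter_doc)  # nothing to combine
    return {"$and": [{field: condition} for field, condition in filter_doc.items()]}


implicit = {"name": "John", "price": {"$lt": 100}}
explicit = to_explicit_and(implicit)
print(explicit)
# {'$and': [{'name': 'John'}, {'price': {'$lt': 100}}]}
```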
For more information, see the API reference.
Find documents matching a property, as long as the property is covered by the collection’s indexing configuration:
const cursor = collection.find({ category: 'house_appliance' }, { limit: 10 });
Find documents matching a filter operator:
const cursor = collection.find({ tag: { $exists: true } }, { limit: 10 });
Iterate over the documents most similar to a given vector:
const cursor = collection.find({}, { sort: { $vector: [0.55, -0.40, 0.08] }, limit: 5 });
Iterate over similar documents by running a vector search with vectorize:
const cursor = collection.find({}, { sort: { $vectorize: 'Text to vectorize' }, limit: 5 });
Use a projection to specify the fields returned from each document.
A projection is required if you want to return certain reserved fields, like $vector
and $vectorize
, that are excluded by default.
const cursor = collection.find({ category: 'house_appliance' }, { limit: 10, projection: { name: 1 } });
Returns:
FindCursor<FoundDoc<Schema>>
- A cursor you can use to iterate over
the matching documents.
For vector ANN search (with $vector
or $vectorize
), the response is a single page of up to 1000 documents, unless you set a lower limit
.
If you need to materialize a list of all results, you can use A cursor, while it is consumed, transitions between |
Parameters:
Name | Type | Summary |
---|---|---|
filter |
A filter to select the documents to find. For a list of available operators, see Data API operators. |
|
options? |
The options for this operation. |
Options (FindOptions
):
Name | Type | Summary |
---|---|---|
Specifies which fields to include or exclude in the returned documents. If empty or unset, the default projection is used. The default projection doesn’t always include all document fields. For more information and examples, see Example values for projection operations. When specifying a projection, make sure that you handle the return type carefully. Consider type-casting. |
||
|
If true, the response includes a |
|
|
If true, the response includes the You can’t use You can also access this through |
|
Perform a vector similarity search or set the order in which documents are returned.
For similarity searches, this parameter can use either |
||
|
Specify a number of documents to bypass (skip) before returning documents.
The first You can use this parameter only in conjunction with an explicit |
|
|
Limit the total number of documents returned in the lifetime of the cursor.
Once |
|
|
The maximum time in milliseconds that the client should wait for the operation to complete each underlying HTTP request as you iterate over the cursor. |
Example:
import { DataAPIClient } from '@datastax/astra-db-ts';
// Reference an untyped collection
const client = new DataAPIClient('TOKEN');
const db = client.db('ENDPOINT', { namespace: 'NAMESPACE' });
const collection = db.collection('COLLECTION');
(async function () {
// Insert some documents
await collection.insertMany([
{ name: 'John', age: 30, $vector: [1, 1, 1, 1, 1] },
{ name: 'Jane', age: 25, },
{ name: 'Dave', age: 40, },
]);
// Gets all 3 in some order
const unpredictable = await collection.find({}).toArray();
console.log(unpredictable);
// Failed find by name ([])
const matchless = await collection.find({ name: 'Carrie' }).toArray();
console.log(matchless);
// Find by $gt age (John, Dave)
const gtAgeCursor = collection.find({ age: { $gt: 25 } });
for await (const doc of gtAgeCursor) {
console.log(doc.name);
}
// Find by sorting by age (Jane, John, Dave)
const sortedAgeCursor = collection.find({}, { sort: { age: 1 } });
await sortedAgeCursor.forEach(console.log);
// Find first by vector similarity (John, 1)
const john = await collection.find({}, { sort: { $vector: [1, 1, 1, 1, 1] }, includeSimilarity: true }).next();
console.log(john?.name, john?.$similarity);
})();
Operations on documents are performed at the Collection
level.
Collection is a generic class with the default type of Document
.
You can specify your own type, and the object is serialized by Jackson.
Most methods have synchronous and asynchronous flavors, where the asynchronous version is suffixed by Async
and returns a CompletableFuture
:
// Synchronous
FindIterable<T> find(Filter filter, FindOptions options);
// Helper to build filter and options above ^
FindIterable<T> find(FindOptions options); // no filter
FindIterable<T> find(Filter filter); // default options
FindIterable<T> find(); // default options + no filters
FindIterable<T> find(float[] vector, int limit); // semantic search
FindIterable<T> find(Filter filter, float[] vector, int limit);
For more information, see Find a document and the API reference.
Returns:
FindIterable<T>
- A cursor that fetches up to the first 20 documents, and it can be iterated to fetch additional documents as needed.
However, for vector ANN search (with $vector
or $vectorize
), the response is a single page of up to 1000 documents, unless you set a lower limit
.
The The You can use the |
Parameters:
Name | Type | Summary |
---|---|---|
|
|
Criteria list to filter documents. The filter is a JSON object that can contain any valid Data API filter expression. For a list of available operators, see Data API operators. |
|
Set the different options for the
|
Example:
package com.datastax.astra.client.collection;
import com.datastax.astra.client.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.client.model.Filter;
import com.datastax.astra.client.model.Filters;
import com.datastax.astra.client.model.FindIterable;
import com.datastax.astra.client.model.FindOptions;
import com.datastax.astra.client.model.Sorts;
import static com.datastax.astra.client.model.Filters.lt;
import static com.datastax.astra.client.model.Projections.exclude;
import static com.datastax.astra.client.model.Projections.include;
public class Find {
public static void main(String[] args) {
// Given an existing collection
Collection<Document> collection = new DataAPIClient("TOKEN")
.getDatabase("API_ENDPOINT")
.getCollection("COLLECTION_NAME");
// Building a filter
Filter filter = Filters.and(
Filters.gt("field2", 10),
lt("field3", 20),
Filters.eq("field4", "value"));
// Find Options
FindOptions options = new FindOptions()
.projection(include("field", "field2", "field3")) // select fields
.projection(exclude("_id")) // exclude some fields
.sort(new float[] {0.25f, 0.25f, 0.25f,0.25f, 0.25f}) // similarity vector
.skip(1) // skip first item
.limit(10) // stop after 10 items (max records)
.pageState("pageState") // used for pagination
.includeSimilarity(); // include similarity
// Execute a find operation
FindIterable<Document> result = collection.find(filter, options);
// Iterate over the result
for (Document document : result) {
System.out.println(document);
}
}
}
Use the find
command to retrieve multiple documents matching a query.
Retrieve documents by any property, as long as the property is covered by the collection’s indexing configuration:
curl -sS --location -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_NAMESPACE/ASTRA_DB_COLLECTION" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"find": {
"filter": { "purchase_date": { "$date": 1690045891 } }
}
}' | jq
Retrieve documents matching a filter operator:
"find": {
"filter": { "preferred_customer": { "$exists": true } }
}
More filter operator examples
Match values that are equal to the filter value:
"find": {
"filter": {
"customer": {
"$eq": {
"name": "Jasmine S.",
"city": "Jersey City"
}
}
}
}
Match values that are not the filter value:
"find": {
"filter": {
"$not": {
"customer.address.state": "NJ"
}
}
}
You can use similar $not
operators for arrays, such as $nin
and $ne
.
Match any of the specified values in an array:
"find": {
"filter": {
"customer.address.city": {
"$in": [ "Jersey City", "Orange" ]
}
}
}
Match all in an array:
"find": {
"filter": {
"items": {
"$all": [
{
"car": "Sedan",
"color": "White"
},
"Extended warranty"
]
}
}
}
Compound and/or operators:
"find": {
"filter": {
"$and": [
{
"$or": [
{ "customer.address.city": "Jersey City" },
{ "customer.address.city": "Orange" }
]
},
{
"$or": [
{ "seller.name": "Jim A." },
{ "seller.name": "Tammy S." }
]
}
]
}
}
Compound range operators:
"find": {
"filter": {
"$and": [
{ "customer.credit_score": { "$gte": 700 } },
{ "customer.credit_score": { "$lt": 800 } }
]
}
}
Retrieve documents that are most similar to a given vector:
"find": {
"sort": { "$vector": [0.15, 0.1, 0.1, 0.35, 0.55] },
"options": {
"limit": 100
}
}
Retrieve similar documents by running a vector search with vectorize:
"find": {
"sort": { "$vectorize": "I'd like some talking shoes" },
"options": {
"limit": 100
}
}
Use a projection to specify the fields returned from each document.
A projection is required if you want to return certain reserved fields, like $vector
and $vectorize
, that are excluded by default.
"find": {
"sort": { "$vector": [0.15, 0.1, 0.1, 0.35, 0.55] },
"projection": { "$vector": 1 },
"options": {
"includeSimilarity": true,
"limit": 100
}
}
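The request bodies above all share one shape: a find object with optional filter, sort, projection, and options parts. As a minimal sketch, the following Python assembles such a body and serializes it to JSON, for example to pass as the --data argument. The build_find_payload helper is hypothetical and is not part of any Data API client.

```python
import json
from typing import Any, Dict, Optional


def build_find_payload(
    filter: Optional[Dict[str, Any]] = None,
    sort: Optional[Dict[str, Any]] = None,
    projection: Optional[Dict[str, Any]] = None,
    options: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: assemble the body of a Data API find command."""
    parts = {"filter": filter, "sort": sort, "projection": projection, "options": options}
    # Leave out the parts that were not provided.
    return {"find": {key: value for key, value in parts.items() if value is not None}}


payload = build_find_payload(
    sort={"$vector": [0.15, 0.1, 0.1, 0.35, 0.55]},
    projection={"$vector": 1},
    options={"includeSimilarity": True, "limit": 100},
)
print(json.dumps(payload, indent=2))
```

Omitting unset parts keeps the body equivalent to the hand-written examples, where an absent filter or projection means that the defaults apply.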
Parameters:
Name | Type | Summary |
---|---|---|
|
command |
The Data API command to retrieve multiple documents in a collection based on one or more of |
|
object |
An object that defines filter criteria using the Data API filter syntax.
For example: |
|
object |
Perform a vector similarity search or set the order in which documents are returned.
For similarity searches, this parameter can use either |
|
object |
Select a subset of fields to include in the response for each returned document. If empty or unset, the default projection is used. The default projection doesn’t always include all document fields. For more information and examples, see Example values for projection operations. |
|
boolean |
If true, the response includes a
|
|
boolean |
If true, the response includes the
You can’t use |
|
integer |
Specify a number of documents to bypass (skip) before returning documents.
The first You can use this parameter only in conjunction with an explicit |
|
integer |
Limit the total number of documents returned.
Pagination can occur if more than 20 documents are returned in the current set of matching documents.
Once the |
Returns:
A successful response can include a data
object and a status
object:
-
The
data
object contains documents
, which is an array of objects. Each object represents a document matching the given query. The returned fields in each document object depend on the findMany
parameters, namely the projection
and options
.
For vector ANN search (with
$vector
or $vectorize
), the response is a single page of up to 1000 documents, unless you set a lower limit
.
For non-vector searches, pagination occurs if there are more than 20 matching documents, as indicated by the
nextPageState
key. If there are no more documents, nextPageState
is null
or omitted. If there are more documents, nextPageState
contains an ID.
{
"data": {
"documents": [
{ "_id": { "$uuid": "018e65c9-df45-7913-89f8-175f28bd7f74" } },
{ "_id": { "$uuid": "018e65c9-e33d-749b-9386-e848739582f0" } }
],
"nextPageState": null
}
}
In the event of pagination, you must issue a subsequent request with a
pageState
ID to fetch the next page of documents that matched the filter. As long as there is a subsequent page with matching documents, the transaction returns a nextPageState
ID, which you use as the pageState
for the subsequent request. Each paginated request is exactly the same as the original request, except for the addition of the pageState
in the options
object:
{
"find": {
"filter": { "active_user": true },
"options": { "pageState": "NEXT_PAGE_STATE_FROM_PRIOR_RESPONSE" }
}
}
Continue issuing requests with the subsequent
pageState
ID until you have fetched all matching documents.
-
The
status
object contains the sortVector
value if you set includeSortVector
to true
in the request:
"status": { "sortVector": [0.4, 0.1, ...] }
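The pagination flow described above can be sketched as a loop that resends the same request with the pageState from the prior response until nextPageState is null or omitted. This is a minimal sketch, not a client implementation: the post argument is a hypothetical callable that sends a JSON body to the collection endpoint and returns the parsed response, and fake_post stands in for the HTTP call so that the example runs without a database.

```python
from typing import Any, Callable, Dict, Iterator, Optional


def iter_all_documents(
    post: Callable[[Dict[str, Any]], Dict[str, Any]],
    filter: Dict[str, Any],
) -> Iterator[Dict[str, Any]]:
    """Yield every matching document, following nextPageState across pages.

    `post` is a hypothetical callable that sends a JSON body to the
    collection endpoint and returns the parsed JSON response.
    """
    page_state: Optional[str] = None
    while True:
        command: Dict[str, Any] = {"find": {"filter": filter}}
        if page_state is not None:
            # Same request as before, plus the pageState from the prior response.
            command["find"]["options"] = {"pageState": page_state}
        data = post(command)["data"]
        yield from data["documents"]
        page_state = data.get("nextPageState")
        if not page_state:  # null or omitted: no more pages
            break


# Stand-in for the HTTP call, serving two fake pages.
def fake_post(command: Dict[str, Any]) -> Dict[str, Any]:
    state = command["find"].get("options", {}).get("pageState")
    if state is None:
        return {"data": {"documents": [{"_id": 1}, {"_id": 2}], "nextPageState": "page-2"}}
    return {"data": {"documents": [{"_id": 3}], "nextPageState": None}}


all_ids = [doc["_id"] for doc in iter_all_documents(fake_post, {"active_user": True})]
print(all_ids)
```

Writing the loop as a generator lets the caller stop early without fetching the remaining pages.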
Examples:
Example of simple property filter
This example uses a simple filter based on two document properties:
curl -sS --location -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_NAMESPACE/ASTRA_DB_COLLECTION" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"find": {
"filter": {
"customer.address.city": "Hoboken",
"customer.address.state": "NJ"
}
}
}' | jq
The response returned one matching document:
{
"data": {
"documents": [
{
"$vector": [
0.1,
0.15,
0.3,
0.12,
0.09
],
"_id": "17",
"amount": 54900,
"customer": {
"address": {
"address_line": "1234 Main St",
"city": "Hoboken",
"state": "NJ"
},
"age": 61,
"credit_score": 694,
"name": "Yolanda Z.",
"phone": "123-456-1177"
},
"items": [
{
"car": "Tesla Model 3",
"color": "Blue"
},
"Extended warranty - 5 years"
],
"purchase_date": {
"$date": 1702660291
},
"purchase_type": "Online",
"seller": {
"location": "Jersey City NJ",
"name": "Jim A."
},
"status": "active"
}
],
"nextPageState": null
}
}
Example of logical operators in a filter
This example uses the $and
and $or
logical operators to retrieve documents matching one condition from each $or
clause.
In this case, the customer.address.city
must be either Jersey City
or Orange
and the seller.name
must be either Jim A.
or Tammy S.
.
curl -sS --location -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_NAMESPACE/ASTRA_DB_COLLECTION" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"find": {
"filter": {
"$and": [
{
"$or": [
{ "customer.address.city": "Jersey City" },
{ "customer.address.city": "Orange" }
]
},
{
"$or": [
{ "seller.name": "Jim A." },
{ "seller.name": "Tammy S." }
]
}
]
}
}
}' | jq
The response returned two matching documents:
{
"data": {
"documents": [
{
"$vector": [
0.3,
0.23,
0.15,
0.17,
0.4
],
"_id": "8",
"amount": 46900,
"customer": {
"address": {
"address_line": "1234 Main St",
"city": "Orange",
"state": "NJ"
},
"age": 29,
"credit_score": 710,
"name": "Harold S.",
"phone": "123-456-8888"
},
"items": [
{
"car": "BMW X3 SUV",
"color": "Black"
},
"Extended warranty - 5 years"
],
"purchase_date": {
"$date": 1693329091
},
"purchase_type": "In Person",
"seller": {
"location": "Staten Island NYC",
"name": "Tammy S."
},
"status": "active"
},
{
"$vector": [
0.25,
0.045,
0.38,
0.31,
0.67
],
"_id": "5",
"amount": 94990,
"customer": {
"address": {
"address_line": "32345 Main Ave",
"city": "Jersey City",
"state": "NJ"
},
"age": 50,
"credit_score": 800,
"name": "David C.",
"phone": "123-456-5555"
},
"items": [
{
"car": "Tesla Model S",
"color": "Red"
},
"Extended warranty - 5 years"
],
"purchase_date": {
"$date": 1690996291
},
"purchase_type": "Online",
"seller": {
"location": "Jersey City NJ",
"name": "Jim A."
},
"status": "active"
}
],
"nextPageState": null
}
}
Example values for sort operations
Sort and filter operations can use only indexed fields. If you apply selective indexing when you create a collection, you can’t reference non-indexed fields in sort or filter queries.
Data API commands, such as find
, findOne
, deleteOne
, updateOne
, and so on, can use sort
clauses to organize results based on similarity, or dissimilarity, to the given filter, such as a vector or field.
Additionally, you can use a projection to include specific document properties in the response.
A projection is required if you want to return certain reserved fields, like $vector
and $vectorize
, that are excluded by default.
For more specific information, examples, and parameters for operations that support sorting, see the explanations of the find
, update
, replace
, and delete
operations elsewhere on this page.
-
Python
-
TypeScript
-
Java
-
curl
|
When no particular order is required:
sort={} # (default when parameter not provided)
When sorting by a certain value in ascending/descending order:
from astrapy.constants import SortDocuments
sort={"field": SortDocuments.ASCENDING}
sort={"field": SortDocuments.DESCENDING}
When sorting first by "field" and then by "subfield"
(while modern Python versions preserve the order of dictionaries,
it is suggested for clarity to employ a collections.OrderedDict
in these cases):
sort={
"field": SortDocuments.ASCENDING,
"subfield": SortDocuments.ASCENDING,
}
When running a vector similarity (ANN) search based on a query vector, and then sorting by similarity:
sort={"$vector": [0.4, 0.15, -0.5]}
When running a vector similarity (ANN) search by generating a vector from text, and then sorting by similarity:
sort={"$vectorize": "Text to vectorize"}
Sort example
from astrapy import DataAPIClient
import astrapy
client = DataAPIClient("TOKEN")
database = client.get_database("API_ENDPOINT")
collection = database.my_collection
filter = {"seq": {"$exists": True}}
for doc in collection.find(filter, projection={"seq": True}, limit=5):
print(doc["seq"])
...
# will print e.g.:
# 37
# 35
# 10
# 36
# 27
cursor1 = collection.find(
{},
limit=4,
sort={"seq": astrapy.constants.SortDocuments.DESCENDING},
)
[doc["_id"] for doc in cursor1]
# prints: ['97e85f81-...', '1581efe4-...', '...', '...']
cursor2 = collection.find({}, limit=3)
cursor2.distinct("seq")
# prints: [37, 35, 10]
collection.insert_many([
{"tag": "A", "$vector": [4, 5]},
{"tag": "B", "$vector": [3, 4]},
{"tag": "C", "$vector": [3, 2]},
{"tag": "D", "$vector": [4, 1]},
{"tag": "E", "$vector": [2, 5]},
])
ann_tags = [
document["tag"]
for document in collection.find(
{},
sort={"$vector": [3, 3]},
limit=3,
)
]
ann_tags
# prints: ['A', 'B', 'C']
# (assuming the collection has metric VectorMetric.COSINE)
|
|
When no particular order is required:
{ sort: {} } // (default when parameter not provided)
When sorting by a certain value in ascending/descending order:
{ sort: { field: +1 } } // ascending
{ sort: { field: -1 } } // descending
When sorting first by "field" and then by "subfield" (order matters! ES2015+ guarantees string keys in order of insertion):
{ sort: { field: 1, subfield: 1 } }
Run a vector similarity (ANN) search based on a query vector:
{ sort: { $vector: [0.4, 0.15, -0.5] } }
Generate a vector to perform a vector similarity search. The collection must be associated with an embedding service.
{ sort: { $vectorize: "Text to vectorize" } }
Example:
import { DataAPIClient } from '@datastax/astra-db-ts';
// Reference an untyped collection
const client = new DataAPIClient('TOKEN');
const db = client.db('ENDPOINT', { namespace: 'NAMESPACE' });
const collection = db.collection('COLLECTION');
(async function () {
// Insert some documents
await collection.insertMany([
{ name: 'Jane', age: 25, $vector: [1.0, 1.0, 1.0, 1.0, 1.0] },
{ name: 'Dave', age: 40, $vector: [0.4, 0.5, 0.6, 0.7, 0.8] },
{ name: 'Jack', age: 40, $vector: [0.1, 0.9, 0.0, 0.5, 0.7] },
]);
// Sort by age ascending, then by name descending (Jane, Jack, Dave)
const sorted1 = await collection.find({}, { sort: { age: 1, name: -1 } }).toArray();
console.log(sorted1.map(d => d.name));
// Sort by vector distance (Jane, Dave, Jack)
const sorted2 = await collection.find({}, { sort: { $vector: [1, 1, 1, 1, 1] } }).toArray();
console.log(sorted2.map(d => d.name));
})();
|
The sort()
operations are optional.
Use them only when needed.
Be aware of the order when chaining multiple sorts:
Sort s1 = Sorts.ascending("field1");
Sort s2 = Sorts.descending("field2");
FindOptions.Builder.sort(s1, s2);
You can use sort
to run a vector similarity (ANN) search:
FindOptions.Builder
.sort(new float[] {0.4f, 0.15f, -0.5f});
For collections that use vectorize, you can run a similarity search based on a vector generated from a text query:
FindOptions.Builder
.sort("Text to vectorize");
Example:
package com.datastax.astra.client.collection;
import com.datastax.astra.client.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.client.model.FindOptions;
import com.datastax.astra.client.model.Sort;
import com.datastax.astra.client.model.Sorts;
import static com.datastax.astra.client.model.Filters.lt;
public class WorkingWithSorts {
public static void main(String[] args) {
// Given an existing collection
Collection<Document> collection = new DataAPIClient("TOKEN")
.getDatabase("API_ENDPOINT")
.getCollection("COLLECTION_NAME");
// Sort Clause for a vector
Sorts.vector(new float[] {0.25f, 0.25f, 0.25f,0.25f, 0.25f});
// Sort Clause for other fields
Sort s1 = Sorts.ascending("field1");
Sort s2 = Sorts.descending("field2");
// Build the sort clause
new FindOptions().sort(s1, s2);
// Adding vector
new FindOptions().sort(new float[] {0.25f, 0.25f, 0.25f,0.25f, 0.25f}, s1, s2);
}
}
|
When you run a find command, you can append nested JSON objects that define the search criteria (sort
or filter
), projection
, and other options
.
This example finds documents by performing a vector similarity search:
curl -sS --location -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_NAMESPACE/ASTRA_DB_COLLECTION" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"find": {
"sort": { "$vector": [0.15, 0.1, 0.1, 0.35, 0.55] },
"projection": { "$vector": 1 },
"options": {
"includeSimilarity": true,
"includeSortVector": false,
"limit": 100
}
}
}' | jq
This request does the following:
-
sort
compares the given vector, [0.15, 0.1, 0.1, 0.35, 0.55]
, against the vectors for documents in the collection, and then returns results ranked by similarity. The $vector
key is a reserved property name for storing vector data.
-
projection
requests that the response return the $vector
for each document.
-
options.includeSimilarity
requests that the response include the $similarity
key with the numeric similarity score, which represents the closeness of the sort
vector and the document’s vector.
-
options.includeSortVector
is set to false to exclude the sortVector
from the response. This is only relevant if sort
includes either $vector
or $vectorize
and you want the response to include the sort vector. This is particularly useful with $vectorize
because you don’t know the sort vector in advance.
-
options.limit
specifies the maximum number of documents to return. This example limits the entire list of matching documents to 100 documents or fewer.
Vector search returns a single page of up to 1000 documents, unless you set a lower
limit
. Other searches (without $vector
or $vectorize
) return matching documents in batches of 20. Pagination occurs if there are more than 20 matching documents. For information about handling pagination, see Find documents using filtering options.
The projection
and options
settings can make the response more focused and potentially reduce the amount of data transferred.
Response
{
"data": {
"documents": [
{
"$similarity": 1,
"$vector": [
0.15,
0.1,
0.1,
0.35,
0.55
],
"_id": "3"
},
{
"$similarity": 0.9953563,
"$vector": [
0.15,
0.17,
0.15,
0.43,
0.55
],
"_id": "18"
},
{
"$similarity": 0.9732053,
"$vector": [
0.21,
0.22,
0.33,
0.44,
0.53
],
"_id": "21"
}
],
"nextPageState": null
}
}
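The $similarity scores in this response can be checked by hand. As a sketch, assuming the collection uses the cosine metric, the sample values agree with the cosine similarity rescaled from the range -1 to 1 into the range 0 to 1, that is, (1 + cos) / 2. This relationship is inferred from the sample numbers above rather than stated on this page, and the rescaled_cosine function is illustrative only.

```python
import math
from typing import Sequence


def rescaled_cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity mapped from [-1, 1] to [0, 1] as (1 + cos) / 2."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return (1 + dot / (norm_a * norm_b)) / 2


query = [0.15, 0.1, 0.1, 0.35, 0.55]
doc_18 = [0.15, 0.17, 0.15, 0.43, 0.55]
doc_21 = [0.21, 0.22, 0.33, 0.44, 0.53]

# Compare with 0.9953563 and 0.9732053 in the response above.
print(round(rescaled_cosine(query, doc_18), 5))
print(round(rescaled_cosine(query, doc_21), 5))
```

A document whose vector is identical to the sort vector, such as the first result, scores exactly 1 under this formula.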
Example values for projection operations
Certain document operations, such as findOne
, findMany
, findOneAndUpdate
, findOneAndReplace
, and findOneAndDelete
, support a projection
option that specifies which part of a document to return.
Typically, the projection specifies which fields to include or exclude.
If no projection, or an empty projection, is specified, the Data API applies a default projection.
This default projection includes, at minimum, the document identifier (_id
) and all regular fields, which are fields not prefixed by a dollar sign ($
).
If you specify a projection, all special fields, such as _id
, $vector
, and $vectorize
, have specific inclusion and exclusion defaults that you can override individually.
However, for regular fields, the projection must either include or exclude those fields.
The projection can’t include a mix of included and excluded regular fields.
If a projection includes fields that don’t exist in a returned document, then those fields are ignored for that document.
To optimize the response size and improve performance, DataStax recommends always providing an explicit projection, tailored to the needs of the application, when reading documents. If an application relies on the presence of special fields, a quick, but possibly suboptimal, way to ensure their presence is to use the wildcard projection { "*": true }.
Projection syntax
A projection is expressed as a mapping of field names to boolean values.
Use a true
mapping to include only the specified fields.
For example, the following true
mapping returns the document ID, field1
, and field2
:
{ "_id": true, "field1": true, "field2": true }
Alternatively, use a false
mapping to exclude the specified fields.
All other non-excluded fields are returned.
{ "field1": false, "field2": false }
The values in a projection map can be objects, booleans, decimals, or integers, but the Data API ultimately evaluates all of these as booleans.
For example, the following projection evaluates to true
(include) for all four fields:
{ "field1": true, "field2": 1, "field3": 90.0, "field4": { "keep": "yes!" } }
Whereas this projection evaluates to false
(exclude) for all four fields:
{ "field1": false, "field2": 0, "field3": 0.0, "field4": {} }
Passing null-like types (such as {}
, null
or 0
) for the whole projection
mapping is equivalent to omitting projection
.
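The boolean evaluation described above happens to line up with Python truthiness for the value types shown: non-zero numbers and non-empty objects count as true, while zero and empty objects count as false. The following sketch illustrates the rule with a hypothetical as_boolean_projection function. It is an analogy for reading projections, not the Data API implementation.

```python
from typing import Any, Dict


def as_boolean_projection(projection: Dict[str, Any]) -> Dict[str, bool]:
    """Illustration only: coerce each projection value with Python truthiness."""
    return {field: bool(value) for field, value in projection.items()}


included = as_boolean_projection(
    {"field1": True, "field2": 1, "field3": 90.0, "field4": {"keep": "yes!"}}
)
excluded = as_boolean_projection(
    {"field1": False, "field2": 0, "field3": 0.0, "field4": {}}
)
print(included)
print(excluded)
```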
Projecting regular and special fields
For regular fields, a projection can’t mix include and exclude projections.
It can contain only true
or only false
values for regular fields.
For example, {"field1": true, "field2": false}
is an invalid projection that results in an API error.
However, the special fields _id
, $vector
, and $vectorize
have individual default inclusion and exclusion rules, regardless of the projection mapping.
Unlike regular fields, you can set the projection values for special fields independently of regular fields:
-
The
_id
field is included by default. You can opt to exclude it in a true
mapping, such as { "_id": false, "field1": true }
.
-
The
$vector
and $vectorize
fields are excluded by default. You can opt to include these in a false
mapping, such as { "field1": false, "$vector": true }
.
-
The
$similarity
key isn’t a document field, and you can’t use this key in a projection. The $similarity
value is the result of a vector ANN search operation with $vector
or $vectorize
. Use the includeSimilarity
parameter to control the presence of $similarity
in the response.
Therefore, the following are all valid projections for regular and special fields:
{ "_id": true, "field1": true, "field2": true }
{ "_id": false, "field1": true, "field2": true }
{ "_id": false, "field1": false, "field2": false }
{ "_id": true, "field1": false, "field2": false }
{ "_id": true, "field1": true, "field2": true, "$vector": true }
{ "_id": true, "field1": true, "field2": true, "$vector": false }
{ "_id": false, "field1": true, "field2": true, "$vector": true }
{ "_id": false, "field1": true, "field2": true, "$vector": false }
{ "_id": false, "field1": false, "field2": false, "$vector": true }
{ "_id": false, "field1": false, "field2": false, "$vector": false }
{ "_id": true, "field1": false, "field2": false, "$vector": true }
{ "_id": true, "field1": false, "field2": false, "$vector": false }
The wildcard projection "*"
represents the whole of the document.
If you use this projection, it must be the only key in the projection.
If set to true ({ "*": true }
), all fields are returned.
If set to false ({ "*": false }
), no fields are returned, and each document is empty ({}
).
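The rules above can be checked locally before sending a request. The following is a minimal sketch under stated assumptions: check_projection is a hypothetical helper, it covers only the two rules described here (no mixing of included and excluded regular fields, and the wildcard must stand alone), and the Data API remains the authority on what is valid.

```python
from typing import Any, Dict

SPECIAL_FIELDS = {"_id", "$vector", "$vectorize"}


def check_projection(projection: Dict[str, Any]) -> None:
    """Raise ValueError if the projection breaks the rules described above."""
    if "*" in projection and len(projection) > 1:
        raise ValueError('"*" must be the only key in the projection')
    # Only regular fields take part in the include/exclude consistency check.
    regular = {
        bool(value)
        for key, value in projection.items()
        if key not in SPECIAL_FIELDS and key != "*"
    }
    if len(regular) > 1:
        raise ValueError("regular fields can't mix inclusion and exclusion")


# Valid: special fields are set independently of the regular fields.
check_projection({"_id": False, "field1": True, "field2": True, "$vector": True})

try:
    check_projection({"field1": True, "field2": False})
except ValueError as error:
    print(error)
```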
Projecting arrays and nested objects
For array fields, you can use a $slice
to specify which elements of the array to return.
Use one of the following formats:
// Return the first two elements
{ "arr": { "$slice": 2 } }
// Return the last two elements
{ "arr": { "$slice": -2 } }
// Skip 4 elements (from 0th index), return the next 2
{ "arr": { "$slice": [4, 2] } }
// Skip backward 4 elements (from the end), return next 2 elements (forward)
{ "arr": { "$slice": [-4, 2] } }
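To make the four $slice forms concrete, the following sketch emulates them on a plain Python list. The apply_slice function is hypothetical and mirrors only the descriptions in the comments above. Edge cases, such as a skip larger than the array, might behave differently in the Data API.

```python
from typing import Any, List, Union


def apply_slice(arr: List[Any], spec: Union[int, List[int]]) -> List[Any]:
    """Emulate the four $slice forms listed above on a Python list."""
    if isinstance(spec, int):
        # Positive: first n elements. Negative: last |n| elements.
        return arr[:spec] if spec >= 0 else arr[spec:]
    skip, count = spec
    # A negative skip counts back from the end of the array.
    start = skip if skip >= 0 else max(len(arr) + skip, 0)
    return arr[start:start + count]


letters = ["a", "b", "c", "d", "e", "f", "g"]
print(apply_slice(letters, 2))        # ['a', 'b']
print(apply_slice(letters, -2))       # ['f', 'g']
print(apply_slice(letters, [4, 2]))   # ['e', 'f']
print(apply_slice(letters, [-4, 2]))  # ['d', 'e']
```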
If a projection refers to a nested field, the keys in the subdocument are included or excluded as requested. If you exclude all keys of an existing subdocument, then the document is returned with the subdocument present and an empty nested object.
Examples of nested document projections
Given the following document:
{
"_id": "z",
"a": {
"a1": 10,
"a2": 20
}
}
The results of various projections are as follows:
Projection | Result |
---|---|
|
|
|
|
|
|
|
|
|
|
Referencing overlapping paths or subpaths in a projection can create conflicting clauses and return an API error. For example, this projection is invalid:
// Invalid:
{ "a.a1": true, "a": true }
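One way to catch this kind of conflict before sending a request is to look for a projection key that is a dotted prefix of another key. The following sketch uses a hypothetical overlapping_paths helper. It detects only the parent and child overlap shown above and is not a complete validator.

```python
from typing import Iterable, List, Tuple


def overlapping_paths(fields: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (parent, child) pairs where one path is a dotted prefix of another."""
    names = sorted(fields)
    return [
        (parent, child)
        for parent in names
        for child in names
        if child.startswith(parent + ".")
    ]


print(overlapping_paths(["a.a1", "a"]))     # [('a', 'a.a1')]
print(overlapping_paths(["a.a1", "a.a2"]))  # []
```

Matching on the parent name plus a dot avoids false positives for sibling fields that merely share leading characters, such as a and ab.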
Projection examples by language
-
Python
-
TypeScript
-
Java
-
curl
For the Python client, the projection can be any of the following:
-
A dictionary (
Dict[str, Any]
) to include specific fields in the response, like{field_name: True}
. -
A dictionary (
Dict[str, Any]
) to exclude specific fields from the response, like{field_name: False}
. -
A list or other iterable over key names that are implied to be included in the projection.
For information about default projections and handling for special fields, see the preceding explanation of projection operations.
The following two projections are equivalent:
document = collection.find_one(
{"_id": 101},
projection={"name": True, "city": True},
)
document = collection.find_one(
{"_id": 101},
projection=["name", "city"],
)
The TypeScript client takes in an untyped Plain Old JavaScript Object (POJO) for the projection
parameter.
The client also offers a StrictProjection<Schema>
type that provides full autocomplete and type checking for your document schema.
When specifying a projection, make sure that you handle the return type carefully. Consider type-casting.
import { StrictProjection } from '@datastax/astra-db-ts';
const doc = await collection.findOne({}, {
projection: {
'name': true,
'address.city': true,
},
});
interface MySchema {
name: string,
address: {
city: string,
state: string,
},
}
const doc = await collection.findOne({}, {
projection: {
'name': 1,
'address.city': 1,
// @ts-expect-error - `'address.car'` does not exist in type `StrictProjection<MySchema>`
'address.car': 0,
// @ts-expect-error - Type `{ $slice: number }` is not assignable to type `boolean | 0 | 1 | undefined`
'address.state': { $slice: 3 }
} satisfies StrictProjection<MySchema>,
});
For information about default projections and handling for special fields, see the preceding explanation of projection operations.
To support the projection mechanism, the Java client has different Options
classes that provide the projection
method in the helpers.
This method takes an array of Projection
classes with the field name and a boolean flag indicating inclusion or exclusion.
For information about default projections and handling for special fields, see the preceding explanation of projection operations.
Projection p1 = new Projection("field1", true);
Projection p2 = new Projection("field2", true);
FindOptions options1 = FindOptions.Builder.projection(p1, p2);
To simplify this syntax, you can use the Projections
syntactic sugar:
FindOptions options2 = FindOptions.Builder
.projection(Projections.include("field1", "field2"));
FindOptions options3 = FindOptions.Builder
.projection(Projections.exclude("field1", "field2"));
The Projections
class also provides a method to support $slice
for array fields:
// {"arr": {"$slice": 2}}
Projection sliceOnlyStart = Projections.slice("arr", 2, null);
// {"arr": {"$slice": [-4, 2]}}
Projection sliceOnlyRange = Projections.slice("arr", -4, 2);
// And you can use them freely in the different builders
FindOptions options4 = FindOptions.Builder
.projection(sliceOnlyStart);
In a curl request, include projection
as a find
parameter:
curl -sS --location -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_NAMESPACE/ASTRA_DB_COLLECTION" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"find": {
"sort": { "$vector": [0.15, 0.1, 0.1, 0.35, 0.55] },
"projection": { "$vector": true, "name": true, "city": true },
"options": {
"includeSimilarity": true,
"includeSortVector": false,
"limit": 100
}
}
}' | jq
For information about default projections and handling for special fields, see the preceding explanation of projection operations.
Find and update a document
Find one document that matches a filter condition, apply changes to it, and then return the document itself.
This is effectively an expansion of the findOne
command with additional support for update operators and related options.
-
Python
-
TypeScript
-
Java
-
curl
For more information, see the API reference.
Find a document matching a filter condition, and then edit a property in that document:
collection.find_one_and_update(
{"Marco": {"$exists": True}},
{"$set": {"title": "Mr."}},
)
Locate and update a document, returning the document itself, and create a new one if no match is found:
collection.find_one_and_update(
{"Marco": {"$exists": True}},
{"$set": {"title": "Mr."}},
upsert=True,
)
Locate and update the document most similar to a query vector from either $vector
or $vectorize
:
collection.find_one_and_update(
{},
{"$set": {"best_match": True}},
sort={"$vector": [0.1, 0.2, 0.3]},
)
Returns:
Dict[str, Any]
- The document that was found, either before or after the update
(or a projection thereof, as requested). If no matches are found, None
is returned.
Example response
{'_id': 999, 'Marco': 'Polo'}
Parameters:
Name | Type | Summary |
---|---|---|
filter |
|
A predicate expressed as a dictionary according to the Data API filter syntax.
For example: |
update |
|
The update prescription to apply to the document, expressed as a dictionary as per Data API syntax.
For example: |
projection |
|
See Find a document and Example values for projection operations. |
sort |
|
|
upsert |
|
This parameter controls the behavior if there are no matches.
If true and there are no matches, then the operation inserts a new document by applying the |
return_document |
|
A flag controlling what document is returned.
If set to |
max_time_ms |
|
A timeout, in milliseconds, for the underlying HTTP request. This method uses the collection-level timeout by default. |
Example:
from astrapy import DataAPIClient
import astrapy
client = DataAPIClient("TOKEN")
database = client.get_database("API_ENDPOINT")
collection = database.my_collection
collection.insert_one({"Marco": "Polo"})
collection.find_one_and_update(
{"Marco": {"$exists": True}},
{"$set": {"title": "Mr."}},
)
# prints: {'_id': 'a80106f2-...', 'Marco': 'Polo'}
collection.find_one_and_update(
{"title": "Mr."},
{"$inc": {"rank": 3}},
projection={"title": True, "rank": True},
return_document=astrapy.constants.ReturnDocument.AFTER,
)
# prints: {'_id': 'a80106f2-...', 'title': 'Mr.', 'rank': 3}
collection.find_one_and_update(
{"name": "Johnny"},
{"$set": {"rank": 0}},
return_document=astrapy.constants.ReturnDocument.AFTER,
)
# (returns None for no matches)
collection.find_one_and_update(
{"name": "Johnny"},
{"$set": {"rank": 0}},
upsert=True,
return_document=astrapy.constants.ReturnDocument.AFTER,
)
# prints: {'_id': 'cb4ef2ab-...', 'name': 'Johnny', 'rank': 0}
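The return values in this example follow from how upsert and return_document interact. The following toy model, which runs without a database, reproduces that behavior on a list of dictionaries. It is a sketch under loud assumptions: the function is hypothetical, it supports only equality filters and a $set-style update, and it assumes that an upsert seeds the new document from the filter's equality fields and that the "before" document of an upsert is None. It is not the astrapy implementation.

```python
import copy
import uuid
from typing import Any, Dict, List, Optional


def find_one_and_update(
    docs: List[Dict[str, Any]],
    filter: Dict[str, Any],
    set_fields: Dict[str, Any],
    upsert: bool = False,
    return_after: bool = False,
) -> Optional[Dict[str, Any]]:
    """Toy model: equality-only filter and a $set-style update on a list of dicts."""
    for doc in docs:
        if all(doc.get(key) == value for key, value in filter.items()):
            before = copy.deepcopy(doc)
            doc.update(set_fields)
            return copy.deepcopy(doc) if return_after else before
    if not upsert:
        return None
    # No match: create a document from the filter's equality fields plus the update.
    new_doc = {"_id": str(uuid.uuid4()), **filter, **set_fields}
    docs.append(new_doc)
    return copy.deepcopy(new_doc) if return_after else None


store = [{"_id": 999, "Marco": "Polo"}]
print(find_one_and_update(store, {"Marco": "Polo"}, {"title": "Mr."}))
print(find_one_and_update(store, {"name": "Johnny"}, {"rank": 0}))
print(find_one_and_update(store, {"name": "Johnny"}, {"rank": 0}, upsert=True, return_after=True))
```

The first call returns the document as it was before the update, the second returns None because nothing matches, and the third returns the newly created document.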
For more information, see the API reference.
Find a document matching a filter condition, and then edit a property in that document:
const docBefore = await collection.findOneAndUpdate(
{ $and: [{ name: 'Jesse' }, { gender: 'M' }] },
{ $set: { title: 'Mr.' } },
);
Locate and update a document, returning the updated document, and create a new one if no match is found:
const docAfter = await collection.findOneAndUpdate(
{ $and: [{ name: 'Jesse' }, { gender: 'M' }] },
{ $set: { title: 'Mr.' } },
{ upsert: true, returnDocument: 'after' },
);
Locate and update the document most similar to a query vector from either $vector
or $vectorize
:
const docBefore = await collection.findOneAndUpdate(
{},
{ $set: { bestMatch: true } },
{ sort: { $vector: [0.1, 0.2, 0.3] } },
);
Parameters:
Name | Type | Summary |
---|---|---|
filter |
A filter to select the document to update. For a list of available operators, see Data API operators. For additional examples, see Find documents using filtering options. |
|
update |
The update to apply to the selected document. For a list of available operators, see Data API operators. |
|
options |
The options for this operation. |
Options (FindOneAndUpdateOptions
):
Name | Type | Summary |
---|---|---|
|
Specifies whether to return the original ( |
|
|
This parameter controls the behavior if there are no matches.
If true and there are no matches, then the operation inserts a new document by applying the |
|
See Find a document and Example values for projection operations. |
||
|
The maximum time in milliseconds that the client should wait for the operation to complete each underlying HTTP request. |
|
|
When true, returns |
Returns:
Promise<WithId<Schema> | null>
- The document before/after
the update, depending on the type of returnDocument
, or null
if no matches are found.
Example:
import { DataAPIClient } from '@datastax/astra-db-ts';
// Reference an untyped collection
const client = new DataAPIClient('TOKEN');
const db = client.db('ENDPOINT', { namespace: 'NAMESPACE' });
const collection = db.collection('COLLECTION');
(async function () {
// Insert a document
await collection.insertOne({ 'Marco': 'Polo' });
// Prints 'Mr.'
const updated1 = await collection.findOneAndUpdate(
{ 'Marco': 'Polo' },
{ $set: { title: 'Mr.' } },
{ returnDocument: 'after' },
);
console.log(updated1?.title);
// Prints { _id: ..., title: 'Mr.', rank: 3 }
const updated2 = await collection.findOneAndUpdate(
{ title: 'Mr.' },
{ $inc: { rank: 3 } },
{ projection: { title: 1, rank: 1 }, returnDocument: 'after' },
);
console.log(updated2);
// Prints null
const updated3 = await collection.findOneAndUpdate(
{ name: 'Johnny' },
{ $set: { rank: 0 } },
{ returnDocument: 'after' },
);
console.log(updated3);
// Prints { _id: ..., name: 'Johnny', rank: 0 }
const updated4 = await collection.findOneAndUpdate(
{ name: 'Johnny' },
{ $set: { rank: 0 } },
{ upsert: true, returnDocument: 'after' },
);
console.log(updated4);
})();
Operations on documents are performed at the Collection
level.
Collection is a generic class with the default type of Document
.
You can specify your own type, and the object is serialized by Jackson.
For more information, see the API reference.
Most methods have synchronous and asynchronous flavors, where the asynchronous version is suffixed by Async
and returns a CompletableFuture
:
// Synchronous
Optional<T> findOneAndUpdate(Filter filter, Update update);
// Asynchronous
CompletableFuture<Optional<T>> findOneAndUpdateAsync(Filter filter, Update update);
Returns:
Optional<T>
- Returns the document that matches the filter, or Optional.empty()
if no document is found.
Parameters:
Name | Type | Summary |
---|---|---|
|
Criteria list to filter the document.
The filter is a JSON object that can contain any valid Data API filter expression.
For a list of available operators, see Data API operators.
For examples and options, including |
|
|
The update prescription to apply to the document. For a list of available operators, see Data API operators. |
To build the different parts of the requests, a set of helper classes are provided
These are suffixed by an
|
Example:
package com.datastax.astra.client.collection;
import com.datastax.astra.client.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.client.model.Filter;
import com.datastax.astra.client.model.Filters;
import com.datastax.astra.client.model.Update;
import com.datastax.astra.client.model.Updates;
import java.util.Optional;
import static com.datastax.astra.client.model.Filters.lt;
public class FindOneAndUpdate {
public static void main(String[] args) {
// Given an existing collection
Collection<Document> collection = new DataAPIClient("TOKEN")
.getDatabase("API_ENDPOINT")
.getCollection("COLLECTION_NAME");
// Building a filter
Filter filter = Filters.and(
Filters.gt("field2", 10),
lt("field3", 20),
Filters.eq("field4", "value"));
// Building the update
Update update = Updates.set("field1", "value1")
.inc("field2", 1d)
.unset("field3");
Optional<Document> doc = collection.findOneAndUpdate(filter, update);
}
}
Find a document matching a filter condition, and then edit a property in that document.
This example uses the $currentDate
update operator to set a property to the current date:
curl -sS --location -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_NAMESPACE/ASTRA_DB_COLLECTION" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"findOneAndUpdate": {
"filter": { "_id": "doc1" },
"update": {
"$currentDate": {
"createdAt": true
}
}
}
}' | jq
More update operator examples
Unset a property:
"findOneAndUpdate": {
"filter": {
"_id": "12"
},
"update": { "$unset": { "amount": "" } },
"options": { "returnDocument": "after" }
}
Increment a value:
"findOneAndUpdate": {
"filter": {
"_id": "12"
},
"update": { "$inc": { "counter": 1 } },
"options": { "returnDocument": "after" }
}
Add an element to a specific position in an array:
"findOneAndUpdate": {
"filter": {
"_id": "12"
},
"update": { "$push": { "tags": { "$each": [ "new1", "new2" ], "$position": 0 } } },
"options": { "returnDocument": "after" }
}
Rename a field:
"findOneAndUpdate": {
"filter": {
"_id": "12"
},
"update": { "$rename": { "old_field": "new_field", "other_old_field": "other_new_field" } },
"options": { "returnDocument": "after" }
}
Locate and update a document, returning the updated document, and create a new one if no match is found:
"findOneAndUpdate": {
"filter": {
"_id": "14"
},
"update": { "$set": { "min_col": 2, "max_col": 99 } },
"options": { "returnDocument": "after", "upsert": true }
}
If an upsert
occurs, use the $setOnInsert
operator to set additional document properties only for the new document:
"findOneAndUpdate": {
"filter": {
"_id": "27"
},
"update": {
"$currentDate": {
"field": true
},
"$setOnInsert": {
"customer.name": "James B."
}
},
"options": {
"returnDocument": "after",
"upsert": true
}
}
Locate and update the document most similar to a query vector from either $vector
or $vectorize
:
"findOneAndUpdate": {
"sort": {
"$vector": [0.1, 0.2, 0.3]
},
"update": {
"$set": {
"status": "active"
}
},
"options": {
"returnDocument": "after"
}
}
Parameters:
Name | Type | Summary |
---|---|---|
|
command |
Data API command to find one document based on a query and then run an update operation on the document’s properties. |
|
object |
Search criteria to find the document to update. For a list of available operators, see Data API operators. For examples and parameters, see Find a document and Example values for sort operations. |
|
object |
The update prescription to apply to the document using Data API operators.
For example: |
|
object |
See Find a document and Example values for projection operations. |
|
boolean |
This parameter controls the behavior if there are no matches.
If true and there are no matches, then the operation inserts a new document by applying the |
|
string |
A flag controlling what document is returned.
If set to |
Returns:
A successful response contains a data
object and a status
object:
-
The
data
object contains a singledocument
object representing either the original or modified document, based on thereturnDocument
parameter."data": { "document": { "_id": "5", "purchase_type": "Online", "$vector": [0.25, 0.045, 0.38, 0.31, 0.67], "customer": "David C.", "amount": 94990 } }
-
The
status
object contains thematchedCount
andmodifiedCount
fields, which indicate the number of documents that matched the filter and the number of documents that were modified, respectively. If theupdate
operation didn’t change any parameters in the matching document, then themodifiedCount
is0
."status": { "matchedCount": 1, "modifiedCount": 0 }
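The two counters in the status object can be reduced to a short outcome description. The following is a minimal Python sketch, assuming the response has already been parsed into a dictionary with the shape shown above. The function name describe_update_status is hypothetical and is not part of any Data API client.

```python
from typing import Any, Dict


def describe_update_status(response: Dict[str, Any]) -> str:
    """Summarize a parsed findOneAndUpdate response (hypothetical helper)."""
    status = response.get("status", {})
    matched = status.get("matchedCount", 0)
    modified = status.get("modifiedCount", 0)
    if matched == 0:
        return "no document matched the filter"
    if modified == 0:
        # The filter matched, but the update left every property unchanged.
        return "a document matched, but the update changed nothing"
    return "a document matched and was modified"


# Example using the status object shown above.
example = {"status": {"matchedCount": 1, "modifiedCount": 0}}
print(describe_update_status(example))
```

The same check applies to updateOne responses, which contain only the status object.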
Update a document
updateOne
is similar to findOneAndUpdate
, except that the response includes only the result of the operation.
The response doesn’t include a document
object, and the request doesn’t support response-related parameters, such as projection
or returnDocument
.
Sort and filter operations can use only indexed fields. If you apply selective indexing when you create a collection, you can’t reference non-indexed fields in sort or filter queries. |
-
Python
-
TypeScript
-
Java
-
curl
For more information, see the API reference.
Find a document matching a filter condition, and then edit a property in that document:
update_result = collection.update_one(
{"_id": 456},
{"$set": {"name": "John Smith"}},
)
Locate and update a document or insert a new one if no match is found:
update_result = collection.update_one(
{"_id": 456},
{"$set": {"name": "John Smith"}},
upsert=True,
)
Locate and update the document most similar to a query vector from either $vector
or $vectorize
:
update_result = collection.update_one(
{},
{"$set": {"best_match": True}},
sort={"$vector": [0.1, 0.2, 0.3]},
)
Returns:
UpdateResult
- An object representing the response from the database after the update operation. It includes information about the operation.
Example response
UpdateResult(update_info={'n': 1, 'updatedExisting': True, 'ok': 1.0, 'nModified': 1}, raw_results=...)
Parameters:
Name | Type | Summary |
---|---|---|
filter |
|
A predicate expressed as a dictionary according to the Data API filter syntax.
For example: |
update |
|
The update prescription to apply to the document, expressed as a dictionary as per Data API syntax.
For example: |
sort |
|
|
upsert |
|
This parameter controls the behavior if there are no matches.
If true and there are no matches, then the operation inserts a new document by applying the |
max_time_ms |
|
A timeout, in milliseconds, for the underlying HTTP request. This method uses the collection-level timeout by default. |
Example:
from astrapy import DataAPIClient
client = DataAPIClient("TOKEN")
database = client.get_database("API_ENDPOINT")
collection = database.my_collection
collection.insert_one({"Marco": "Polo"})
collection.update_one({"Marco": {"$exists": True}}, {"$inc": {"rank": 3}})
# prints: UpdateResult(update_info={'n': 1, 'updatedExisting': True, 'ok': 1.0, 'nModified': 1}, raw_results=...)
collection.update_one({"Mirko": {"$exists": True}}, {"$inc": {"rank": 3}})
# prints: UpdateResult(update_info={'n': 0, 'updatedExisting': False, 'ok': 1.0, 'nModified': 0}, raw_results=...)
collection.update_one(
{"Mirko": {"$exists": True}},
{"$inc": {"rank": 3}},
upsert=True,
)
# prints: UpdateResult(update_info={'n': 1, 'updatedExisting': False, 'ok': 1.0, 'nModified': 0, 'upserted': '2a45ff60-...'}, raw_results=...)
For more information, see the API reference.
Find a document matching a filter condition, and then edit a property in that document:
const result = await collection.updateOne(
{ $and: [{ name: 'Jesse' }, { gender: 'M' }] },
{ $set: { title: 'Mr.' } },
);
Locate and update a document or insert a new one if no match is found:
const result = await collection.updateOne(
{ $and: [{ name: 'Jesse' }, { gender: 'M' }] },
{ $set: { title: 'Mr.' } },
{ upsert: true },
);
Locate and update the document most similar to a query vector from either $vector
or $vectorize
:
const result = await collection.updateOne(
{},
{ $set: { bestMatch: true } },
{ sort: { $vector: [0.1, 0.2, 0.3] } },
);
Parameters:
Name | Type | Summary |
---|---|---|
filter |
A filter to select the document to update. For a list of available operators, see Data API operators. For additional examples, see Find a document. |
|
update |
The update to apply to the selected document. For examples and a list of available operators, see Find and update a document and Data API operators. |
|
options? |
The options for this operation. |
Options (UpdateOneOptions
):
Name | Type | Summary |
---|---|---|
|
This parameter controls the behavior if there are no matches.
If true and there are no matches, then the operation inserts a new document by applying the |
|
|
The maximum time, in milliseconds, that the client should wait for each underlying HTTP request to complete. |
Returns:
Promise<UpdateOneResult<Schema>>
- The result of the
update operation.
Example:
import { DataAPIClient } from '@datastax/astra-db-ts';
// Reference an untyped collection
const client = new DataAPIClient('TOKEN');
const db = client.db('ENDPOINT', { namespace: 'NAMESPACE' });
const collection = db.collection('COLLECTION');
(async function () {
// Insert a document
await collection.insertOne({ 'Marco': 'Polo' });
// Prints 1
const updated1 = await collection.updateOne(
{ 'Marco': 'Polo' },
{ $set: { title: 'Mr.' } },
);
console.log(updated1?.modifiedCount);
// Prints 0 0
const updated2 = await collection.updateOne(
{ name: 'Johnny' },
{ $set: { rank: 0 } },
);
console.log(updated2.matchedCount, updated2?.upsertedCount);
// Prints 0 1
const updated3 = await collection.updateOne(
{ name: 'Johnny' },
{ $set: { rank: 0 } },
{ upsert: true },
);
console.log(updated3.matchedCount, updated3?.upsertedCount);
})();
Operations on documents are performed at the Collection
level.
Collection is a generic class with the default type of Document
.
You can specify your own type, and the object is serialized by Jackson.
For more information, see the API reference.
Most methods have synchronous and asynchronous flavors, where the asynchronous version is suffixed by Async
and returns a CompletableFuture
:
// Synchronous
UpdateResult updateOne(Filter filter, Update update);
// Asynchronous
CompletableFuture<UpdateResult> updateOneAsync(Filter filter, Update update);
Returns:
UpdateResult
- Result of the operation with the number of documents matched (matchedCount
) and updated (modifiedCount
).
Parameters:
Name | Type | Summary |
---|---|---|
|
Criteria list to filter the document. The filter is a JSON object that can contain any valid Data API filter expression. |
|
|
The update prescription to apply to the selected document. For examples and a list of available operators, see Find and update a document and Data API operators. |
Example:
package com.datastax.astra.client.collection;
import com.datastax.astra.client.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.client.model.Filter;
import com.datastax.astra.client.model.Filters;
import com.datastax.astra.client.model.Update;
import com.datastax.astra.client.model.UpdateResult;
import com.datastax.astra.client.model.Updates;
import java.util.Optional;
import static com.datastax.astra.client.model.Filters.lt;
public class UpdateOne {
public static void main(String[] args) {
// Given an existing collection
Collection<Document> collection = new DataAPIClient("TOKEN")
.getDatabase("API_ENDPOINT")
.getCollection("COLLECTION_NAME");
// Building a filter
Filter filter = Filters.and(
Filters.gt("field2", 10),
lt("field3", 20),
Filters.eq("field4", "value"));
// Building the update
Update update = Updates.set("field1", "value1")
.inc("field2", 1d)
.unset("field3");
UpdateResult result = collection.updateOne(filter, update);
}
}
Find a document matching a filter condition, and then edit a property in that document:
curl -sS --location -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_NAMESPACE/ASTRA_DB_COLLECTION" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"updateOne": {
"filter": {
"_id": "14"
},
"update": { "$set": { "name": "Xiala" } }
}
}' | jq
Locate and update a document or insert a new one if no match is found:
"updateOne": {
"filter": {
"_id": "16"
},
"update": { "$set": { "name": "Serapio" } },
"options": { "upsert": true }
}
If an upsert
occurs, use the $setOnInsert
operator to assign additional properties to the new document:
"updateOne": {
"filter": {
"_id": "16"
},
"update": {
"$currentDate": {
"field": true
},
"$setOnInsert": {
"customer.name": "James B."
}
},
"options": {
"upsert": true
}
}
Locate and update the document most similar to a query vector from either $vector
or $vectorize
:
"updateOne": {
"sort": {
"$vector": [0.1, 0.2, 0.3]
},
"update": {
"$set": {
"status": "active"
}
}
}
Parameters:
Name | Type | Summary |
---|---|---|
|
command |
The Data API command to update a single document matching a query. |
|
object |
Used to select the document to be updated. For a list of available operators, see Data API operators. For examples and parameters, see Find a document and Example values for sort operations. |
|
object |
The update prescription to apply to the document.
For example: |
|
boolean |
This parameter controls the behavior if there are no matches.
If true and there are no matches, then the operation inserts a new document by applying the |
Returns:
The updateOne
command returns only the outcome of the operation, including the number of documents that matched the filter (matchedCount
) and the number of documents that were modified (modifiedCount
):
{
"status": {
"matchedCount": 1,
"modifiedCount": 1
}
}
Example
The following example uses the $set
update operator to set the value of a property (which uses the dot notation zodiac.animal
) to a new value.
In this example, zodiac
can be a nested document or a property within the main document, and animal
is a property within zodiac
.
The operation intends to update the nested animal
field to lion
.
curl -sS --location -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_NAMESPACE/ASTRA_DB_COLLECTION" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"updateOne": {
"filter": {
"_id": "18"
},
"update": { "$set": { "zodiac.animal": "lion" } }
}
}' | jq
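To make the effect of a dotted path concrete, the following Python sketch applies a $set-style change to a local dictionary. It is an illustration only, not the server's implementation: the function name apply_set and the sample document values are hypothetical, and the sketch assumes that missing intermediate objects are created along the path.

```python
import copy
from typing import Any, Dict


def apply_set(document: Dict[str, Any], changes: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `document` with each dotted path in `changes` set.

    Local illustration (hypothetical helper): a path such as
    "zodiac.animal" addresses the `animal` field inside `zodiac`.
    """
    result = copy.deepcopy(document)
    for path, value in changes.items():
        *parents, leaf = path.split(".")
        target = result
        for key in parents:
            # Descend into the nested object, creating it if it is absent.
            target = target.setdefault(key, {})
        target[leaf] = value
    return result


# Hypothetical document; only the dotted path comes from the example above.
before = {"_id": "18", "zodiac": {"animal": "tiger", "element": "wood"}}
after = apply_set(before, {"zodiac.animal": "lion"})
print(after)
```

Only the animal field changes. Sibling fields inside zodiac and the original dictionary are left untouched.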
Update multiple documents
Use updateMany
to find and update multiple documents at once.
This command is a combination of find
and updateOne
.
However, updateMany
doesn’t support sort
operations.
Like updateOne
, the updateMany
response includes only the result of the operation.
The response doesn’t include a document
object, and the request doesn’t support response-related parameters, such as projection
or returnDocument
.
Sort and filter operations can use only indexed fields. If you apply selective indexing when you create a collection, you can’t reference non-indexed fields in sort or filter queries. |
-
Python
-
TypeScript
-
Java
-
curl
For more information, see the API reference.
Find documents matching a filter condition, and then edit a property in those documents:
results = collection.update_many(
{"name": {"$exists": False}},
{"$set": {"name": "unknown"}},
)
Locate and update multiple documents or insert a new one if no match is found:
results = collection.update_many(
{"name": {"$exists": False}},
{"$set": {"name": "unknown"}},
upsert=True,
)
For more examples, see Update a document.
Returns:
UpdateResult
- An object representing the response from the database after the update operation. It includes information about the operation.
Example response
UpdateResult(update_info={'n': 2, 'updatedExisting': True, 'ok': 1.0, 'nModified': 2}, raw_results=...)
Parameters:
Name | Type | Summary |
---|---|---|
filter |
|
A predicate expressed as a dictionary according to the Data API filter syntax.
For example: |
update |
|
The update prescription to apply to the documents, expressed as a dictionary as per Data API syntax.
For example: |
upsert |
|
This parameter controls the behavior if there are no matches.
If true and there are no matches, then the operation inserts one new document by applying the |
max_time_ms |
|
A timeout, in milliseconds, for the operation. This method uses the collection-level timeout by default. You may need to increase the timeout duration when updating a large number of documents because the update requires multiple sequential HTTP requests. |
Example:
from astrapy import DataAPIClient
client = DataAPIClient("TOKEN")
database = client.get_database("API_ENDPOINT")
collection = database.my_collection
collection.insert_many([{"c": "red"}, {"c": "green"}, {"c": "blue"}])
collection.update_many({"c": {"$ne": "green"}}, {"$set": {"nongreen": True}})
# prints: UpdateResult(update_info={'n': 2, 'updatedExisting': True, 'ok': 1.0, 'nModified': 2}, raw_results=...)
collection.update_many({"c": "orange"}, {"$set": {"is_also_fruit": True}})
# prints: UpdateResult(update_info={'n': 0, 'updatedExisting': False, 'ok': 1.0, 'nModified': 0}, raw_results=...)
collection.update_many(
{"c": "orange"},
{"$set": {"is_also_fruit": True}},
upsert=True,
)
# prints: UpdateResult(update_info={'n': 1, 'updatedExisting': False, 'ok': 1.0, 'nModified': 0, 'upserted': '46643050-...'}, raw_results=...)
For more information, see the API reference.
Find documents matching a filter condition, and then edit a property in those documents:
const result = await collection.updateMany(
{ name: { $exists: false } },
{ $set: { title: 'unknown' } },
);
Locate and update multiple documents in a collection or insert a new one if no matches are found:
const result = await collection.updateMany(
{ name: { $exists: false } },
{ $set: { title: 'unknown' } },
{ upsert: true },
);
For more examples, see Update a document.
Parameters:
Name | Type | Summary |
---|---|---|
filter |
A filter to select the documents to update. For a list of available operators, see Data API operators. For additional examples, see Find documents using filtering options. |
|
update |
The update to apply to the selected documents. For examples and a list of available operators, see Find and update a document and Data API operators. |
|
options? |
The options for this operation. |
Options (UpdateManyOptions
):
Name | Type | Summary |
---|---|---|
|
This parameter controls the behavior if there are no matches.
If true and there are no matches, then the operation inserts one new document by applying the |
|
|
The maximum time, in milliseconds, that the client should wait for each underlying HTTP request to complete. |
Returns:
Promise<UpdateManyResult<Schema>>
- The result of the
update operation.
Example:
import { DataAPIClient } from '@datastax/astra-db-ts';
// Reference an untyped collection
const client = new DataAPIClient('TOKEN');
const db = client.db('ENDPOINT', { namespace: 'NAMESPACE' });
const collection = db.collection('COLLECTION');
(async function () {
// Insert some documents
await collection.insertMany([{ c: 'red' }, { c: 'green' }, { c: 'blue' }]);
// { modifiedCount: 2, matchedCount: 2, upsertedCount: 0 }
await collection.updateMany({ c: { $ne: 'green' } }, { $set: { nongreen: true } });
// { modifiedCount: 0, matchedCount: 0, upsertedCount: 0 }
await collection.updateMany({ c: 'orange' }, { $set: { is_also_fruit: true } });
// { modifiedCount: 0, matchedCount: 0, upsertedCount: 1, upsertedId: '...' }
await collection.updateMany({ c: 'orange' }, { $set: { is_also_fruit: true } }, { upsert: true });
})();
Operations on documents are performed at the Collection
level.
For more information, see the API reference.
Collection is a generic class with the default type of Document
.
You can specify your own type, and the object is serialized by Jackson.
Most methods have synchronous and asynchronous flavors, where the asynchronous version is suffixed by Async
and returns a CompletableFuture
.
// Synchronous
UpdateResult updateMany(Filter filter, Update update);
UpdateResult updateMany(Filter filter, Update update, UpdateManyOptions options);
// Asynchronous
CompletableFuture<UpdateResult> updateManyAsync(Filter filter, Update update);
CompletableFuture<UpdateResult> updateManyAsync(Filter filter, Update update, UpdateManyOptions options);
Returns:
UpdateResult
- Result of the operation with the number of documents matched (matchedCount
) and updated (modifiedCount
).
Parameters:
Name | Type | Summary |
---|---|---|
|
Filters to select documents. This object can contain any valid Data API filter expression. For a list of available operators, see Data API operators. For additional examples, see Find documents using filtering options. |
|
|
The update prescription to apply to the documents. For examples and a list of available operators, see Find and update a document and Data API operators. |
|
|
Contains the options for |
Example:
package com.datastax.astra.client.collection;
import com.datastax.astra.client.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.client.model.Filter;
import com.datastax.astra.client.model.Filters;
import com.datastax.astra.client.model.Update;
import com.datastax.astra.client.model.UpdateManyOptions;
import com.datastax.astra.client.model.UpdateResult;
import com.datastax.astra.client.model.Updates;
import static com.datastax.astra.client.model.Filters.lt;
public class UpdateMany {
public static void main(String[] args) {
// Given an existing collection
Collection<Document> collection = new DataAPIClient("TOKEN")
.getDatabase("API_ENDPOINT")
.getCollection("COLLECTION_NAME");
// Building a filter
Filter filter = Filters.and(
Filters.gt("field2", 10),
lt("field3", 20),
Filters.eq("field4", "value"));
Update update = Updates.set("field1", "value1")
.inc("field2", 1d)
.unset("field3");
UpdateManyOptions options =
new UpdateManyOptions().upsert(true);
UpdateResult result = collection.updateMany(filter, update, options);
}
}
Find documents matching a filter condition, and then edit a property in those documents:
curl -sS --location -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_NAMESPACE/ASTRA_DB_COLLECTION" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"updateMany": {
"filter": { "status": "active" },
"update": { "$set": { "status": "inactive" } }
}
}' | jq
For more examples, see Update a document.
Parameters:
Name | Type | Summary |
---|---|---|
|
command |
The Data API command to update multiple documents in a collection in a database. |
|
object |
Defines the criteria for selecting documents to update.
For example: |
|
object |
The update prescription to apply to the documents.
For example: |
|
boolean |
This parameter controls the behavior if there are no matches.
If true and there are no matches, then the operation inserts one new document by applying the |
Returns:
The updateMany
command returns the outcome of the operation, including the number of documents that matched the filter (matchedCount
) and the number of documents that were modified (modifiedCount
).
Pagination occurs if there are more than 20 matching documents.
In this case, the matchedCount and modifiedCount
values are capped at 20, and the moreData
flag is set to true
.
{
"status": {
"matchedCount": 20,
"modifiedCount": 20,
"moreData": true,
"nextPageState": "NEXT_PAGE_STATE_ID"
}
}
In the event of pagination, you must issue a subsequent request with a pageState
ID to update the next page of documents that matched the filter.
As long as there is a subsequent page with matching documents to update, the response includes a nextPageState
ID, which you use as the pageState
for the subsequent request.
Each paginated request is exactly the same as the original request, except for the addition of the pageState
in the options
object:
{
"updateMany": {
"filter": { "active_user": true },
"update": { "$set": { "new_data": "new_data_value" } },
"options": { "pageState": "NEXT_PAGE_STATE_ID" }
}
}
Continue issuing requests with the subsequent pageState
ID until all matching documents have been updated.
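The pagination loop described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: post is a caller-supplied function (not part of any client library) that sends one JSON payload to the collection endpoint and returns the parsed JSON response, and the names update_many_all_pages and max_requests are hypothetical.

```python
from typing import Any, Callable, Dict

Payload = Dict[str, Any]


def update_many_all_pages(
    post: Callable[[Payload], Payload],
    doc_filter: Payload,
    update: Payload,
    max_requests: int = 1000,
) -> Dict[str, int]:
    """Repeat an updateMany request until no further page is reported.

    `post` is an assumption of this sketch: it sends one payload to the
    collection endpoint and returns the parsed JSON response.
    """
    totals = {"matchedCount": 0, "modifiedCount": 0}
    page_state = None
    for _ in range(max_requests):
        command: Payload = {"filter": doc_filter, "update": update}
        if page_state is not None:
            # Same request as before, plus the page state from the last response.
            command["options"] = {"pageState": page_state}
        status = post({"updateMany": command}).get("status", {})
        totals["matchedCount"] += status.get("matchedCount", 0)
        totals["modifiedCount"] += status.get("modifiedCount", 0)
        page_state = status.get("nextPageState")
        if not status.get("moreData") or page_state is None:
            return totals
    raise RuntimeError("stopped after max_requests requests")
```

The max_requests guard is a design choice of the sketch so that a misbehaving post function cannot loop forever. The returned totals are the sums of the per-page counters.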
Find distinct values across documents
Get a list of the distinct values of a certain key in a collection.
Sort and filter operations can use only indexed fields. If you apply selective indexing when you create a collection, you can’t reference non-indexed fields in sort or filter queries. |
-
Python
-
TypeScript
-
Java
-
curl
For more information, see the API reference.
collection.distinct("category")
Get the distinct values in a subset of documents, with a key defined by a dot-syntax path.
collection.distinct(
"food.allergies",
filter={"registered_for_dinner": True},
)
Returns:
List[Any]
- A list of the distinct values encountered. Documents that lack the requested key are ignored.
Example response
['home_appliance', None, 'sports_equipment', {'cat_id': 54, 'cat_name': 'gardening_gear'}]
Parameters:
Name | Type | Summary |
---|---|---|
key |
|
The name of the field whose value is inspected across documents. Keys can use dot-notation to descend to deeper document levels. Example of acceptable |
filter |
|
A predicate expressed as a dictionary according to the Data API filter syntax. Examples are |
max_time_ms |
|
A timeout, in milliseconds, for the operation. This method uses the collection-level timeout by default. |
For details on the behavior of "distinct" in conjunction with real-time changes in the collection contents, see the discussion in the Example values for sort operations section.
Example:
from astrapy import DataAPIClient
client = DataAPIClient("TOKEN")
database = client.get_database("API_ENDPOINT")
collection = database.my_collection
collection.insert_many(
[
{"name": "Marco", "food": ["apple", "orange"], "city": "Helsinki"},
{"name": "Emma", "food": {"likes_fruit": True, "allergies": []}},
]
)
collection.distinct("name")
# prints: ['Marco', 'Emma']
collection.distinct("city")
# prints: ['Helsinki']
collection.distinct("food")
# prints: ['apple', 'orange', {'likes_fruit': True, 'allergies': []}]
collection.distinct("food.1")
# prints: ['orange']
collection.distinct("food.allergies")
# prints: []
collection.distinct("food.likes_fruit")
# prints: [True]
For more information, see the API reference.
const unique = await collection.distinct('category');
Get the distinct values in a subset of documents, with a key defined by a dot-syntax path.
const unique = await collection.distinct(
'food.allergies',
{ registeredForDinner: true },
);
Parameters:
Name | Type | Summary |
---|---|---|
key |
|
The name of the field whose value is inspected across documents. Keys can use dot-notation to
descend to deeper document levels. Example of acceptable key values: |
filter? |
A filter to select the documents to use. If not provided, all documents will be used. |
Returns:
Promise<Flatten<(SomeDoc & ToDotNotation<FoundDoc<Schema>>)[Key]>[]>
- A promise that resolves to the distinct values.
The inferred return type is usually accurate, but with complex keys, you might need to manually cast the result to the expected type.
Example:
import { DataAPIClient } from '@datastax/astra-db-ts';
// Reference an untyped collection
const client = new DataAPIClient('TOKEN');
const db = client.db('ENDPOINT', { namespace: 'NAMESPACE' });
const collection = db.collection('COLLECTION');
(async function () {
// Insert some documents
await collection.insertOne({ name: 'Marco', food: ['apple', 'orange'], city: 'Helsinki' });
await collection.insertOne({ name: 'Emma', food: { likes_fruit: true, allergies: [] } });
// ['Marco', 'Emma']
await collection.distinct('name')
// ['Helsinki']
await collection.distinct('city')
// ['apple', 'orange', { likes_fruit: true, allergies: [] }]
await collection.distinct('food')
// ['orange']
await collection.distinct('food.1')
// []
await collection.distinct('food.allergies')
// [true]
await collection.distinct('food.likes_fruit')
})();
Gets the distinct values of the specified field name.
// Synchronous
DistinctIterable<T,F> distinct(String fieldName, Filter filter, Class<F> resultClass);
DistinctIterable<T,F> distinct(String fieldName, Class<F> resultClass);
// Asynchronous
CompletableFuture<DistinctIterable<T,F>> distinctAsync(String fieldName, Filter filter, Class<F> resultClass);
CompletableFuture<DistinctIterable<T,F>> distinctAsync(String fieldName, Class<F> resultClass);
Returns:
DistinctIterable<T,F>
- List of distinct values of the specified field name.
Parameters:
Name | Type | Summary |
---|---|---|
|
|
The name of the field whose distinct values are returned. |
|
Criteria list to filter the document. The filter is a JSON object that can contain any valid Data API filter expression. |
|
|
|
The type of the values stored in the field.
Example:
package com.datastax.astra.client.collection;
import com.datastax.astra.client.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.model.DistinctIterable;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.client.model.Filter;
import com.datastax.astra.client.model.Filters;
import static com.datastax.astra.client.model.Filters.lt;
public class Distinct {
public static void main(String[] args) {
// Given an existing collection
Collection<Document> collection = new DataAPIClient("TOKEN")
.getDatabase("API_ENDPOINT")
.getCollection("COLLECTION_NAME");
// Building a filter
Filter filter = Filters.and(
Filters.gt("field2", 10),
lt("field3", 20),
Filters.eq("field4", "value"));
// Execute a distinct operation
DistinctIterable<Document, String> result = collection
.distinct("field", String.class);
DistinctIterable<Document, String> result2 = collection
.distinct("field", filter, String.class);
// Iterate over the result
for (String fieldValue : result) {
System.out.println(fieldValue);
}
}
}
This operation has no literal equivalent in HTTP.
Instead, you can use Find documents using filtering options, and then use jq
or another utility to extract _id
or other desired values from the response.
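As an alternative to jq, the extraction step can be done in a few lines of Python once the documents from the find response are available as dictionaries. The sketch below is a simplified approximation, not the clients' implementation: it follows a dotted key through nested objects and numeric list indexes, unrolls a list found at the end of the path, and keeps the first occurrence of each value. The function names are hypothetical.

```python
from typing import Any, Dict, Iterable, Iterator, List


def _walk(value: Any, parts: List[str]) -> Iterator[Any]:
    """Yield the values reached by following `parts` from `value`."""
    if not parts:
        # A list at the end of the path contributes each of its items.
        if isinstance(value, list):
            yield from value
        else:
            yield value
        return
    head, rest = parts[0], parts[1:]
    if isinstance(value, dict) and head in value:
        yield from _walk(value[head], rest)
    elif isinstance(value, list) and head.isdigit() and int(head) < len(value):
        yield from _walk(value[int(head)], rest)


def distinct_values(documents: Iterable[Dict[str, Any]], key: str) -> List[Any]:
    """Collect distinct values for `key`, skipping documents that lack it."""
    found: List[Any] = []
    for document in documents:
        for value in _walk(document, key.split(".")):
            if value not in found:  # works for unhashable values such as dicts
                found.append(value)
    return found


# Documents taken from the client examples in this section.
documents = [
    {"name": "Marco", "food": ["apple", "orange"], "city": "Helsinki"},
    {"name": "Emma", "food": {"likes_fruit": True, "allergies": []}},
]
print(distinct_values(documents, "food"))
```

Because the find command returns results in pages, collect the documents from every page before you compute the distinct values.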
Count documents in a collection
Get the count of documents in a collection. Count all documents or apply filtering to count a subset of documents.
Sort and filter operations can use only indexed fields. If you apply selective indexing when you create a collection, you can’t reference non-indexed fields in sort or filter queries. |
-
Python
-
TypeScript
-
Java
-
curl
For more information, see the API reference.
Count all documents in a collection up to the specified limit:
collection.count_documents({}, upper_bound=500)
Get the count of the documents in a collection matching a filter condition up to the specified limit:
collection.count_documents({"seq":{"$gt": 15}}, upper_bound=50)
Returns:
int
- The exact count of the documents counted as requested, unless it exceeds the caller-provided or API-set upper bound. In case of overflow, an exception is raised.
Example response
320
This operation is suited to use cases where the number of documents to count is moderate.
Exact counting of an arbitrary number of documents is a slow, expensive operation that is not supported by the Data API.
If the count total exceeds the server-side threshold, an exception is raised.
If you need to count large numbers of documents, consider using |
Parameters:
Name | Type | Summary |
---|---|---|
filter |
|
A predicate expressed as a dictionary according to the Data API filter syntax.
For example: |
upper_bound |
|
A required ceiling on the result of the count operation.
If the actual number of documents exceeds this value, an exception is raised.
An exception is also raised if the actual number of documents exceeds the maximum count that the Data API can reach, regardless of |
max_time_ms |
|
A timeout, in milliseconds, for the underlying HTTP request. This method uses the collection-level timeout by default. |
Example:
from astrapy import DataAPIClient
client = DataAPIClient("TOKEN")
database = client.get_database("API_ENDPOINT")
collection = database.my_collection
collection.insert_many([{"seq": i} for i in range(20)])
collection.count_documents({}, upper_bound=100)
# prints: 20
collection.count_documents({"seq":{"$gt": 15}}, upper_bound=100)
# prints: 4
collection.count_documents({}, upper_bound=10)
# Raises: astrapy.exceptions.TooManyDocumentsToCountException
For more information, see the API reference.
Count all documents in a collection up to the specified limit:
const numDocs = await collection.countDocuments({}, 500);
Get the count of the documents in a collection matching a filter condition up to the specified limit:
const numDocs = await collection.countDocuments({ seq: { $gt: 15 } }, 50);
Parameters:
Name | Type | Summary |
---|---|---|
filter |
A filter to select the documents to count. If not provided, all documents are counted. For a list of available operators, see Data API operators. For additional examples, see Find documents using filtering options. |
|
upperBound |
|
A required ceiling on the result of the count operation.
If the actual number of documents exceeds this value, an exception is raised.
An exception is also raised if the actual number of documents exceeds the maximum count that the Data API can reach, regardless of |
options? |
The options (the timeout) for this operation. |
Returns:
Promise<number>
- A promise that resolves to the exact count of the documents counted as requested, unless it exceeds
the caller-provided or API-set upper bound, in which case an exception is raised.
This operation is suited to use cases where the number of documents to count is moderate.
Exact counting of an arbitrary number of documents is a slow, expensive operation that is not supported by the Data API.
If the count total exceeds the server-side threshold, an exception is raised.
If you need to count large numbers of documents, consider using |
Example:
import { DataAPIClient } from '@datastax/astra-db-ts';
// Reference an untyped collection
const client = new DataAPIClient('TOKEN');
const db = client.db('ENDPOINT', { namespace: 'NAMESPACE' });
const collection = db.collection('COLLECTION');
(async function () {
// Insert some documents
await collection.insertMany(Array.from({ length: 20 }, (_, i) => ({ seq: i })));
// Prints 20
await collection.countDocuments({}, 100);
// Prints 4
await collection.countDocuments({ seq: { $gt: 15 } }, 100);
// Throws TooManyDocumentsToCountError
await collection.countDocuments({}, 10);
})();
Count all documents or get the count of the documents in a collection matching a condition:
// Synchronous
int countDocuments(int upperBound)
throws TooManyDocumentsToCountException;
int countDocuments(Filter filter, int upperBound)
throws TooManyDocumentsToCountException;
Parameters:
Name | Type | Summary |
---|---|---|
filter (optional) |
|
A filter to select documents to count.
For example: |
upperBound |
|
A required ceiling on the result of the count operation.
If the actual number of documents exceeds this value, an exception is raised.
An exception is also raised if the actual number of documents exceeds the maximum count that the Data API can reach, regardless of |
Returns:
int
- The exact count of the documents counted as requested, unless it exceeds the caller-provided or API-set upper bound. In case of overflow, an exception is raised.
The checked exception |
Example:
package com.datastax.astra.client.collection;
import com.datastax.astra.client.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.exception.TooManyDocumentsToCountException;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.client.model.Filter;
import com.datastax.astra.client.model.Filters;
import static com.datastax.astra.client.model.Filters.lt;
public class CountDocuments {
public static void main(String[] args) {
Collection<Document> collection = new DataAPIClient("TOKEN")
.getDatabase("API_ENDPOINT")
.getCollection("COLLECTION_NAME");
// Building a filter
Filter filter = Filters.and(
Filters.gt("field2", 10),
lt("field3", 20),
Filters.eq("field4", "value"));
try {
// Count with no filter
collection.countDocuments(500);
// Count with a filter
collection.countDocuments(filter, 500);
} catch(TooManyDocumentsToCountException tmde) {
// Explicit error if the count is above the upper limit or above the 1000 limit
}
}
}
Use the Data API countDocuments
command to obtain the exact count of documents in a collection:
curl -sS --location -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_NAMESPACE/ASTRA_DB_COLLECTION" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{ "countDocuments": {} }' | jq
You can provide an optional filter condition to count only documents matching the filter:
curl -sS --location -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_NAMESPACE/ASTRA_DB_COLLECTION" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"countDocuments": {
"filter": {
"year": { "$gt": 2000 }
}
}
}' | jq
Parameters:
Name | Type | Summary |
---|---|---|
|
command |
A command to return an exact count of documents in a collection. |
|
object |
An optional filter to select the documents to count. If not provided, all documents are counted. For a list of available operators, see Data API operators. For additional examples, see Find documents using filtering options. |
Returns:
A successful response returns count
.
This is the exact count of the documents counted as requested, unless it exceeds the API-set upper bound, in which case the overflow is reported in the response by the moreData
flag.
Response within upper bound
{
"status": {
"count": 105
}
}
Response exceeding upper bound
{
"status": {
"moreData": true,
"count": 1000
}
}
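The two response shapes above can be handled in client code along these lines. This is a minimal sketch, not the behavior of any official client: the function read_count and the exception CountOverflowError are hypothetical names, and the only fields read are status.count and status.moreData as shown in the sample responses.

```python
from typing import Any, Dict


class CountOverflowError(Exception):
    """Hypothetical error for a count that hit the API upper bound."""


def read_count(response: Dict[str, Any]) -> int:
    """Return the exact count, or raise if the response reports overflow."""
    status = response.get("status", {})
    if status.get("moreData"):
        # The reported count is only the bound that was reached, not the total.
        raise CountOverflowError(
            f"count exceeds the API upper bound (reported count: {status.get('count')})"
        )
    return status["count"]


# Sample responses copied from the shapes shown above.
within_bound = {"status": {"count": 105}}
over_bound = {"status": {"moreData": True, "count": 1000}}

print(read_count(within_bound))
try:
    read_count(over_bound)
except CountOverflowError as err:
    print(f"overflow: {err}")
```

Raising on overflow, instead of returning the capped number, prevents a caller from mistaking the bound for a real total.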
This operation is suited to use cases where the number of documents to count is moderate.
Exact counting of an arbitrary number of documents is a slow, expensive operation that is not supported by the Data API.
If the count total exceeds the server-side threshold, the response includes "moreData": true.
If you need to count large numbers of documents, consider using |
Estimate document count in a collection
Get an approximate document count for an entire collection. Filtering isn’t supported. For the clients, you can set standard options, such as a timeout in milliseconds. There are no other options available.
In the estimatedDocumentCount
command’s response, the document count is based on current system statistics at the time the request is received by the database server.
Due to potential in-progress updates (document additions and deletions), the actual number of documents in the collection can be lower or higher than the estimate.
-
Python
-
TypeScript
-
Java
-
curl
For more information, see the API reference.
Get an approximate document count for a collection:
collection.estimated_document_count()
Returns:
int
- A server-side estimate of the total number of documents in the collection.
Example response
37500
Example:
from astrapy import DataAPIClient
client = DataAPIClient("TOKEN")
database = client.get_database("API_ENDPOINT")
collection = database.my_collection
collection.estimated_document_count()
For more information, see the API reference.
Get an approximate document count for a collection:
const estNumDocs = await collection.estimatedDocumentCount();
Returns:
Promise<number>
- A promise that resolves to a server-side estimate of the total number of documents in the collection.
Example:
import { DataAPIClient } from '@datastax/astra-db-ts';
// Reference an untyped collection
const client = new DataAPIClient('TOKEN');
const db = client.db('ENDPOINT', { namespace: 'NAMESPACE' });
const collection = db.collection('COLLECTION');
(async function () {
console.log(await collection.estimatedDocumentCount());
})();
For more information, see the API reference.
Get an approximate document count for a collection:
long estimatedDocumentCount();
long estimatedDocumentCount(EstimatedCountDocumentsOptions options);
Returns:
long
- A server-side estimate of the total number of documents in the collection. This estimate is built from the SSTable files.
Example:
package com.datastax.astra.client.collection;
import com.datastax.astra.client.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.exception.TooManyDocumentsToCountException;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.client.model.EstimatedCountDocumentsOptions;
import com.datastax.astra.client.model.Filter;
import com.datastax.astra.client.model.Filters;
import com.datastax.astra.internal.command.LoggingCommandObserver;
import static com.datastax.astra.client.model.Filters.lt;
public class EstimateCountDocuments {
public static void main(String[] args) {
Collection<Document> collection = new DataAPIClient("TOKEN")
.getDatabase("API_ENDPOINT")
.getCollection("COLLECTION_NAME");
// Count with no filter
long estimatedCount = collection.estimatedDocumentCount();
// Count with options (adding a logger)
EstimatedCountDocumentsOptions options = new EstimatedCountDocumentsOptions()
.registerObserver("logger", new LoggingCommandObserver(DataAPIClient.class));
long estimateCount2 = collection.estimatedDocumentCount(options);
}
}
Use the estimatedDocumentCount
command to get an approximate document count for a collection:
curl -sS --location -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_NAMESPACE/ASTRA_DB_COLLECTION" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{ "estimatedDocumentCount": {} }' | jq
Returns:
A successful request returns count
, which is an estimate of the total number of documents in the collection:
{ "status": { "count": 37500 } }
Find and replace a document
Find one document that matches a filter condition, replace it with a new document, and then return the document itself. This command is similar to Find and update a document.
Sort and filter operations can use only indexed fields. If you apply selective indexing when you create a collection, you can’t reference non-indexed fields in sort or filter queries.
-
Python
-
TypeScript
-
Java
-
curl
For more information, see the API reference.
Find a document matching a filter condition, and then replace the matching document with the given replacement:
collection.find_one_and_replace(
{"_id": "rule1"}, # filter
{"text": "some animals are more equal!"}, # replacement
)
Locate and replace a document, returning the document itself, and create a new one if no match is found:
collection.find_one_and_replace(
{"_id": "rule1"},
{"text": "some animals are more equal!"},
upsert=True,
)
Locate and replace the document most similar to a query vector from either $vector
or $vectorize
.
In this example, the filter object is empty, and only the sort object is used to locate the document to replace.
Including the empty filter object ensures that the replacement object is read correctly.
collection.find_one_and_replace(
{}, # empty filter
{"name": "Zoo", "desc": "the new best match"}, # replacement
sort={"$vector": [0.1, 0.2, 0.3]}, # sort object, to locate the document to replace
)
Returns:
Dict[str, Any]
- Either the original or the replaced document.
The exact fields returned depend on the projection
parameter.
If you request the original document, and there are no matches, then None
is returned.
Example response
{'_id': 'rule1', 'text': 'all animals are equal'}
Parameters:
Name | Type | Summary | ||
---|---|---|---|---|
filter |
|
A predicate expressed as a dictionary according to the Data API filter syntax.
For example: |
||
replacement |
|
The new document to write into the collection.
Define all fields that the replacement document must include, except for the
|
||
projection |
|
See Find a document and Example values for projection operations. |
||
sort |
|
|||
upsert |
|
This parameter controls the behavior if there are no matches.
If true and there are no matches, then the operation inserts the |
||
return_document |
|
A flag controlling what document is returned.
If set to |
||
max_time_ms |
|
A timeout, in milliseconds, for the underlying HTTP request. This method uses the collection-level timeout by default. |
Example:
from astrapy import DataAPIClient
client = DataAPIClient("TOKEN")
database = client.get_database("API_ENDPOINT")
collection = database.my_collection
import astrapy
collection.insert_one({"_id": "rule1", "text": "all animals are equal"})
collection.find_one_and_replace(
{"_id": "rule1"},
{"text": "some animals are more equal!"},
)
# prints: {'_id': 'rule1', 'text': 'all animals are equal'}
collection.find_one_and_replace(
{"text": "some animals are more equal!"},
{"text": "and the pigs are the rulers"},
return_document=astrapy.constants.ReturnDocument.AFTER,
)
# prints: {'_id': 'rule1', 'text': 'and the pigs are the rulers'}
collection.find_one_and_replace(
{"_id": "rule2"},
{"text": "F=ma^2"},
return_document=astrapy.constants.ReturnDocument.AFTER,
)
# (returns None for no matches)
collection.find_one_and_replace(
{"_id": "rule2"},
{"text": "F=ma"},
upsert=True,
return_document=astrapy.constants.ReturnDocument.AFTER,
projection={"_id": False},
)
# prints: {'text': 'F=ma'}
For more information, see the API reference.
Find a document matching a filter condition, and then replace the matching document with the given replacement:
const docBefore = await collection.findOneAndReplace(
{ _id: 123 }, // filter
{ text: 'some animals are more equal!' }, // replacement
);
Locate and replace a document, returning the document itself, and creating a new one if no match is found:
const docBefore = await collection.findOneAndReplace(
{ _id: 123 },
{ text: 'some animals are more equal!' },
{ upsert: true },
);
Locate and replace the document most similar to a query vector from either $vector
or $vectorize
.
In this example, the filter object is empty, and only the sort object is used to locate the document to replace.
Including the empty filter object ensures that the replacement object is read correctly.
const docBefore = await collection.findOneAndReplace(
{}, // empty filter
{ name: 'Zoe', desc: 'The new best match' }, // replacement
{ sort: { $vector: [0.1, 0.2, 0.3] } }, // sort object, to locate the document to replace
);
Parameters:
Name | Type | Summary | ||
---|---|---|---|---|
filter |
A filter to select the document to replace. For a list of available operators, see Data API operators. For additional examples, see Find documents using filtering options. |
|||
replacement |
The new document to write into the collection.
Define all fields that the replacement document must include, except for the
|
|||
options |
The options for this operation. |
Options (FindOneAndReplaceOptions
):
Name | Type | Summary |
---|---|---|
|
Specifies whether to return the original ( |
|
|
This parameter controls the behavior if there are no matches.
If true and there are no matches, then the operation inserts the |
|
See Find a document and Example values for projection operations. |
||
|
The maximum time in milliseconds that the client should wait for the operation to complete each underlying HTTP request. |
|
|
When true, returns |
Returns:
Promise<WithId<Schema> | null>
- The document before or after the replacement, depending on the value of returnDocument, or null if no matches are found.
Example:
import { DataAPIClient } from '@datastax/astra-db-ts';
// Reference an untyped collection
const client = new DataAPIClient('TOKEN');
const db = client.db('ENDPOINT', { namespace: 'NAMESPACE' });
const collection = db.collection('COLLECTION');
(async function () {
// Insert a document
await collection.insertOne({ _id: 'rule1', text: 'all animals are equal' });
// { _id: 'rule1', text: 'all animals are equal' }
await collection.findOneAndReplace(
{ _id: 'rule1' },
{ text: 'some animals are more equal!' },
{ returnDocument: 'before' }
);
// { _id: 'rule1', text: 'and the pigs are the rulers' }
await collection.findOneAndReplace(
{ text: 'some animals are more equal!' },
{ text: 'and the pigs are the rulers' },
{ returnDocument: 'after' }
);
// null
await collection.findOneAndReplace(
{ _id: 'rule2' },
{ text: 'F=ma^2' },
{ returnDocument: 'after' }
);
// { text: 'F=ma' }
await collection.findOneAndReplace(
{ _id: 'rule2' },
{ text: 'F=ma' },
{ upsert: true, returnDocument: 'after', projection: { _id: false } }
);
})();
Operations on documents are performed at the Collection
level.
Collection is a generic class with the default type of Document
.
You can specify your own type, and the object is serialized by Jackson.
For more information, see the API reference.
Most methods have synchronous and asynchronous flavors, where the asynchronous version is suffixed by Async
and returns a CompletableFuture
:
// Synchronous
Optional<T> findOneAndReplace(Filter filter, T replacement);
Optional<T> findOneAndReplace(Filter filter, T replacement, FindOneAndReplaceOptions options);
// Asynchronous
CompletableFuture<Optional<T>> findOneAndReplaceAsync(Filter filter, T replacement);
CompletableFuture<Optional<T>> findOneAndReplaceAsync(Filter filter, T replacement, FindOneAndReplaceOptions options);
Returns:
Optional<T>
- The document that matches the filter, if any. Depending on whether returnDocument
is set to before or after, the returned document reflects its state before or after the replacement.
Parameters:
Name | Type | Summary | ||
---|---|---|---|---|
filter (optional) |
|
Filter criteria to find the document to replace.
The filter is a JSON object that can contain any valid Data API filter expression.
For a list of available operators, see Data API operators.
For examples and options, including |
||
replacement |
|
The new document to write into the collection.
Define all fields that the replacement document must include, except for the
|
||
options (optional) |
Set the different options for the find and replace operation, including the following:
|
Example:
package com.datastax.astra.client.collection;
import com.datastax.astra.client.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.client.model.Filter;
import com.datastax.astra.client.model.Filters;
import com.datastax.astra.client.model.FindOneAndReplaceOptions;
import com.datastax.astra.client.model.Projections;
import com.datastax.astra.client.model.Sorts;
import java.util.Optional;
import static com.datastax.astra.client.model.Filters.lt;
public class FindOneAndReplace {
public static void main(String[] args) {
Collection<Document> collection = new DataAPIClient("TOKEN")
.getDatabase("API_ENDPOINT")
.getCollection("COLLECTION_NAME");
// Building a filter
Filter filter = Filters.and(
Filters.gt("field2", 10),
lt("field3", 20),
Filters.eq("field4", "value"));
FindOneAndReplaceOptions options = new FindOneAndReplaceOptions()
.projection(Projections.include("field1"))
.sort(Sorts.ascending("field1"))
.upsert(true)
.returnDocumentAfter();
Document docForReplacement = new Document()
.append("field1", "value1")
.append("field2", 20)
.append("field3", 30)
.append("field4", "value4");
// Because of returnDocumentAfter(), this returns the document as it is after the replacement
Optional<Document> docAfterReplace = collection
.findOneAndReplace(filter, docForReplacement, options);
}
}
Example with sort
and projection
:
FindOneAndReplaceOptions options = FindOneAndReplaceOptions.Builder
.projection(Projections.include("field1"))
.sort(Sorts.ascending("field1"))
.upsert(true)
.returnDocumentAfter();
Find a document matching a filter condition, and then replace that document with a new one:
curl -sS --location -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_NAMESPACE/ASTRA_DB_COLLECTION" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"findOneAndReplace": {
"filter": { "_id": "14" },
"replacement": { "customer": { "name": "Ann Jones" }, "account": { "status": "inactive" } }
}
}' | jq
Locate and replace a document or insert a new one if no match is found:
curl -sS --location -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_NAMESPACE/ASTRA_DB_COLLECTION" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"findOneAndReplace": {
"filter": { "_id": "16" },
"replacement": { "customer": { "name": "Ann Jones" }, "account": { "status": "inactive" } },
"options": { "upsert": true }
}
}' | jq
Locate and replace the document most similar to a query vector from either $vector
or $vectorize
:
curl -sS --location -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_NAMESPACE/ASTRA_DB_COLLECTION" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"findOneAndReplace": {
"sort": { "$vector": [0.1, 0.2, 0.3] },
"replacement": { "customer": { "name": "Ann Jones" }, "account": { "status": "inactive" } },
"projection": { "$vector": 1 },
"options": { "returnDocument": "after" }
}
}' | jq
Parameters:
Name | Type | Summary | ||
---|---|---|---|---|
|
command |
The Data API command to find and replace one document in a collection based on |
||
|
object |
Search criteria to find the document to replace. For a list of available operators, see Data API operators. For examples and parameters, see Find a document and Example values for sort operations. |
||
|
object |
The new document to write into the collection.
Define all fields that the replacement document must include, except for the
|
||
|
object |
Select a subset of fields to include in the response for the returned document. If empty or unset, the default projection is used. The default projection doesn’t always include all document fields. For more information and examples, see Example values for projection operations. |
||
|
boolean |
This parameter controls the behavior if there are no matches.
If true and there are no matches, then the operation inserts the |
||
|
string |
A flag controlling what document is returned.
If set to |
Returns:
A successful response returns an object representing the original or replacement document, based on the returnDocument
and projection
options.
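When the request body is assembled programmatically, serializing it with a JSON library avoids hand-written quoting mistakes. The following is a minimal sketch that builds the findOneAndReplace body used in the curl examples above. The function name build_find_one_and_replace and its keyword arguments are hypothetical; the sketch only assembles the JSON string and does not send a request.

```python
import json
from typing import Any, Dict, Optional


def build_find_one_and_replace(
    replacement: Dict[str, Any],
    filter_: Optional[Dict[str, Any]] = None,
    sort: Optional[Dict[str, Any]] = None,
    projection: Optional[Dict[str, Any]] = None,
    upsert: Optional[bool] = None,
    return_document: Optional[str] = None,
) -> str:
    """Serialize a findOneAndReplace command to a JSON string."""
    if return_document not in (None, "before", "after"):
        raise ValueError("return_document must be 'before' or 'after'")

    command: Dict[str, Any] = {"replacement": replacement}
    # Optional top-level clauses are included only when provided.
    for key, value in (("filter", filter_), ("sort", sort), ("projection", projection)):
        if value is not None:
            command[key] = value

    # upsert and returnDocument are nested under "options".
    options: Dict[str, Any] = {}
    if upsert is not None:
        options["upsert"] = upsert
    if return_document is not None:
        options["returnDocument"] = return_document
    if options:
        command["options"] = options

    return json.dumps({"findOneAndReplace": command})


# Mirrors the vector-sort curl example above.
body = build_find_one_and_replace(
    replacement={"customer": {"name": "Ann Jones"}, "account": {"status": "inactive"}},
    sort={"$vector": [0.1, 0.2, 0.3]},
    projection={"$vector": 1},
    return_document="after",
)
print(body)
```

The resulting string can be passed as the --data value of the curl command or as the body of any HTTP client request.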
Replace a document
Find one document that matches a filter condition, and then replace it with a new document.
replaceOne
is similar to findOneAndReplace
, except that the response includes only the result of the operation.
The response doesn’t include a document
object, and the request doesn’t support response-related parameters, such as projection
or returnDocument
.
Sort and filter operations can use only indexed fields. If you apply selective indexing when you create a collection, you can’t reference non-indexed fields in sort or filter queries.
-
Python
-
TypeScript
-
Java
-
curl
For more information, see the API reference.
Find a document matching a filter condition, and then replace the matching document with the given replacement:
replace_result = collection.replace_one(
{"Marco": {"$exists": True}}, # filter
{"Buda": "Pest"}, # replacement
)
Locate and replace a document or create a new one if no match is found:
replace_result = collection.replace_one(
{"Marco": {"$exists": True}},
{"Buda": "Pest"},
upsert=True,
)
Locate and replace the document most similar to a query vector from either $vector
or $vectorize
.
In this example, the filter object is empty, and only the sort object is used to locate the document to replace.
Including the empty filter object ensures that the replacement object is read correctly.
collection.replace_one(
{}, # empty filter
{"name": "Zoo", "desc": "the new best match"}, # replacement
sort={"$vector": [0.1, 0.2, 0.3]}, # sort object, to locate the document to replace
)
Returns:
UpdateResult
- An object representing the response from the database after the replace operation. It includes information about the operation.
Example response
UpdateResult(update_info={'n': 1, 'updatedExisting': True, 'ok': 1.0, 'nModified': 1}, raw_results=...)
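The update_info dictionary in the response above can be turned into a readable outcome. The following is a minimal sketch that works on a plain dictionary with the keys shown in this section's examples (n, nModified, and upserted). The helper name describe_replace is hypothetical, and the final "matched but unchanged" branch is an assumed case that doesn't appear in the examples.

```python
from typing import Any, Dict


def describe_replace(update_info: Dict[str, Any]) -> str:
    """Summarize the outcome of a replace operation from its update_info."""
    if "upserted" in update_info:
        # No match was found and upsert=True created a new document.
        return f"upserted new document {update_info['upserted']}"
    if update_info.get("n", 0) == 0:
        return "no document matched"
    if update_info.get("nModified", 0) > 0:
        return "replaced existing document"
    # Assumed case: a document matched but was reported as unmodified.
    return "matched but unchanged"


# Dictionaries copied from the example outputs in this section.
replaced = {"n": 1, "updatedExisting": True, "ok": 1.0, "nModified": 1}
no_match = {"n": 0, "updatedExisting": False, "ok": 1.0, "nModified": 0}
upserted = {
    "n": 1,
    "updatedExisting": False,
    "ok": 1.0,
    "nModified": 0,
    "upserted": "931b47d6-...",
}

for info in (replaced, no_match, upserted):
    print(describe_replace(info))
```

The upserted check comes first because an upsert also reports n as 1, which would otherwise be read as a match on an existing document.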
Parameters:
Name | Type | Summary | ||
---|---|---|---|---|
filter |
|
A predicate expressed as a dictionary according to the Data API filter syntax.
For example: |
||
replacement |
|
The new document to write into the collection.
Define all fields that the replacement document must include, except for the
|
||
sort |
|
|||
upsert |
|
This parameter controls the behavior if there are no matches.
If true and there are no matches, then the operation inserts the |
||
max_time_ms |
|
A timeout, in milliseconds, for the underlying HTTP request. This method uses the collection-level timeout by default. |
Example:
from astrapy import DataAPIClient
client = DataAPIClient("TOKEN")
database = client.get_database("API_ENDPOINT")
collection = database.my_collection
collection.insert_one({"Marco": "Polo"})
collection.replace_one({"Marco": {"$exists": True}}, {"Buda": "Pest"})
# prints: UpdateResult(update_info={'n': 1, 'updatedExisting': True, 'ok': 1.0, 'nModified': 1}, raw_results=...)
collection.find_one({"Buda": "Pest"})
# prints: {'_id': '8424905a-...', 'Buda': 'Pest'}
collection.replace_one({"Mirco": {"$exists": True}}, {"Oh": "yeah?"})
# prints: UpdateResult(update_info={'n': 0, 'updatedExisting': False, 'ok': 1.0, 'nModified': 0}, raw_results=...)
collection.replace_one({"Mirco": {"$exists": True}}, {"Oh": "yeah?"}, upsert=True)
# prints: UpdateResult(update_info={'n': 1, 'updatedExisting': False, 'ok': 1.0, 'nModified': 0, 'upserted': '931b47d6-...'}, raw_results=...)
For more information, see the API reference.
Find a document matching a filter condition, and then replace the matching document with the given replacement:
const result = await collection.replaceOne(
{ 'Marco': 'Polo' }, // filter
{ 'Buda': 'Pest' }, // replacement
);
Locate and replace a document or create a new one if no match is found:
const result = await collection.replaceOne(
{ 'Marco': 'Polo' },
{ 'Buda': 'Pest' },
{ upsert: true },
);
Locate and replace the document most similar to a query vector from either $vector
or $vectorize
.
In this example, the filter object is empty, and only the sort object is used to locate the document to replace.
Including the empty filter object ensures that the replacement object is read correctly.
const result = await collection.replaceOne(
{}, // empty filter
{ name: "Zoe", desc: "The new best match" }, // replacement
{ sort: { $vector: [0.1, 0.2, 0.3] } }, // sort object, to locate the document to replace
);
Parameters:
Name | Type | Summary | ||
---|---|---|---|---|
filter |
A filter to select the document to replace. For a list of available operators, see Data API operators. For additional examples, see Find documents using filtering options. |
|||
replacement |
The new document to write into the collection.
Define all fields that the replacement document must include, except for the
|
|||
options? |
The options for this operation. |
Options (ReplaceOneOptions
):
Name | Type | Summary |
---|---|---|
|
This parameter controls the behavior if there are no matches.
If true and there are no matches, then the operation inserts the |
|
|
The maximum time in milliseconds that the client should wait for the operation to complete each underlying HTTP request. |
Returns:
Promise<ReplaceOneResult<Schema>>
- The result of the
replacement operation.
Example:
import { DataAPIClient } from '@datastax/astra-db-ts';
// Reference an untyped collection
const client = new DataAPIClient('TOKEN');
const db = client.db('ENDPOINT', { namespace: 'NAMESPACE' });
const collection = db.collection('COLLECTION');
(async function () {
// Insert a document
await collection.insertOne({ 'Marco': 'Polo' });
// { modifiedCount: 1, matchedCount: 1, upsertedCount: 0 }
await collection.replaceOne(
{ 'Marco': { '$exists': true } },
{ 'Buda': 'Pest' }
);
// { _id: '3756ce75-aaf1-430d-96ce-75aaf1730dd3', Buda: 'Pest' }
await collection.findOne({ 'Buda': 'Pest' });
// { modifiedCount: 0, matchedCount: 0, upsertedCount: 0 }
await collection.replaceOne(
{ 'Mirco': { '$exists': true } },
{ 'Oh': 'yeah?' }
);
// { modifiedCount: 0, matchedCount: 0, upsertedId: '...', upsertedCount: 1 }
await collection.replaceOne(
{ 'Mirco': { '$exists': true } },
{ 'Oh': 'yeah?' },
{ upsert: true }
);
})();
Operations on documents are performed at the Collection
level.
Collection is a generic class with the default type of Document
.
You can specify your own type, and the object is serialized by Jackson.
For more information, see the API reference.
Most methods have synchronous and asynchronous flavors, where the asynchronous version is suffixed by Async
and returns a CompletableFuture
:
// Synchronous
UpdateResult replaceOne(Filter filter, T replacement);
UpdateResult replaceOne(Filter filter, T replacement, ReplaceOneOptions options);
// Asynchronous
CompletableFuture<UpdateResult> replaceOneAsync(Filter filter, T replacement);
CompletableFuture<UpdateResult> replaceOneAsync(Filter filter, T replacement, ReplaceOneOptions options);
Returns:
UpdateResult
- A wrapper object with the result of the operation. The object contains the number of documents matched (matchedCount) and updated (modifiedCount).
Parameters:
Name | Type | Summary | ||
---|---|---|---|---|
filter (optional) |
|
Filter criteria to find the document to replace.
The filter is a JSON object that can contain any valid Data API filter expression.
For a list of available operators, see Data API operators.
For examples and options, including |
||
replacement |
|
The new document to write into the collection.
Define all fields that the replacement document must include, except for the
|
||
options (optional) |
Set the different options for the
|
Example:
package com.datastax.astra.client.collection;
import com.datastax.astra.client.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.client.model.Filter;
import com.datastax.astra.client.model.Filters;
// Assumes UpdateResult is in the same model package as the classes above
import com.datastax.astra.client.model.UpdateResult;
import static com.datastax.astra.client.model.Filters.lt;
public class ReplaceOne {
public static void main(String[] args) {
Collection<Document> collection = new DataAPIClient("TOKEN")
.getDatabase("API_ENDPOINT")
.getCollection("COLLECTION_NAME");
// Building a filter
Filter filter = Filters.and(
Filters.gt("field2", 10),
lt("field3", 20),
Filters.eq("field4", "value"));
Document docForReplacement = new Document()
.append("field1", "value1")
.append("field2", 20)
.append("field3", 30)
.append("field4", "value4");
// Replace the first document that matches the filter.
// The result reports matchedCount and modifiedCount, not the document itself.
UpdateResult result = collection
.replaceOne(filter, docForReplacement);
}
}
This operation has no literal equivalent in HTTP.
Instead, you can use Find and replace a document with "projection": {"*": false}
, which excludes all document
fields from the response.
Find and delete a document
Find one document that matches a filter condition, delete it, and then return the deleted document.
Sort and filter operations can use only indexed fields. If you apply selective indexing when you create a collection, you can’t reference non-indexed fields in sort or filter queries.
-
Python
-
TypeScript
-
Java
-
curl
For more information, see the API reference.
Find a document matching a filter condition, and then delete it:
deleted_document = collection.find_one_and_delete({"status": "stale_entry"})
Locate and delete the document most similar to a query vector from either $vector
or $vectorize
:
deleted_document = collection.find_one_and_delete(
{},
sort={"$vector": [0.1, 0.2, 0.3]},
)
Returns:
Dict[str, Any]
- The deleted document or, if no matches are found, None
.
The exact fields returned depend on the projection
parameter.
Example response
{'_id': 199, 'status': 'stale_entry', 'request_id': 'A4431'}
Parameters:
Name | Type | Summary |
---|---|---|
filter |
|
A predicate expressed as a dictionary according to the Data API filter syntax.
For example: |
projection |
|
See Find a document and Example values for projection operations. |
sort |
|
|
max_time_ms |
|
A timeout, in milliseconds, for the underlying HTTP request. This method uses the collection-level timeout by default. |
Example:
from astrapy import DataAPIClient
client = DataAPIClient("TOKEN")
database = client.get_database("API_ENDPOINT")
collection = database.my_collection
collection.insert_many(
[
{"species": "swan", "class": "Aves"},
{"species": "frog", "class": "Amphibia"},
],
)
collection.find_one_and_delete(
{"species": {"$ne": "frog"}},
projection={"species": True},
)
# prints: {'_id': '5997fb48-...', 'species': 'swan'}
collection.find_one_and_delete({"species": {"$ne": "frog"}})
# (returns None for no matches)
For more information, see the API reference.
Find a document matching a filter condition, and then delete it:
const deletedDoc = await collection.findOneAndDelete({ status: 'stale_entry' });
Locate and delete the document most similar to a query vector from either $vector
or $vectorize
:
const deletedDoc = await collection.findOneAndDelete(
{},
{ sort: { $vector: [0.1, 0.2, 0.3] } },
);
Parameters:
Name | Type | Summary |
---|---|---|
filter |
A filter to select the document to delete. For a list of available operators, see Data API operators. For additional examples, see Find documents using filtering options. |
|
options? |
The options for this operation. |
Options (FindOneAndDeleteOptions
):
Name | Type | Summary |
---|---|---|
See Find a document and Example values for projection operations. |
||
|
The maximum time in milliseconds that the client should wait for the operation to complete each underlying HTTP request. |
|
|
When true, returns |
Returns:
Promise<WithId<Schema> | null>
- The deleted document, or, if no matches are found, null
.
The exact fields returned depend on the projection
parameter.
Example:
import { DataAPIClient } from '@datastax/astra-db-ts';
// Reference an untyped collection
const client = new DataAPIClient('TOKEN');
const db = client.db('ENDPOINT', { namespace: 'NAMESPACE' });
const collection = db.collection('COLLECTION');
(async function () {
// Insert some documents
await collection.insertMany([
{ species: 'swan', class: 'Aves' },
{ species: 'frog', class: 'Amphibia' },
]);
// { _id: '...', species: 'swan' }
await collection.findOneAndDelete(
{ species: { $ne: 'frog' } },
{ projection: { species: 1 } },
);
// null
await collection.findOneAndDelete(
{ species: { $ne: 'frog' } },
);
})();
Operations on documents are performed at the Collection
level.
Collection is a generic class with the default type of Document
.
You can specify your own type, and the object is serialized by Jackson.
For more information, see the API reference.
Most methods have synchronous and asynchronous flavors, where the asynchronous version is suffixed by Async
and returns a CompletableFuture
:
// Synchronous
Optional<T> findOneAndDelete(Filter filter);
Optional<T> findOneAndDelete(Filter filter, FindOneAndDeleteOptions options);
// Asynchronous
CompletableFuture<Optional<T>> findOneAndDeleteAsync(Filter filter);
CompletableFuture<Optional<T>> findOneAndDeleteAsync(Filter filter, FindOneAndDeleteOptions options);
Returns:
Optional<T>
- The deleted document, or an empty Optional if no document matches the filter.
Parameters:
Name | Type | Summary |
---|---|---|
filter (optional) |
|
Filter criteria to find the document to delete.
The filter is a JSON object that can contain any valid Data API filter expression.
For a list of available operators, see Data API operators.
For examples and options, including |
options (optional) |
Set the different options for the find and delete operation, including the following:
|
Example:
package com.datastax.astra.client.collection;
import com.datastax.astra.client.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.client.model.Filter;
import com.datastax.astra.client.model.Filters;
import java.util.Optional;
import static com.datastax.astra.client.model.Filters.lt;
public class FindOneAndDelete {
public static void main(String[] args) {
Collection<Document> collection = new DataAPIClient("TOKEN")
.getDatabase("API_ENDPOINT")
.getCollection("COLLECTION_NAME");
// Building a filter
Filter filter = Filters.and(
Filters.gt("field2", 10),
lt("field3", 20),
Filters.eq("field4", "value"));
// Returns the deleted document as it was before deletion, if a match was found
Optional<Document> deletedDoc = collection.findOneAndDelete(filter);
}
}
Find a document matching a filter condition, and then delete it:
curl -sS --location -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_NAMESPACE/ASTRA_DB_COLLECTION" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"findOneAndDelete": {
"filter": {
"customer.name": "Fred Smith",
"_id": "13"
}
}
}' | jq
Locate and delete the document most similar to a query vector from either $vector
or $vectorize
:
"findOneAndDelete": {
"sort": { "$vector": [0.1, 0.2, 0.3] },
"projection": { "$vector": 1 }
}
Parameters:
Name | Type | Summary |
---|---|---|
|
command |
The Data API command to find and delete the first document in a collection that matches the given |
|
object |
Search criteria to find the document to delete. For a list of available operators, see Data API operators. For examples and parameters, see Find a document and Example values for sort operations. |
|
object |
Select a subset of fields to include in the response for the returned document. If empty or unset, the default projection is used. The default projection doesn’t always include all document fields. For more information and examples, see Example values for projection operations. |
Response:
A successful response includes data and status objects.
-
The data object can contain the deleted document, based on the projection parameter, if a matching document was found and deleted.
-
The status object contains the number of deleted documents. For findOneAndDelete, this is either 1 (one document deleted) or 0 (no matches).
{
"status": {
"deletedCount": 1
}
}
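A minimal sketch of reading this response in Python, assuming the deleted document arrives under data.document. That key name is an assumption, because the text above only says that the data object can contain the document. The helper name parse_find_one_and_delete and the sample responses are hypothetical.

```python
from typing import Any, Dict, Optional, Tuple


def parse_find_one_and_delete(
    response: Dict[str, Any],
) -> Tuple[Optional[Dict[str, Any]], int]:
    """Return (deleted_document, deleted_count) from a findOneAndDelete response."""
    data = response.get("data") or {}
    status = response.get("status") or {}
    document = data.get("document")              # assumed key name
    deleted_count = status.get("deletedCount", 0)
    return document, deleted_count


# Hypothetical responses, for illustration only.
matched = {
    "data": {"document": {"_id": "13", "customer": {"name": "Fred Smith"}}},
    "status": {"deletedCount": 1},
}
no_match = {"data": {"document": None}, "status": {"deletedCount": 0}}

for sample in (matched, no_match):
    document, count = parse_find_one_and_delete(sample)
    print(count, document)
```

Checking deletedCount rather than the document alone is the safer test for a match, because a restrictive projection can leave the returned document nearly empty.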
Delete a document
Locate and delete one document in a collection.
deleteOne
is similar to findOneAndDelete
, except that the response includes only the result of the operation.
The response doesn’t include a document
object, and the request doesn’t support response-related parameters, such as projection
or returnDocument
.
Sort and filter operations can use only indexed fields. If you apply selective indexing when you create a collection, you can’t reference non-indexed fields in sort or filter queries.
-
Python
-
TypeScript
-
Java
-
curl
For more information, see the API reference.
Find a document matching a filter condition, and then delete it:
# Find by ID
result = collection.delete_one({"_id": "1"})
# Find by a document property
result = collection.delete_one({"location": "warehouse_C"})
# Find with a filter operator
result = collection.delete_one({"tag": {"$exists": True}})
Locate and delete the document most similar to a query vector from either $vector
or $vectorize
:
# Find by vector search with $vector
result = collection.delete_one({}, sort={"$vector": [.12, .52, .32]})
# Find by vector search with $vectorize
result = collection.delete_one({}, sort={"$vectorize": "Text to vectorize"})
Returns:
DeleteResult
- An object representing the response from the database after the delete operation. It includes information about the success of the operation.
Example response
DeleteResult(deleted_count=1, raw_results=...)
Parameters:
Name | Type | Summary |
---|---|---|
filter |
|
A predicate expressed as a dictionary according to the Data API filter syntax.
For example: |
sort |
|
|
max_time_ms |
|
A timeout, in milliseconds, for the underlying HTTP request. This method uses the collection-level timeout by default. |
Example:
from astrapy import DataAPIClient
import astrapy
client = DataAPIClient("TOKEN")
database = client.get_database("API_ENDPOINT")
collection = database.my_collection
collection.insert_many([{"seq": 1}, {"seq": 0}, {"seq": 2}])
collection.delete_one({"seq": 1})
# prints: DeleteResult(deleted_count=1, raw_results=...)
collection.distinct("seq")
# prints: [0, 2]
collection.delete_one(
{"seq": {"$exists": True}},
sort={"seq": astrapy.constants.SortDocuments.DESCENDING},
)
# prints: DeleteResult(deleted_count=1, raw_results=...)
collection.distinct("seq")
# prints: [0]
collection.delete_one({"seq": 2})
# prints: DeleteResult(deleted_count=0, raw_results=...)
For more information, see the API reference.
Find a document matching a filter condition, and then delete it:
// Find by ID
const resultById = await collection.deleteOne({ _id: '1' });
// Find by a document property
const resultByProperty = await collection.deleteOne({ location: 'warehouse_C' });
// Find with a filter operator
const resultByOperator = await collection.deleteOne({ tag: { $exists: true } });
Locate and delete the document most similar to a query vector from either $vector
or $vectorize
:
// Find by vector search with $vector
const resultByVector = await collection.deleteOne({}, { sort: { $vector: [.12, .52, .32] } });
// Find by vector search with $vectorize
const resultByVectorize = await collection.deleteOne({}, { sort: { $vectorize: 'Text to vectorize' } });
Parameters:
Name | Type | Summary |
---|---|---|
filter |
A filter to select the document to delete. For a list of available operators, see Data API operators. For additional examples, see Find documents using filtering options. |
|
options? |
The options for this operation. |
Options (DeleteOneOptions
):
Name | Type | Summary |
---|---|---|
|
The maximum time in milliseconds that the client should wait for the operation to complete each underlying HTTP request. |
Returns:
Promise<DeleteOneResult>
- The result of the deletion operation.
Example:
import { DataAPIClient } from '@datastax/astra-db-ts';
// Reference an untyped collection
const client = new DataAPIClient('TOKEN');
const db = client.db('ENDPOINT', { namespace: 'NAMESPACE' });
const collection = db.collection('COLLECTION');
(async function () {
// Insert some documents
await collection.insertMany([{ seq: 1 }, { seq: 0 }, { seq: 2 }]);
// { deletedCount: 1 }
await collection.deleteOne({ seq: 1 });
// [0, 2]
await collection.distinct('seq');
// { deletedCount: 1 }
await collection.deleteOne({ seq: { $exists: true } }, { sort: { seq: -1 } });
// [0]
await collection.distinct('seq');
// { deletedCount: 0 }
await collection.deleteOne({ seq: 2 });
})();
Operations on documents are performed at the Collection
level.
Collection is a generic class with the default type of Document
.
You can specify your own type, and the object is serialized by Jackson.
For more information, see the API reference.
Most methods have synchronous and asynchronous flavors, where the asynchronous version is suffixed by Async
and returns a CompletableFuture
:
// Synchronous
DeleteResult deleteOne(Filter filter);
DeleteResult deleteOne(Filter filter, DeleteOneOptions options);
// Asynchronous
CompletableFuture<DeleteResult> deleteOneAsync(Filter filter);
CompletableFuture<DeleteResult> deleteOneAsync(Filter filter, DeleteOneOptions options);
Returns:
DeleteResult
- Wrapper that contains the deleted count.
Parameters:
Name | Type | Summary |
---|---|---|
filter (optional) |
|
Filter criteria to find the document to delete.
The filter is a JSON object that can contain any valid Data API filter expression.
For a list of available operators, see Data API operators.
For examples and options, including |
options (optional) |
Set the different options for the |
Example:
package com.datastax.astra.client.collection;
import com.datastax.astra.client.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.model.DeleteOneOptions;
import com.datastax.astra.client.model.DeleteResult;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.client.model.Filter;
import com.datastax.astra.client.model.Filters;
import com.datastax.astra.client.model.Sorts;
import static com.datastax.astra.client.model.Filters.lt;
public class DeleteOne {
public static void main(String[] args) {
Collection<Document> collection = new DataAPIClient("TOKEN")
.getDatabase("API_ENDPOINT")
.getCollection("COLLECTION_NAME");
// Sample Filter
Filter filter = Filters.and(
Filters.gt("field2", 10),
lt("field3", 20),
Filters.eq("field4", "value"));
// Delete one options
DeleteOneOptions options = new DeleteOneOptions()
.sort(Sorts.ascending("field2"));
DeleteResult result = collection.deleteOne(filter, options);
System.out.println("Deleted Count:" + result.getDeletedCount());
}
}
Find a document matching a filter condition, and then delete it:
curl -sS --location -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_NAMESPACE/ASTRA_DB_COLLECTION" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"deleteOne": {
"filter": {
"tags": "first"
}
}
}' | jq
Locate and delete the document most similar to a query vector from either $vector
or $vectorize
:
"deleteOne": {
"sort": { "$vector": [0.1, 0.2, 0.3] }
}
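The fragment above shows only the command body. As a sketch of how the complete request fits together, the following Python code builds (but does not send) an equivalent POST request with the standard library, using the same URL pattern and headers as the curl example. The helper name build_delete_one_request and the endpoint, namespace, and collection values are placeholders chosen for illustration.

```python
import json
import urllib.request
from typing import Optional


def build_delete_one_request(
    endpoint: str,
    namespace: str,
    collection: str,
    token: str,
    filter_doc: Optional[dict] = None,
    sort: Optional[dict] = None,
) -> urllib.request.Request:
    """Assemble a deleteOne request; only non-empty clauses are included."""
    command = {}
    if filter_doc:
        command["filter"] = filter_doc
    if sort:
        command["sort"] = sort
    body = json.dumps({"deleteOne": command}).encode("utf-8")
    url = f"{endpoint}/api/json/v1/{namespace}/{collection}"
    headers = {"Token": token, "Content-Type": "application/json"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Placeholder values; replace them with your own endpoint and token.
request = build_delete_one_request(
    "https://db.example.invalid",
    "default_keyspace",
    "my_collection",
    "TOKEN",
    sort={"$vector": [0.1, 0.2, 0.3]},
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

To send the request, you could pass it to urllib.request.urlopen and parse the JSON response.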
Parameters:
Name | Type | Summary |
---|---|---|
|
command |
The Data API command to find and delete the first document in a collection that matches the given |
|
object |
Search criteria to find the document to delete. For a list of available operators, see Data API operators. For examples and parameters, see Find a document and Example values for sort operations. |
Response:
A successful response returns the number of deleted documents.
For deleteOne
, this is either 1
(one document deleted) or 0
(no matches).
{
"status": {
"deletedCount": 1
}
}
Delete documents
Delete all documents in a collection that match a given filter condition. If you supply an empty filter, then the operation deletes every document in the collection.
This operation doesn’t support sort
conditions.
Sort and filter operations can use only indexed fields. If you apply selective indexing when you create a collection, you can’t reference non-indexed fields in sort or filter queries.
-
Python
-
TypeScript
-
Java
-
curl
For more information, see the API reference.
Find documents in a collection that match a given filter, and then delete them:
delete_result = collection.delete_many({"status": "processed"})
An empty filter deletes all documents and completely empties the collection:
|
Parameters:
Name | Type | Summary | ||
---|---|---|---|---|
filter |
|
A predicate expressed as a dictionary according to the Data API filter syntax.
For example:
|
||
max_time_ms |
|
The timeout, in milliseconds, for the entire delete operation. This method uses the collection-level timeout by default. |
Returns:
DeleteResult
- An object representing the response from the database after the delete operation. It includes information about the success of the operation.
A response of deleted_count=-1
indicates that every document in the collection was deleted.
Example response
DeleteResult(deleted_count=2, raw_results=...)
The time required for the delete operation depends on the number of documents that match the filter. To delete a large number of documents, this operation issues multiple sequential HTTP requests until all matching documents are deleted. You might need to increase the timeout parameter to allow enough time for all underlying HTTP requests. |
Example:
from astrapy import DataAPIClient
client = DataAPIClient("TOKEN")
database = client.get_database("API_ENDPOINT")
collection = database.my_collection
collection.insert_many([{"seq": 1}, {"seq": 0}, {"seq": 2}])
collection.delete_many({"seq": {"$lte": 1}})
# prints: DeleteResult(raw_results=..., deleted_count=2)
collection.distinct("seq")
# prints: [2]
collection.delete_many({"seq": {"$lte": 1}})
# prints: DeleteResult(raw_results=..., deleted_count=0)
# An empty filter deletes all documents and completely empties the collection:
collection.delete_many({})
# prints: DeleteResult(raw_results=..., deleted_count=-1)
For more information, see the API reference.
Find documents in a collection that match a given filter, and then delete them:
const result = await collection.deleteMany({ status: 'processed' });
An empty filter deletes all documents and completely empties the collection:
|
Parameters:
Name | Type | Summary | ||
---|---|---|---|---|
filter |
A filter to select the documents to delete. For a list of available operators, see Data API operators. For additional examples, see Find documents using filtering options.
|
|||
options? |
The timeout, in milliseconds, for the entire delete operation. |
Returns:
Promise<DeleteManyResult>
- The result of the
deletion operation.
A deleted count of -1
indicates that every document in the collection was deleted.
The time required for the delete operation depends on the number of documents that match the filter. To delete a large number of documents, this operation issues multiple sequential HTTP requests until all matching documents are deleted. You might need to increase the timeout parameter to allow enough time for all underlying HTTP requests. |
Example:
import { DataAPIClient } from '@datastax/astra-db-ts';
// Reference an untyped collection
const client = new DataAPIClient('TOKEN');
const db = client.db('ENDPOINT', { namespace: 'NAMESPACE' });
const collection = db.collection('COLLECTION');
(async function () {
// Insert some documents
await collection.insertMany([{ seq: 1 }, { seq: 0 }, { seq: 2 }]);
// { deletedCount: 2 }
await collection.deleteMany({ seq: { $lte: 1 } });
// [2]
await collection.distinct('seq');
// { deletedCount: 0 }
await collection.deleteMany({ seq: { $lte: 1 } });
// { deletedCount: -1 }
await collection.deleteMany({});
})();
Operations on documents are performed at the Collection
level.
Collection is a generic class with the default type of Document
.
You can specify your own type, and the object is serialized by Jackson.
For more information, see the API reference.
Most methods have synchronous and asynchronous flavors, where the asynchronous version is suffixed by Async
and returns a CompletableFuture
:
// Synchronous
DeleteResult deleteMany(Filter filter);
// Asynchronous
CompletableFuture<DeleteResult> deleteManyAsync(Filter filter);
An empty filter deletes all documents and completely empties the collection:
|
Parameters:
Name | Type | Summary | ||
---|---|---|---|---|
filter (optional) |
|
Filter criteria to find the documents to delete. The filter is a JSON object that can contain any valid Data API filter expression. For a list of available operators, see Data API operators. For additional examples, see Find documents using filtering options.
|
Returns:
DeleteResult
- Wrapper that contains the deleted count.
The time required for the delete operation depends on the number of documents that match the filter. To delete a large number of documents, this operation iterates over batches of documents until all matching documents are deleted.
Example:
package com.datastax.astra.client.collection;
import com.datastax.astra.client.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.model.DeleteResult;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.client.model.Filter;
import com.datastax.astra.client.model.Filters;
import static com.datastax.astra.client.model.Filters.lt;
public class DeleteMany {
public static void main(String[] args) {
Collection<Document> collection = new DataAPIClient("TOKEN")
.getDatabase("API_ENDPOINT")
.getCollection("COLLECTION_NAME");
// Sample Filter
Filter filter = Filters.and(
Filters.gt("field2", 10),
lt("field3", 20),
Filters.eq("field4", "value"));
DeleteResult result = collection.deleteMany(filter);
System.out.println("Deleted Count:" + result.getDeletedCount());
}
}
Find documents in a collection that match a filter condition, and then delete them:
curl -sS --location -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_NAMESPACE/ASTRA_DB_COLLECTION" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"deleteMany": {
"filter": {
"status": "inactive"
}
}
}' | jq
An empty
|
Parameters:
Name | Type | Summary | ||
---|---|---|---|---|
|
command |
The Data API command to delete all matching documents from a collection based on the provided filter criteria. |
||
|
object |
A filter to select the documents to delete. For a list of available operators, see Data API operators. For additional examples, see Find documents using filtering options.
|
Response:
A successful response returns the result of the delete operation.
This operation deletes up to 20 documents at a time.
If the deletedCount
is 20
, there might be more matching documents to delete.
{
"status": {
"deletedCount": 20
}
}
To delete another batch of documents, reissue the same request.
Continue issuing the deleteMany
request until the deletedCount
is less than 20.
Example of batch deletion
For this example, assume that you send the following deleteMany
command and the server finds 30 matching documents:
{
"deleteMany": {
"filter": { "a": true }
}
}
The server deletes the first 20 documents and then returns the following response:
{
"status": {
"moreData": true,
"deletedCount": 20
}
}
The server doesn’t tell you explicitly how many matches were found.
However, the deletedCount
of 20
indicates there could be more matching documents to delete.
To delete the next batch of documents, reissue the same deleteMany
command:
{
"deleteMany": {
"filter": { "a": true }
}
}
This time, the server returns the following:
{
"status": {
"deletedCount": 10
}
}
Because the deletedCount
is less than 20
, this indicates that all matching documents were deleted.
To confirm, you can reissue the deleteMany
request and get a deletedCount
of 0
:
{
"status": {
"deletedCount": 0
}
}
A deleted count of -1
indicates that every document in the collection was deleted.
This occurs if you pass an empty filter
or deleteMany
object.
In this case, you don’t need to delete documents in batches.
With an empty filter, the server automatically iterates over batches of documents until all documents are deleted.
{
"status": {
"deletedCount": -1
}
}
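The batch-deletion procedure described above can be sketched as a small loop in Python. This is an illustration under stated assumptions, not the client implementation: send_command stands for any function that posts a command to the Data API and returns the parsed JSON response, and FakeServer is a hypothetical in-memory stand-in so that the loop can run without a database.

```python
from typing import Callable, Dict

BATCH_SIZE = 20  # per the text above, deleteMany deletes up to 20 documents per request


def delete_all_matching(
    send_command: Callable[[Dict], Dict],
    filter_doc: Dict,
    max_requests: int = 1000,
) -> int:
    """Reissue deleteMany until a batch is smaller than BATCH_SIZE.

    Returns the total number of deleted documents, or -1 if the server
    reports that it deleted every document in the collection.
    """
    total = 0
    for _ in range(max_requests):
        response = send_command({"deleteMany": {"filter": filter_doc}})
        deleted = response["status"]["deletedCount"]
        if deleted == -1:
            return -1
        total += deleted
        if deleted < BATCH_SIZE:
            return total
    raise RuntimeError("stopped after max_requests deleteMany requests")


class FakeServer:
    """Hypothetical stand-in that mimics the responses shown above."""

    def __init__(self, matching: int) -> None:
        self.remaining = matching
        self.requests = 0

    def __call__(self, command: Dict) -> Dict:
        self.requests += 1
        deleted = min(self.remaining, BATCH_SIZE)
        self.remaining -= deleted
        status = {"deletedCount": deleted}
        if self.remaining > 0:
            status["moreData"] = True
        return {"status": status}


server = FakeServer(matching=30)
total = delete_all_matching(server, {"a": True})
print(total, server.requests)  # prints: 30 2
```

The max_requests guard is a design choice for the sketch: it prevents an endless loop if other writers keep inserting matching documents while the deletion runs.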
Execute multiple write operations
Execute a (reusable) list of write operations on a collection with a single command.
-
Python
-
TypeScript
-
Java
-
curl
For more information, see the API reference.
bw_results = collection.bulk_write(
[
InsertMany([{"a": 1}, {"a": 2}]),
ReplaceOne(
{"z": 9},
replacement={"z": 9, "replaced": True},
upsert=True,
),
],
)
Returns:
BulkWriteResult
- A single object summarizing the whole list of requested operations. The keys in the map attributes of the result (when present) are the integer indices of the corresponding operation in the requests
iterable.
Example response
BulkWriteResult(deleted_count=0, inserted_count=3, matched_count=0, modified_count=0, upserted_count=1, upserted_ids={1: '2addd676-...'}, bulk_api_results=...)
Parameters:
Name | Type | Summary |
---|---|---|
requests |
|
An iterable over concrete subclasses of |
ordered |
|
Whether to launch the |
concurrency |
|
Maximum number of concurrent operations executing at a given time. It cannot be more than one for ordered bulk writes. |
max_time_ms |
|
A timeout, in milliseconds, for the whole bulk write. This method uses the collection-level timeout by default. You may need to increase the timeout duration depending on the number of operations. If the method call times out, there’s no guarantee about how much of the bulk write was completed. |
Example:
from astrapy import DataAPIClient
from astrapy.operations import (
InsertOne,
InsertMany,
UpdateOne,
UpdateMany,
ReplaceOne,
DeleteOne,
DeleteMany,
)
client = DataAPIClient("TOKEN")
database = client.get_database("API_ENDPOINT")
collection = database.my_collection
op1 = InsertMany([{"a": 1}, {"a": 2}])
op2 = ReplaceOne({"z": 9}, replacement={"z": 9, "replaced": True}, upsert=True)
collection.bulk_write([op1, op2])
# prints: BulkWriteResult(deleted_count=0, inserted_count=3, matched_count=0, modified_count=0, upserted_count=1, upserted_ids={1: '2addd676-...'}, bulk_api_results=...)
collection.count_documents({}, upper_bound=100)
# prints: 3
collection.distinct("replaced")
# prints: [True]
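To make the interplay between ordered and concurrency concrete, here is a small self-contained Python sketch. It does not use astrapy and is not how the client is implemented. It assumes that an ordered bulk write runs operations one after another, and that an unordered one may run up to concurrency operations at a time. The function run_bulk and the sample operations are hypothetical. As with the result attributes described above, results are keyed by the index of each operation in the input list.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List


def run_bulk(
    operations: List[Callable[[], Any]],
    ordered: bool = True,
    concurrency: int = 1,
) -> Dict[int, Any]:
    """Run operations and return their results keyed by operation index."""
    if ordered and concurrency > 1:
        # Mirrors the rule above: ordered bulk writes cannot run concurrently.
        raise ValueError("ordered bulk writes cannot use concurrency > 1")
    if ordered:
        # Sequential execution; the first failure stops the remaining operations.
        return {index: op() for index, op in enumerate(operations)}
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = {index: pool.submit(op) for index, op in enumerate(operations)}
        return {index: future.result() for index, future in futures.items()}


# Hypothetical in-memory operations standing in for InsertMany and ReplaceOne.
store: List[dict] = []


def insert_two() -> int:
    store.extend([{"a": 1}, {"a": 2}])
    return 2


def upsert_z() -> str:
    store.append({"z": 9, "replaced": True})
    return "upserted"


print(run_bulk([insert_two, upsert_z], ordered=True))  # prints: {0: 2, 1: 'upserted'}
```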
For more information, see the API reference.
const results = await collection.bulkWrite([
{ insertOne: { document: { a: '1' } } },
{ insertOne: { document: { a: '2' } } },
{ replaceOne: { filter: { z: '9' }, replacement: { z: '9', replaced: true }, upsert: true } },
]);
Parameters:
Name | Type | Summary |
---|---|---|
operations |
The operations to perform. |
|
options? |
The options for this operation. |
Options (BulkWriteOptions
):
Name | Type | Summary |
---|---|---|
|
You may set the |
|
|
You can set the Not available for ordered operations. |
|
|
The maximum time in milliseconds that the client should wait for the operation to complete. |
Returns:
Promise<BulkWriteResult<Schema>>
- A promise that resolves
to a summary of the performed operations.
Example:
import { DataAPIClient } from '@datastax/astra-db-ts';
// Reference an untyped collection
const client = new DataAPIClient('TOKEN');
const db = client.db('ENDPOINT', { namespace: 'NAMESPACE' });
const collection = db.collection('COLLECTION');
(async function () {
// Insert two documents, then upsert a replacement
await collection.bulkWrite([
{ insertOne: { document: { a: 1 } } },
{ insertOne: { document: { a: 2 } } },
{ replaceOne: { filter: { z: 9 }, replacement: { z: 9, replaced: true }, upsert: true } },
]);
// 3
await collection.countDocuments({}, 100);
// [true]
await collection.distinct('replaced');
})();
// Synchronous
BulkWriteResult bulkWrite(List<Command> commands);
BulkWriteResult bulkWrite(List<Command> commands, BulkWriteOptions options);
// Asynchronous
CompletableFuture<BulkWriteResult> bulkWriteAsync(List<Command> commands);
CompletableFuture<BulkWriteResult> bulkWriteAsync(List<Command> commands, BulkWriteOptions options);
Returns:
BulkWriteResult
- Wrapper with the list of responses for each command.
Parameters:
Name | Type | Summary |
---|---|---|
commands |
List of the generic |
|
options(optional) |
Provide list of options for those commands like |
Example:
package com.datastax.astra.client.collection;
import com.datastax.astra.client.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.model.BulkWriteOptions;
import com.datastax.astra.client.model.BulkWriteResult;
import com.datastax.astra.client.model.Command;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.internal.api.ApiResponse;
import java.util.List;
public class BulkWrite {
public static void main(String[] args) {
Collection<Document> collection = new DataAPIClient("TOKEN")
.getDatabase("API_ENDPOINT")
.getCollection("COLLECTION_NAME");
// Set a couple of Commands
Command cmd1 = Command.create("insertOne").withDocument(new Document().id(1).append("name", "hello"));
Command cmd2 = Command.create("insertOne").withDocument(new Document().id(2).append("name", "hello"));
// Set the options for the bulk write
BulkWriteOptions options1 = BulkWriteOptions.Builder.ordered(false).concurrency(1);
// Execute the queries
BulkWriteResult result = collection.bulkWrite(List.of(cmd1, cmd2), options1);
// Retrieve the LIST of responses
for(ApiResponse res : result.getResponses()) {
System.out.println(res.getData());
}
}
}
This operation has no literal equivalent in HTTP. Instead, you can execute multiple, sequential write operations.
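One way to run multiple sequential write operations over HTTP is sketched below in Python. The function run_sequential_writes and the fake_post stand-in are hypothetical. The sketch assumes that post_command sends one command to the Data API and returns the parsed JSON response, and that a failed command returns a non-empty errors entry. It stops at the first failure, which resembles an ordered bulk write.

```python
from typing import Callable, Dict, List


def run_sequential_writes(
    post_command: Callable[[Dict], Dict],
    commands: List[Dict],
) -> List[Dict]:
    """Send commands one at a time, stopping after the first error response."""
    responses: List[Dict] = []
    for command in commands:
        response = post_command(command)
        responses.append(response)
        if response.get("errors"):  # assumed shape of a failed response
            break
    return responses


# Sample commands; the insertOne document key is an assumption about the command syntax.
commands = [
    {"insertOne": {"document": {"_id": 1, "name": "hello"}}},
    {"insertOne": {"document": {"_id": 2, "name": "hello"}}},
    {"deleteOne": {"filter": {"_id": 1}}},
]


def fake_post(command: Dict) -> Dict:
    """Hypothetical stand-in for an HTTP call; always reports success."""
    return {"status": {}}


responses = run_sequential_writes(fake_post, commands)
print(len(responses))  # prints: 3
```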