Documents reference
Documents represent a single row or record of data in a keyspace.
You use the Collection class to work with documents.
If you haven’t done so already, consult the Collections reference topic for details on how to get a Collection object.
Working with dates
- Python
-
Date and datetime objects, which are instances of the Python standard library datetime.datetime and datetime.date classes, can be used anywhere in documents.

collection.insert_one({"when": datetime.datetime.now()})
collection.insert_one({"date_of_birth": datetime.date(2000, 1, 1)})

collection.update_one(
    {"registered_at": datetime.date(1999, 11, 14)},
    {"$set": {"message": "happy Sunday!"}},
)

print(
    collection.find_one(
        {"date_of_birth": {"$lt": datetime.date(2001, 1, 1)}},
        projection={"_id": False},
    )
)
# will print:
#    {'date_of_birth': datetime.datetime(2000, 1, 1, 0, 0)}

As shown in the example, read operations from a collection always return the datetime class regardless of whether a date or a datetime was provided in the insertion.
- TypeScript
-
Native JS Date objects can be used anywhere in documents to represent dates and times.

Document fields stored using the { $date: number } format are also returned as Date objects when read.

(async function () {
  // Create an untyped collection
  const collection = await db.createCollection('dates_test', { checkExists: false });

  // Insert documents with some dates
  await collection.insertOne({ dateOfBirth: new Date(1394104654000) });
  await collection.insertOne({ dateOfBirth: new Date('1863-05-28') });

  // Update a document with a date and set lastModified to now
  await collection.updateOne(
    {
      dateOfBirth: new Date('1863-05-28'),
    },
    {
      $set: { message: 'Happy Birthday!' },
      $currentDate: { lastModified: true },
    },
  );

  // Prints a date close to new Date()
  const found = await collection.findOne({ dateOfBirth: { $lt: new Date('1900-01-01') } });
  console.log(found?.lastModified);
})();
- Java
-
The Data API uses the EJSON standard to represent time-related objects. The client provides custom serializers for three types of objects: java.util.Date, java.util.Calendar, and java.time.Instant.

These objects can be used naturally in filter clauses, update clauses, and documents.
package com.datastax.astra.client.collection; import com.datastax.astra.client.Collection; import com.datastax.astra.client.DataAPIClient; import com.datastax.astra.client.model.Document; import com.datastax.astra.client.model.FindOneOptions; import com.datastax.astra.client.model.Projections; import java.time.Instant; import java.util.Calendar; import java.util.Date; import static com.datastax.astra.client.model.Filters.eq; import static com.datastax.astra.client.model.Filters.lt; import static com.datastax.astra.client.model.Updates.set; public class WorkingWithDates { public static void main(String[] args) { // Given an existing collection Collection<Document> collection = new DataAPIClient("TOKEN") .getDatabase("API_ENDPOINT") .getCollection("COLLECTION_NAME"); Calendar c = Calendar.getInstance(); collection.insertOne(new Document().append("registered_at", c)); collection.insertOne(new Document().append("date_of_birth", new Date())); collection.insertOne(new Document().append("just_a_date", Instant.now())); collection.updateOne( eq("registered_at", c), // filter clause set("message", "happy Sunday!")); // update clause collection.findOne( lt("date_of_birth", new Date(System.currentTimeMillis() - 1000 * 1000)), new FindOneOptions().projection(Projections.exclude("_id"))); } }
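The TypeScript and Java notes in this section both refer to the { $date: number } form used for dates. As a rough illustration of that representation, the following Python sketch converts a date or datetime into such an object and back. The helper names to_ejson_date and from_ejson_date are hypothetical, and the sketch assumes that the number is milliseconds since the Unix epoch and that naive values are treated as UTC. In normal use the client libraries handle this conversion for you.

```python
import datetime

# Reference point for the conversion (assumption: epoch milliseconds, UTC).
EPOCH = datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)


def to_ejson_date(value):
    """Encode a date or datetime as {"$date": epoch_milliseconds}."""
    # datetime is a subclass of date, so test for datetime first.
    if isinstance(value, datetime.datetime):
        moment = value
    else:
        # A plain date has no time component: use midnight.
        moment = datetime.datetime(value.year, value.month, value.day)
    if moment.tzinfo is None:
        # Assumption for this sketch: naive values are treated as UTC.
        moment = moment.replace(tzinfo=datetime.timezone.utc)
    return {"$date": (moment - EPOCH) // datetime.timedelta(milliseconds=1)}


def from_ejson_date(encoded):
    """Decode {"$date": epoch_milliseconds} into a timezone-aware datetime."""
    return EPOCH + datetime.timedelta(milliseconds=encoded["$date"])


if __name__ == "__main__":
    encoded = to_ejson_date(datetime.date(2000, 1, 1))
    print(encoded)
    print(from_ejson_date(encoded))
```

Because a plain date is encoded as midnight, decoding always yields a datetime. That is consistent with the Python note in this section, where a document inserted with a date is read back as a datetime.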
Working with document IDs
Documents in a collection are always identified by an ID that is unique within the collection.
The ID can be any of several types, such as a string, integer, or datetime. However, the uuid and ObjectId types are recommended instead.
The Data API supports uuid identifiers up to version 8, as well as ObjectId identifiers as provided by the bson library.
These can appear anywhere in the document, not only in its _id field. Moreover, different types of identifier can appear in different parts of the same document. And these identifiers can be part of filtering clauses and update/replace directives just like any other data type.
One of the optional settings of a collection is the "default ID type": that is, it is possible to specify what kind of identifiers the server should supply
for documents without an explicit _id field. (For details, see the create_collection method and Data API createCollection command in the Collections reference.) Regardless of the defaultId setting, however, identifiers of any type can be explicitly provided for documents at any time and will be honored by the API, for example when inserting documents.
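Because identifiers of several UUID versions can coexist in a collection, or even in one document, it can be useful to check which version a given value has. The following minimal sketch uses only the Python standard library uuid module, not the client classes, to read the version of two identifiers that appear in the examples of this section. The helper name describe_uuid is hypothetical.

```python
import uuid


def describe_uuid(text):
    """Return (canonical string, version) for a UUID given as text."""
    parsed = uuid.UUID(text)
    return str(parsed), parsed.version


if __name__ == "__main__":
    for sample in (
        "018e77bc-648d-8795-a0e2-1cad0fdd53f5",
        "1eeeaf80-e333-6613-b42f-f739b95106e6",
    ):
        print(describe_uuid(sample))
    # A freshly generated random UUID reports version 4.
    print(uuid.uuid4().version)
```

The version is encoded in the first hexadecimal digit of the third group of the identifier, which is why it can be read back from a stored value without knowing how the value was generated.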
- Python
-
from astrapy.ids import (
    ObjectId,
    uuid1,
    uuid3,
    uuid4,
    uuid5,
    uuid6,
    uuid7,
    uuid8,
    UUID,
)

AstraPy recognizes uuid versions 1 through 8 (with the exception of 2) as provided by the uuid and uuid6 Python libraries, as well as the ObjectId from the bson package. Furthermore, for convenience, these same utilities are exposed in AstraPy directly, as shown in the example above.

You can then generate new identifiers with statements such as new_id = uuid8() or new_obj_id = ObjectId(). Keep in mind that all uuid versions are instances of the same class (UUID), which exposes a version property, should you need to access it.

Here is a short example:

collection.insert_one({"_id": uuid8(), "tag": "new_id_v_8"})
collection.insert_one(
    {"_id": UUID("018e77bc-648d-8795-a0e2-1cad0fdd53f5"), "tag": "id_v_8"}
)
collection.insert_one({"_id": ObjectId(), "tag": "new_obj_id"})
collection.insert_one(
    {"_id": ObjectId("6601fb0f83ffc5f51ba22b88"), "tag": "obj_id"}
)

# Store a UUID in a field other than _id
collection.find_one_and_update(
    {"_id": ObjectId("6601fb0f83ffc5f51ba22b88")},
    {"$set": {"item_inventory_id": UUID("1eeeaf80-e333-6613-b42f-f739b95106e6")}},
)
- TypeScript
-
import { UUID, ObjectId } from '@datastax/astra-db-ts';

astra-db-ts provides the UUID and ObjectId classes for using and generating new identifiers. Note that these are not the same as the classes exported from the bson or uuid libraries, but rather are custom classes that must be imported from the astra-db-ts package.

You can generate new identifiers using UUID.v4(), UUID.v7(), or new ObjectId(). The UUID methods all return an instance of the same class, which exposes a version property, should you need to access it. Identifiers may also be constructed from a string representation if custom generation is desired.

Here is a short example of the concepts:

import { DataAPIClient, UUID, ObjectId } from '@datastax/astra-db-ts';

// Schema for the collection
interface Person {
  _id: UUID | ObjectId;
  name: string;
  friendId?: UUID;
}

// Assumes collection is an existing collection typed with the Person schema
(async function () {
  // Insert documents w/ various IDs
  await collection.insertOne({ name: 'John', _id: UUID.v4() });
  await collection.insertOne({ name: 'Jane', _id: new UUID('016b1cac-14ce-660e-8974-026c927b9b91') });

  await collection.insertOne({ name: 'Dan', _id: new ObjectId() });
  await collection.insertOne({ name: 'Tim', _id: new ObjectId('65fd9b52d7fabba03349d013') });

  // Update a document with a UUID in a non-_id field
  await collection.updateOne(
    { name: 'John' },
    { $set: { friendId: new UUID('016b1cac-14ce-660e-8974-026c927b9b91') } },
  );

  // Find a document by a UUID in a non-_id field
  const john = await collection.findOne({ name: 'John' });
  const jane = await collection.findOne({ _id: john!.friendId });

  // Prints 'Jane 016b1cac-14ce-660e-8974-026c927b9b91 6'
  console.log(jane?.name, jane?._id.toString(), (<UUID>jane?._id).version);
})();
- Java
-
To cope with different implementations of UUID (v6 and v7 especially), dedicated classes have been defined.

When a unique identifier is retrieved from the server, it is returned as a uuid and is converted to the appropriate UUID class, based on the class definition in the defaultId option.

The ObjectId class is taken from the Bson package and is used to represent the ObjectId type.

package com.datastax.astra.client.collection;

import com.datastax.astra.client.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.client.model.ObjectId;
import com.datastax.astra.client.model.UUIDv6;
import com.datastax.astra.client.model.UUIDv7;

import java.time.Instant;
import java.util.UUID;

import static com.datastax.astra.client.model.Filters.eq;
import static com.datastax.astra.client.model.Updates.set;

public class WorkingWithDocumentIds {
    public static void main(String[] args) {
        // Given an existing collection
        Collection<Document> collection = new DataAPIClient("TOKEN")
                .getDatabase("API_ENDPOINT")
                .getCollection("COLLECTION_NAME");

        // Ids can be different Json scalar
        // ('defaultId' options NOT set for collection)
        new Document().id("abc");
        new Document().id(123);
        new Document().id(Instant.now());

        // Working with UUIDv4
        new Document().id(UUID.randomUUID());

        // Working with UUIDv6
        collection.insertOne(new Document().id(new UUIDv6()).append("tag", "new_id_v_6"));
        UUID uuidv4 = UUID.fromString("018e77bc-648d-8795-a0e2-1cad0fdd53f5");
        collection.insertOne(new Document().id(new UUIDv6(uuidv4)).append("tag", "id_v_8"));

        // Working with UUIDv7
        collection.insertOne(new Document().id(new UUIDv7()).append("tag", "new_id_v_7"));

        // Working with ObjectIds
        collection.insertOne(new Document().id(new ObjectId()).append("tag", "obj_id"));
        collection.insertOne(new Document().id(new ObjectId("6601fb0f83ffc5f51ba22b88")).append("tag", "obj_id"));

        collection.findOneAndUpdate(
                eq((new ObjectId("6601fb0f83ffc5f51ba22b88"))),
                set("item_inventory_id", UUID.fromString("1eeeaf80-e333-6613-b42f-f739b95106e6")));
    }
}

The native java.util.UUID class generates identifiers that follow the UUID v4 standard.
Insert a single document
Insert a single document into a collection.
- Python
-
View this topic in more detail on the API Reference.
insert_result = collection.insert_one({"name": "Jane Doe"})

Insert a document with an associated vector.

insert_result = collection.insert_one(
    {
        "name": "Jane Doe",
        "$vector": [.08, .68, .30],
    },
)

Insert a document and generate a vector automatically.

insert_result = collection.insert_one(
    {
        "name": "Jane Doe",
        "$vectorize": "Text to vectorize",
    },
)

Returns:

InsertOneResult - An object representing the response from the database after the insert operation. It includes information about the success of the operation and details of the inserted document.

Example response:

InsertOneResult(raw_results=[{'status': {'insertedIds': ['92b4c4f4-db44-4440-b4c4-f4db44e440b8']}}], inserted_id='92b4c4f4-db44-4440-b4c4-f4db44e440b8')

Parameters:

- document (Dict): The dictionary expressing the document to insert. The _id field of the document can be left out, in which case it will be created automatically.
- vector (Optional[Iterable[float]]): A vector (a list of numbers appropriate for the collection) for the document. Passing this parameter is equivalent to providing the vector in the "$vector" field of the document itself; however, the two are mutually exclusive.
- vectorize (Optional[str]): A string to be vectorized. This only works for collections associated with an embedding service.
- max_time_ms (Optional[int]): A timeout, in milliseconds, for the underlying HTTP request. If not passed, the collection-level setting is used instead.

Example:

# Insert a document with a specific ID
response1 = collection.insert_one(
    {
        "_id": 101,
        "name": "John Doe",
        "$vector": [.12, .52, .32],
    },
)

# Insert a document without specifying an ID
# so that _id is generated automatically
response2 = collection.insert_one(
    {
        "name": "Jane Doe",
        "$vector": [.08, .68, .30],
    },
)
- TypeScript
-
View this topic in more detail on the API Reference.
const result = await collection.insertOne({ name: 'Jane Doe' });

Insert a document with an associated vector.

const result = await collection.insertOne(
  {
    name: 'Jane Doe',
    $vector: [.08, .68, .30],
  },
);

Insert a document and generate a vector automatically.

const result = await collection.insertOne(
  {
    name: 'Jane Doe',
    $vectorize: 'Text to vectorize',
  },
);

Parameters:

- document: The document to insert. If the document does not have an _id field, the server generates one.
- options? (InsertOneOptions): The options for this operation.

Options (InsertOneOptions):

- vector (number[]): The vector for the document. Equivalent to providing the vector in the $vector field of the document itself; however, the two are mutually exclusive.
- vectorize (string): A string to be vectorized. This only works for collections associated with an embedding service.
- maxTimeMS (number): The maximum time in milliseconds that the client should wait for the operation to complete.

Returns:

Promise<InsertOneResult<Schema>> - A promise that resolves to the inserted ID.

Example:

(async function () {
  // Insert a document with a specific ID
  await collection.insertOne({ _id: '1', name: 'John Doe' });

  // Insert a document with an autogenerated ID
  await collection.insertOne({ name: 'Jane Doe' });

  // Insert a document with a vector
  await collection.insertOne({ name: 'Jane Doe', $vector: [.12, .52, .32] });
})();
- Java
-
Operations on documents are performed at the Collection level. For details on each signature, see the Collection JavaDoc.

Collection is a generic class. The default type is Document, but you can specify your own type, and the object will be serialized by Jackson.

Most methods come in synchronous and asynchronous flavors. The asynchronous version is suffixed with Async and returns a CompletableFuture.

InsertOneResult insertOne(DOC document);
InsertOneResult insertOne(DOC document, float[] embeddings);

// Asynchronous equivalents
CompletableFuture<InsertOneResult> insertOneAsync(DOC document);
CompletableFuture<InsertOneResult> insertOneAsync(DOC document, float[] embeddings);

Returns:

InsertOneResult - Wrapper with the inserted document ID.

Parameters:

- document (DOC): Object representing the document to insert. The _id field of the document can be left out, in which case it will be created automatically. If the collection is associated with an embedding service, a vector is generated automatically from the $vectorize field.
- embeddings (float[]): A vector of embeddings (a list of numbers appropriate for the collection) for the document. Passing this parameter is equivalent to providing the vector in the $vector field of the document itself.

Example:
package com.datastax.astra.client.collection; import com.datastax.astra.client.Collection; import com.datastax.astra.client.DataAPIClient; import com.datastax.astra.client.model.Document; import com.datastax.astra.client.model.InsertOneOptions; import com.datastax.astra.client.model.InsertOneResult; import com.fasterxml.jackson.annotation.JsonProperty; import lombok.AllArgsConstructor; import lombok.Data; public class InsertOne { @Data @AllArgsConstructor public static class Product { @JsonProperty("_id") private String id; private String name; } public static void main(String[] args) { // Given an existing collection Collection<Document> collectionDoc = new DataAPIClient("TOKEN") .getDatabase("API_ENDPOINT") .getCollection("COLLECTION_NAME"); // Insert a document Document doc1 = new Document("1").append("name", "joe"); InsertOneResult res1 = collectionDoc.insertOne(doc1); System.out.println(res1.getInsertedId()); // should be "1" // Insert a document with embeddings Document doc2 = new Document("2").append("name", "joe"); collectionDoc.insertOne(doc2, new float[] {.1f, .2f}); // Given an existing collection Collection<Product> collectionProduct = new DataAPIClient("TOKEN") .getDatabase("API_ENDPOINT") .getCollection("COLLECTION2_NAME", Product.class); // Insert a document with custom bean collectionProduct.insertOne(new Product("1", "joe")); collectionProduct.insertOne(new Product("2", "joe"), new float[] {.1f, .2f}); } }
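The parameter descriptions in this section note that passing a vector as a separate argument is equivalent to setting the $vector field of the document, and the Python and TypeScript descriptions add that the two are mutually exclusive. A minimal Python sketch of that rule follows. The attach_vector helper is hypothetical, written only for illustration, and is not part of any client library; raising ValueError is likewise an assumption.

```python
def attach_vector(document, vector=None):
    """Return a copy of document with the vector stored under "$vector".

    Hypothetical helper: mirrors the rule that a vector can be given either
    as a separate argument or inside the document, but not both.
    """
    if vector is None:
        # Nothing to attach: return an unchanged copy.
        return dict(document)
    if "$vector" in document:
        raise ValueError(
            "Provide the vector either as an argument or in '$vector', not both."
        )
    merged = dict(document)
    merged["$vector"] = list(vector)
    return merged


if __name__ == "__main__":
    print(attach_vector({"name": "Jane Doe"}, [0.08, 0.68, 0.30]))
```

Returning a copy rather than mutating the input keeps the caller's dictionary untouched, which is a design choice of this sketch only.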
Insert many documents
Insert multiple documents into a collection.
- Python
-
View this topic in more detail on the API Reference.
response = collection.insert_many(
    [
        {
            "_id": 101,
            "name": "John Doe",
            "$vector": [.12, .52, .32],
        },
        {
            # ID is generated automatically
            "name": "Jane Doe",
            "$vector": [.08, .68, .30],
        },
    ],
)

Insert multiple documents and generate vectors automatically.

response = collection.insert_many(
    [
        {
            "name": "John Doe",
            "$vectorize": "Text to vectorize for John Doe",
        },
        {
            "name": "Jane Doe",
            "$vectorize": "Text to vectorize for Jane Doe",
        },
    ],
)

Returns:

InsertManyResult - An object representing the response from the database after the insert operation. It includes information about the success of the operation and details of the inserted documents.

Example response:

InsertManyResult(raw_results=[{'status': {'insertedIds': [101, '81077d86-05dc-43ca-877d-8605dce3ca4d']}}], inserted_ids=[101, '81077d86-05dc-43ca-877d-8605dce3ca4d'])

Parameters:

- documents (Iterable[Dict[str, Any]]): An iterable of dictionaries, each a document to insert. Documents may specify their _id field or leave it out, in which case it will be added automatically.
- vectors (Optional[Iterable[Optional[Iterable[float]]]]): An optional list of vectors (as many vectors as the provided documents) to associate with the documents when inserting. Each vector is added to the corresponding document prior to insertion in the database. The list can be a mixture of None and vectors, in which case some documents will not have a vector, unless it is already specified in their "$vector" field. Passing vectors this way is equivalent to using the "$vector" field of the documents; however, the two are mutually exclusive.
- vectorize (Optional[Iterable[Optional[str]]]): An optional list of strings to be vectorized. This only works for collections associated with an embedding service.
- ordered (bool): If False (default), the insertions can occur in arbitrary order and possibly concurrently. If True, they are processed sequentially. If you don’t need ordered inserts, DataStax recommends setting this parameter to False for faster performance.
- chunk_size (Optional[int]): How many documents to include in a single API request. The default and maximum value is 20.
- concurrency (Optional[int]): Maximum number of concurrent requests to the API at a given time. It cannot be more than one for ordered insertions.
- max_time_ms (Optional[int]): A timeout, in milliseconds, for the operation. If not passed, the collection-level setting is used instead. If you are inserting many documents, this method will require multiple HTTP requests, and you may need to increase the timeout duration for the method to complete successfully.

Unless there are specific reasons not to, it is recommended to prefer ordered = False, as it results in a much higher insert throughput than an equivalent ordered insertion.

Example:

collection.insert_many([{"a": 10}, {"a": 5}, {"b": [True, False, False]}])

collection.insert_many(
    [{"seq": i} for i in range(50)],
    concurrency=5,
)

collection.insert_many(
    [
        {"tag": "a", "$vector": [1, 2]},
        {"tag": "b", "$vector": [3, 4]},
    ]
)
- TypeScript
-
View this topic in more detail on the API Reference.
const result = await collection.insertMany([
  {
    _id: '1',
    name: 'John Doe',
    $vector: [.12, .52, .32],
  },
  {
    name: 'Jane Doe',
    $vector: [.08, .68, .30],
  },
], {
  ordered: true,
});

Insert multiple documents and generate vectors automatically.

const result = await collection.insertMany([
  {
    name: 'John Doe',
    $vectorize: 'Text to vectorize for John Doe',
  },
  {
    name: 'Jane Doe',
    $vectorize: 'Text to vectorize for Jane Doe',
  },
], {
  ordered: true,
});

Parameters:

- documents: The documents to insert. If any document does not have an _id field, the server generates one.
- options? (InsertManyOptions): The options for this operation.

Options (InsertManyOptions):

- ordered (boolean): You may set the ordered option to true to stop the operation after the first error; otherwise all documents may be parallelized and processed in arbitrary order, which can greatly improve performance.
- concurrency (number): You can set the concurrency option to control how many network requests are made in parallel on unordered insertions. Defaults to 8. Not available for ordered insertions.
- chunkSize (number): Controls how many documents are sent in each network request. The default and maximum value is 20.
- vectors ((number[] | null | undefined)[]): An array of vectors to associate with each document. If a vector is null or undefined, the document will not have a vector. If provided, the array length must equal the number of documents. Equivalent to providing the vector in the $vector field of the documents themselves; however, the two are mutually exclusive.
- vectorize (string[]): An array of strings to be vectorized. This only works for collections associated with an embedding service.
- maxTimeMS (number): The maximum time in milliseconds that the client should wait for the operation to complete.

Unless there are specific reasons not to, it is recommended to leave ordered set to false, as it results in a much higher insert throughput than an equivalent ordered insertion.

Returns:

Promise<InsertManyResult<Schema>> - A promise that resolves to the inserted IDs.

Example:

import { InsertManyError } from '@datastax/astra-db-ts';

(async function () {
  try {
    // Insert many documents
    await collection.insertMany([
      { _id: '1', name: 'John Doe' },
      { name: 'Jane Doe' }, // Will autogen ID
    ], { ordered: true });

    // Insert many with vectors
    await collection.insertMany([
      { name: 'John Doe', $vector: [.12, .52, .32] },
      { name: 'Jane Doe', $vector: [.32, .52, .12] },
    ]);
  } catch (e) {
    // On failure, inspect what was inserted before the error
    if (e instanceof InsertManyError) {
      console.log(e.partialResult);
    }
  }
})();
- Java
-
Operations on documents are performed at the Collection level. For details on each signature, see the Collection JavaDoc.

Collection is a generic class. The default type is Document, but you can specify your own type, and the object will be serialized by Jackson.

Most methods come in synchronous and asynchronous flavors. The asynchronous version is suffixed with Async and returns a CompletableFuture.

// Synchronous
InsertManyResult insertMany(List<? extends DOC> documents);
InsertManyResult insertMany(List<? extends DOC> documents, InsertManyOptions options);

// Asynchronous
CompletableFuture<InsertManyResult> insertManyAsync(List<? extends DOC> docList);
CompletableFuture<InsertManyResult> insertManyAsync(List<? extends DOC> docList, InsertManyOptions options);

Returns:

InsertManyResult - Wrapper with the list of inserted document IDs.

Parameters:

- docList (List<? extends DOC>): A list of documents to insert. Documents may specify their _id field or leave it out, in which case it will be added automatically. If the collection is associated with an embedding service, vectors are generated automatically from the $vectorize field in each document. You can also set the $vector field directly.
- options (optional): Sets the different options for the insert operation. The options are ordered, concurrency, and chunkSize.

The Java insertMany operation can take as many documents as fit in your JVM memory. It splits the documents into chunks of chunkSize and sends them to the server in a distributed way through an ExecutorService. The default and maximum value of chunkSize is 20. To set the size of the executor, use concurrency.

InsertManyOptions.Builder
    .chunkSize(20)  // batch size, 20 is max
    .concurrency(8) // concurrent insertions
    .ordered(false) // unordered insertions
    .build();

If not provided, the default values are chunkSize=20, concurrency=1, and ordered=false.

It is recommended to work with ordered=false for performance reasons, since chunks can then be inserted in parallel.

Consider always providing InsertManyOptions, even when using the defaults, because it makes the settings visible to readers.

Example:
package com.datastax.astra.client.collection; import com.datastax.astra.client.Collection; import com.datastax.astra.client.DataAPIClient; import com.datastax.astra.client.model.Document; import com.datastax.astra.client.model.InsertManyOptions; import com.datastax.astra.client.model.InsertManyResult; import com.datastax.astra.client.model.InsertOneResult; import com.fasterxml.jackson.annotation.JsonProperty; import lombok.AllArgsConstructor; import lombok.Data; import java.util.List; public class InsertMany { @Data @AllArgsConstructor public static class Product { @JsonProperty("_id") private String id; private String name; } public static void main(String[] args) { // Given an existing collection Collection<Document> collectionDoc = new DataAPIClient("TOKEN") .getDatabase("API_ENDPOINT") .getCollection("COLLECTION_NAME"); // Insert a document Document doc1 = new Document("1").append("name", "joe"); Document doc2 = new Document("2").append("name", "joe"); InsertManyResult res1 = collectionDoc.insertMany(List.of(doc1, doc2)); System.out.println("Identifiers inserted: " + res1.getInsertedIds()); // Given an existing collection Collection<Product> collectionProduct = new DataAPIClient("TOKEN") .getDatabase("API_ENDPOINT") .getCollection("COLLECTION2_NAME", Product.class); // Insert a document with embeddings InsertManyOptions options = new InsertManyOptions() .chunkSize(20) // how many process per request .concurrency(1) // parallel processing .ordered(false) // allows parallel processing .timeout(1000); // timeout in millis InsertManyResult res2 = collectionProduct.insertMany( List.of(new Product("1", "joe"), new Product("2", "joe")), options); } }
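The descriptions in this section say that insert-many operations split the documents into chunks (20 per request by default) and, for unordered insertions, can send several chunks concurrently. The following Python sketch shows that general pattern using only the standard library. It is not how any of the clients is implemented: chunked, insert_in_chunks, and fake_send are hypothetical names, and fake_send stands in for a real API request.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(documents, chunk_size=20):
    """Split a list of documents into consecutive chunks."""
    return [documents[i:i + chunk_size] for i in range(0, len(documents), chunk_size)]


def insert_in_chunks(documents, send_chunk, chunk_size=20, concurrency=1, ordered=False):
    """Send each chunk through send_chunk and collect the returned IDs."""
    if ordered and concurrency > 1:
        # Ordered insertions are processed sequentially.
        raise ValueError("Ordered insertions cannot use concurrency > 1.")
    chunks = chunked(documents, chunk_size)
    if ordered or concurrency == 1:
        results = [send_chunk(chunk) for chunk in chunks]
    else:
        with ThreadPoolExecutor(max_workers=concurrency) as executor:
            # map() runs chunks concurrently but yields results in chunk order.
            results = list(executor.map(send_chunk, chunks))
    return [inserted_id for ids in results for inserted_id in ids]


def fake_send(chunk):
    """Stand-in for an API request: returns each document's "seq" as its ID."""
    return [doc["seq"] for doc in chunk]


docs = [{"seq": i} for i in range(50)]
ids = insert_in_chunks(docs, fake_send, concurrency=5)
print(len(chunked(docs)), len(ids))
```

With 50 documents and the default chunk size, the sketch produces three chunks (20, 20, and 10 documents). A real unordered insertion gives no guarantee about the order in which chunks reach the database, even though this sketch collects the returned IDs in chunk order.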
Find a document
Retrieve a single document from a collection using various options.
- Python
-
View this topic in more detail on the API Reference.
Retrieve a single document from a collection by its _id.

document = collection.find_one({"_id": 101})

Retrieve a single document from a collection by any attribute, as long as it is covered by the collection’s indexing configuration.

As noted in The Indexing option in the Collections reference topic, any field that is part of a subsequent filter or sort operation must be an indexed field. If you elected to not index certain or all fields when you created the collection, you cannot reference that field in a filter/sort query.

document = collection.find_one({"location": "warehouse_C"})

Retrieve a single document from a collection by an arbitrary filtering clause.

document = collection.find_one({"tag": {"$exists": True}})

Retrieve the most similar document to a given vector.

result = collection.find_one({}, vector=[.12, .52, .32])

Generate a vector and retrieve the most similar document.

result = collection.find_one({}, vectorize="Text to vectorize")

Retrieve only specific fields from a document.

result = collection.find_one({"_id": 101}, projection={"name": True})

Returns:

Union[Dict[str, Any], None] - Either the found document as a dictionary or None if no matching document is found.

Example response:

{'_id': 101, 'name': 'John Doe', '$vector': [0.12, 0.52, 0.32]}

Parameters:

- filter (Optional[Dict[str, Any]]): A predicate expressed as a dictionary according to the Data API filter syntax. Examples are {}, {"name": "John"}, {"price": {"$lt": 100}}, {"$and": [{"name": "John"}, {"price": {"$lt": 100}}]}. See Data API operators for the full list of operators.
- projection (Optional[Union[Iterable[str], Dict[str, bool]]]): Used to select a subset of fields in the documents being returned. The projection can be: an iterable over the included field names; a dictionary {field_name: True} to positively select certain fields; or a dictionary {field_name: False} to exclude specific fields from the response. Special document fields (e.g. _id, $vector) are controlled individually. The default projection does not necessarily include all fields of the document. See the projection examples for more on this parameter.
- vector (Optional[Iterable[float]]): A suitable vector, meaning a list of float numbers of the appropriate dimensionality, to perform vector search, that is, approximate nearest neighbor (ANN) search, extracting the most similar document in the collection matching the filter. This parameter cannot be used together with sort. See the sort examples for more on this parameter.
- vectorize (Optional[str]): A string to vectorize before performing a vector search. This only works for collections associated with an embedding service. This parameter cannot be used together with vector.
- include_similarity (Optional[bool]): A boolean to request the numeric value of the similarity to be returned as an added "$similarity" key in the returned document. Can only be used for vector ANN search, i.e. when either vector is supplied or the sort parameter has the shape {"$vector": …}.
- sort (Optional[Dict[str, Any]]): A dictionary that controls the order in which the documents are returned. See the discussion about sorting for details.
- max_time_ms (Optional[int]): A timeout, in milliseconds, for the underlying HTTP request. This method uses the collection-level timeout by default.

Example:

collection.find_one()
# prints: {'_id': '68d1e515-...', 'seq': 37}
collection.find_one({"seq": 10})
# prints: {'_id': 'd560e217-...', 'seq': 10}
collection.find_one({"seq": 1011})
# (returns None for no matches)
collection.find_one(projection={"seq": False})
# prints: {'_id': '68d1e515-...'}
collection.find_one(
    {},
    sort={"seq": astrapy.constants.SortDocuments.DESCENDING},
)
# prints: {'_id': '97e85f81-...', 'seq': 69}
collection.find_one(vector=[1, 0], projection={"*": True})
# prints: {'_id': '...', 'tag': 'D', '$vector': [4.0, 1.0]}
- TypeScript
-
View this topic in more detail on the API Reference.
Retrieve a single document from a collection by its _id.

const doc = await collection.findOne({ _id: '101' });

Retrieve a single document from a collection by any attribute, as long as it is covered by the collection’s indexing configuration.

As noted in The Indexing option in the Collections reference topic, any field that is part of a subsequent filter or sort operation must be an indexed field. If you elected to not index certain or all fields when you created the collection, you cannot reference that field in a filter/sort query.

const doc = await collection.findOne({ location: 'warehouse_C' });

Retrieve a single document from a collection by an arbitrary filtering clause.

const doc = await collection.findOne({ tag: { $exists: true } });

Retrieve the most similar document to a given vector.

const doc = await collection.findOne({}, { vector: [.12, .52, .32] });

Generate a vector and retrieve the most similar document.

const doc = await collection.findOne({}, { vectorize: 'Text to vectorize' });

Retrieve only specific fields from a document.

const doc = await collection.findOne({ _id: '101' }, { projection: { name: 1 } });

Parameters:

- filter: A filter to select the document to find.
- options? (FindOneOptions): The options for this operation.

Options (FindOneOptions):

- projection: Specifies which fields should be included/excluded in the returned documents. Defaults to including all fields. When specifying a projection, it’s the user’s responsibility to handle the return type carefully. Consider type-casting.
- includeSimilarity (boolean): Requests the numeric value of the similarity to be returned as an added $similarity key in the returned document. Can only be used when performing a vector search.
- sort: Specifies the order in which the documents are returned. Defaults to the order in which the documents are stored on disk.
- vector (number[]): An optional vector to use to perform a vector search on the collection to find the closest matching document. Equivalent to setting the $vector field in the sort field itself. The two are interchangeable, but mutually exclusive. If you really need to use both, you can set the $vector field in the sort object directly.
- vectorize (string): A string to vectorize before performing a vector search. This only works for collections associated with an embedding service. This parameter cannot be used together with vector.
- maxTimeMS (number): The maximum time in milliseconds that the client should wait for the operation to complete.

Returns:

Promise<FoundDoc<Schema> | null> - A promise that resolves to the found document (including $similarity if applicable), or null if no matching document is found.

Example:

(async function () {
  // Insert some documents
  await collection.insertMany([
    { name: 'John', age: 30, $vector: [1, 1, 1, 1, 1] },
    { name: 'Jane', age: 25, },
    { name: 'Dave', age: 40, },
  ]);

  // Unpredictably prints one of their names
  const unpredictable = await collection.findOne({});
  console.log(unpredictable?.name);

  // Failed find by name (null)
  const failed = await collection.findOne({ name: 'Carrie' });
  console.log(failed);

  // Find by $gt age (Dave)
  const dave = await collection.findOne({ age: { $gt: 30 } });
  console.log(dave?.name);

  // Find by sorting by age (Jane)
  const jane = await collection.findOne({}, { sort: { age: 1 } });
  console.log(jane?.name);

  // Find by vector similarity (John, 1)
  const john = await collection.findOne({}, { vector: [1, 1, 1, 1, 1], includeSimilarity: true });
  console.log(john?.name, john?.$similarity);
})();
- Java
-
Operations on documents are performed at
Collectionlevel, to get details on each signature you can access the Collection JavaDOC.Collection is a generic class, default type is
Documentbut you can specify your own type and the object will be serialized by Jackson.Most methods come with synchronous and asynchronous flavors where the asynchronous version will be suffixed by
Asyncand return aCompletableFuture.// Synchronous Optional<T> findOne(Filter filter); Optional<T> findOne(Filter filter, FindOneOptions options); Optional<T> findById(Object id); // build the filter for you // Asynchronous CompletableFuture<Optional<DOC>> findOneAsync(Filter filter); CompletableFuture<Optional<DOC>> findOneAsync(Filter filter, FindOneOptions options); CompletableFuture<Optional<DOC>> findByIdAsync(Filter filter);Returns:
[
Optional<T>] - Return the working document matching the filter orOptional.empty()if no document is found.Parameters:
Name Type Summary filterFilterCriteria list to filter the document. The filter is a JSON object that can contain any valid Data API filter expression.
options(optional)Set the different options for the
findOneoperation. The options are asortclause, someprojectionto retrieve sub parts of the documents and a flag to include the similarity in case of a vector search.Things you must know about Data API requests:
-
A
Filteris a JSON expression that accepts different operators listed on the Data API command page. -
A
Projectionis a list of flags that indicate if you want to retrieve a field or not -
The
sortclause is used either for similarity search or order results -
In
options you indicate whether to include the similarity in the result
{ "findOne": { "filter": { "$and": [ {"field2": {"$gt": 10}}, {"field3": {"$lt": 20}}, {"field4": {"$eq": "value"}} ] }, "projection": { "_id": 0, "field": 1, "field2": 1, "field3": 1 }, "sort": { "$vector": [ 0.25, 0.25, 0.25,0.25, 0.25] }, "options": { "includeSimilarity": true } } }To execute this exact query with Java:
collection.findOne(
    Filters.and(
        Filters.gt("field2", 10),
        Filters.lt("field3", 20),
        Filters.eq("field4", "value")
    ),
    new FindOneOptions()
        .projection(Projections.include("field", "field2", "field3"))
        .projection(Projections.exclude("_id"))
        .vector(new float[] {0.25f, 0.25f, 0.25f, 0.25f, 0.25f})
        .includeSimilarity()
);

// With static imports
collection.findOne(
    and(
        gt("field2", 10),
        lt("field3", 20),
        eq("field4", "value")
    ),
    vector(new float[] {0.25f, 0.25f, 0.25f, 0.25f, 0.25f})
        .projection(Projections.include("field", "field2", "field3"))
        .projection(Projections.exclude("_id"))
        .includeSimilarity()
);
Example:
package com.datastax.astra.client.collection; import com.datastax.astra.client.Collection; import com.datastax.astra.client.DataAPIClient; import com.datastax.astra.client.DataAPIOptions; import com.datastax.astra.client.model.Document; import com.datastax.astra.client.model.Filter; import com.datastax.astra.client.model.Filters; import com.datastax.astra.client.model.FindOneOptions; import java.util.Optional; import static com.datastax.astra.client.model.Filters.and; import static com.datastax.astra.client.model.Filters.eq; import static com.datastax.astra.client.model.Filters.gt; import static com.datastax.astra.client.model.Filters.lt; import static com.datastax.astra.client.model.Projections.exclude; import static com.datastax.astra.client.model.Projections.include; public class FindOne { public static void main(String[] args) { // Given an existing collection Collection<Document> collection = new DataAPIClient("TOKEN") .getDatabase("API_ENDPOINT") .getCollection("COLLECTION_NAME"); // Complete FindOne Filter filter = Filters.and( Filters.gt("field2", 10), lt("field3", 20), Filters.eq("field4", "value")); FindOneOptions options = new FindOneOptions() .projection(include("field", "field2", "field3")) .projection(exclude("_id")) .sort(new float[] {0.25f, 0.25f, 0.25f,0.25f, 0.25f}) .includeSimilarity(); Optional<Document> result = collection.findOne(filter, options); // with the import Static Magic collection.findOne(and( gt("field2", 10), lt("field3", 20), eq("field4", "value")), new FindOneOptions().sort(new float[] {0.25f, 0.25f, 0.25f,0.25f, 0.25f}) .projection(include("field", "field2", "field3")) .projection(exclude("_id")) .includeSimilarity() ); // find one with a vectorize collection.findOne(and( gt("field2", 10), lt("field3", 20), eq("field4", "value")), new FindOneOptions().sort("Life is too short to be living somebody else's dream.") .projection(include("field", "field2", "field3")) .projection(exclude("_id")) .includeSimilarity() ); 
collection.insertOne(new Document() .append("field", "value") .append("field2", 15) .append("field3", 15) .vectorize("Life is too short to be living somebody else's dream.")); } } -
Find documents using filtering options
Iterate over documents in a collection matching a given filter.
- Python
-
View this topic in more detail on the API Reference.
doc_iterator = collection.find({"category": "house_appliance"}, limit=10)Iterate over the documents most similar to a given query vector.
doc_iterator = collection.find({}, vector=[0.55, -0.40, 0.08], limit=5)Generate a vector and iterate over the documents most similar to it.
doc_iterator = collection.find({}, vectorize="Text to vectorize", limit=5)Returns:
Cursor- A cursor for iterating over documents. An AstraPy cursor can be used in a for loop, and provides a few additional features.Example responseCursor("vector_collection", new, retrieved: 0)Parameters:
| Name | Type | Summary |
|---|---|---|
| filter | Optional[Dict[str, Any]] | A predicate expressed as a dictionary according to the Data API filter syntax. Examples are {}, {"name": "John"}, {"price": {"$lt": 100}}, {"$and": [{"name": "John"}, {"price": {"$lt": 100}}]}. See Data API operators for the full list of operators. |
| projection | Optional[Union[Iterable[str], Dict[str, bool]]] | Used to select a subset of fields in the documents being returned. The projection can be: an iterable over the included field names; a dictionary {field_name: True} to positively select certain fields; or a dictionary {field_name: False} if one wants to exclude specific fields from the response. Special document fields (e.g. _id, $vector) are controlled individually. The default projection does not necessarily include all fields of the document. See the projection examples for more on this parameter. |
| skip | Optional[int] | With this integer parameter, what would be the first skip documents returned by the query are discarded, and the results start from the (skip+1)-th document. This parameter can be used only in conjunction with an explicit sort criterion of the ascending/descending type (i.e. it cannot be used when not sorting, nor with vector-based ANN search). |
| limit | Optional[int] | This (integer) parameter sets a limit over how many documents are returned. Once limit is reached (or the cursor is exhausted for lack of matching documents), nothing more is returned. |
| vector | Optional[Iterable[float]] | A suitable vector, meaning a list of float numbers of the appropriate dimensionality, to perform vector search; that is, approximate nearest neighbor (ANN) search. When running similarity search on a collection, no other sorting criteria can be specified. Moreover, there is an upper bound to the number of documents that can be returned. For details, see the Data API Limits. |
| vectorize | Optional[str] | A string to vectorize before performing a vector search. This only works for collections associated with an embedding service. This parameter cannot be used together with vector. |
| include_similarity | Optional[bool] | A boolean to request the numeric value of the similarity to be returned as an added "$similarity" key in each returned document. Can only be used for vector ANN search, i.e. when either vector is supplied or the sort parameter has the shape {"$vector": …}. |
| sort | Optional[Dict[str, Any]] | With this dictionary parameter one can control the order the documents are returned. See the discussion about sorting, including the note on upper bounds on the number of visited documents, for details. |
| max_time_ms | Optional[int] | A timeout, in milliseconds, for each underlying HTTP request used to fetch documents as you iterate over the cursor. This method uses the collection-level timeout by default. |
Example:
# Find all documents in the collection
list(collection.find({}))

# Find all documents in the collection with a specific field value
list(collection.find({
    "a": 123,
}))

# Find all documents in the collection that match a compound filter expression
list(collection.find({
    "$and": [
        {"f1": 1},
        {"f2": 2},
    ]
}))

# Same as the preceding example, but using the implicit AND operator
list(collection.find({
    "f1": 1,
    "f2": 2,
}))

# Use the "less than" operator in the filter expression
list(collection.find({
    "$and": [
        {"name": "John"},
        {"price": {"$lt": 100}},
    ]
}))
- TypeScript
-
View this topic in more detail on the API Reference.
const cursor = collection.find({ category: 'house_appliance' }, { limit: 10 });Iterate over the documents most similar to a given query vector.
const cursor = collection.find({}, { vector: [0.55, -0.40, 0.08], limit: 5 });Generate a vector and iterate over the documents most similar to it.
const cursor = collection.find({}, { vectorize: 'Text to vectorize', limit: 5 });Parameters:
Name Type Summary filter
A filter to select the document to find.
options?
The options for this operation.
Options (
FindOptions):Name Type Summary Specifies which fields should be included/excluded in the returned documents. Defaults to including all fields.
When specifying a projection, it’s the user’s responsibility to handle the return type carefully. Consider type-casting.
booleanRequests the numeric value of the similarity to be returned as an added
$similaritykey in the returned document.Can only be used when performing a vector search.
Specifies the order in which the documents are returned. Defaults to the order in which the documents are stored on disk.
number[]An optional vector to use to perform a vector search on the collection to find the closest matching document.
Equivalent to setting the
$vectorfield in thesortfield itself—The two are interchangeable, but mutually exclusive.If you really need to use both, you can set the
$vectorfield in the sort object directly.stringA string to vectorize before performing a vector search. This only works for collections associated with an embedding service. This parameter cannot be used together with
vector.numberThe number of documents to skip before returning the first document.
numberThe maximum number of documents to return in the lifetime of the cursor.
numberThe maximum time in milliseconds that the client should wait for the operation to complete for each single one of the underlying HTTP requests.
Returns:
FindCursor<FoundDoc<Schema>>- A cursor for iterating over the matching documents.Example:
(async function () { // Insert some documents await collection.insertMany([ { name: 'John', age: 30, $vector: [1, 1, 1, 1, 1] }, { name: 'Jane', age: 25, }, { name: 'Dave', age: 40, }, ]); // Gets all 3 in some order const unpredictable = await collection.find({}).toArray(); console.log(unpredictable); // Failed find by name ([]) const matchless = await collection.find({ name: 'Carrie' }).toArray(); console.log(matchless); // Find by $gt age (John, Dave) const gtAgeCursor = collection.find({ age: { $gt: 25 } }); for await (const doc of gtAgeCursor) { console.log(doc.name); } // Find by sorting by age (Jane, John, Dave) const sortedAgeCursor = collection.find({}, { sort: { age: 1 } }); await sortedAgeCursor.forEach(console.log); // Find first by vector similarity (John, 1) const john = await collection.find({}, { vector: [1, 1, 1, 1, 1], includeSimilarity: true }).next(); console.log(john?.name, john?.$similarity); })(); - Java
-
Operations on documents are performed at
Collectionlevel. To get details on each signature you can access the Collection JavaDOC.Collection is a generic class, default type is
Documentbut you can specify your own type and the object will be serialized by Jackson.Most methods come with synchronous and asynchronous flavors where the asynchronous version will be suffixed by
Asyncand return aCompletableFuture.// Synchronous FindIterable<T> find(Filter filter, FindOptions options); // Helper to build filter and options above ^ FindIterable<T> find(FindOptions options); // no filter FindIterable<T> find(Filter filter); // default options FindIterable<T> find(); // default options + no filters FindIterable<T> find(float[] vector, int limit); // semantic search FindIterable<T> find(Filter filter, float[] vector, int limit);Returns:
FindIterable<T> - A cursor that fetches up to the first 20 documents immediately and retrieves the rest as needed. As the name states, it is an Iterable.
Parameters:
Name Type Summary filterFilterCriteria list to filter the document. The filter is a JSON object that can contain any valid Data API filter expression.
options(optional)Set the different options for the
findoperation. The options are asortclause, someprojectionto retrieve sub parts of the documents and a flag to include the similarity in case of a vector search.The
FindIterableis anIterableand can be used in aforloop to iterate over the documents.The
FindIterable will fetch the documents in chunks of 20, and will fetch more as needed. The FindIterable is a lazy iterator, meaning that it will only fetch the next chunk of documents when needed. It provides the method
.all() to exhaust it, but this method should be used with caution.
Example:
package com.datastax.astra.client.collection; import com.datastax.astra.client.Collection; import com.datastax.astra.client.DataAPIClient; import com.datastax.astra.client.model.Document; import com.datastax.astra.client.model.Filter; import com.datastax.astra.client.model.Filters; import com.datastax.astra.client.model.FindIterable; import com.datastax.astra.client.model.FindOptions; import com.datastax.astra.client.model.Sorts; import static com.datastax.astra.client.model.Filters.lt; import static com.datastax.astra.client.model.Projections.exclude; import static com.datastax.astra.client.model.Projections.include; public class Find { public static void main(String[] args) { // Given an existing collection Collection<Document> collection = new DataAPIClient("TOKEN") .getDatabase("API_ENDPOINT") .getCollection("COLLECTION_NAME"); // Building a filter Filter filter = Filters.and( Filters.gt("field2", 10), lt("field3", 20), Filters.eq("field4", "value")); // Find Options FindOptions options = new FindOptions() .projection(include("field", "field2", "field3")) // select fields .projection(exclude("_id")) // exclude some fields .sort(new float[] {0.25f, 0.25f, 0.25f,0.25f, 0.25f}) // similarity vector .skip(1) // skip first item .limit(10) // stop after 10 items (max records) .pageState("pageState") // used for pagination .includeSimilarity(); // include similarity // Execute a find operation FindIterable<Document> result = collection.find(filter, options); // Iterate over the result for (Document document : result) { System.out.println(document); } } }
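The lazy, chunked fetching described above can be illustrated with a small client-independent sketch in Python. Nothing here calls the Data API: `iterate_lazily`, `fake_fetch_page`, and the integer page state are hypothetical stand-ins, and the page size of 20 simply mirrors the chunk size mentioned in the text.

```python
from typing import Callable, Iterator, List, Optional, Tuple

# A page is a list of documents plus the state needed to request the next page.
Page = Tuple[List[dict], Optional[int]]


def iterate_lazily(fetch_page: Callable[[Optional[int]], Page]) -> Iterator[dict]:
    """Yield documents one by one, requesting a new page only when needed."""
    page_state: Optional[int] = None
    while True:
        documents, page_state = fetch_page(page_state)
        yield from documents
        if page_state is None:  # the server reported no further pages
            return


# Hypothetical in-memory stand-in for a remote collection of 45 documents.
FAKE_DOCUMENTS = [{"_id": i} for i in range(45)]
PAGE_SIZE = 20
requests_made: List[int] = []  # records the offset of every simulated request


def fake_fetch_page(page_state: Optional[int]) -> Page:
    start = page_state or 0
    requests_made.append(start)
    chunk = FAKE_DOCUMENTS[start:start + PAGE_SIZE]
    has_more = start + PAGE_SIZE < len(FAKE_DOCUMENTS)
    return chunk, (start + PAGE_SIZE if has_more else None)


cursor = iterate_lazily(fake_fetch_page)
first = next(cursor)  # only the first page has been requested at this point
```

Reading a single document triggers one request; exhausting the iterator (the equivalent of calling `.all()`) triggers all remaining requests, which is why exhausting a cursor over a large result set deserves caution.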
Example values for sort operations
- Python
-
When no particular order is required:
sort={} # (default when parameter not provided)When sorting by a certain value in ascending/descending order:
from astrapy.constants import SortDocuments sort={"field": SortDocuments.ASCENDING} sort={"field": SortDocuments.DESCENDING}When sorting first by "field" and then by "subfield" (while modern Python versions preserve the order of dictionaries, it is suggested for clarity to employ a
collections.OrderedDictin these cases):sort={ "field": SortDocuments.ASCENDING, "subfield": SortDocuments.ASCENDING, }When running a vector similarity (ANN) search:
sort={"$vector": [0.4, 0.15, -0.5]}Generate a vector to perform a vector similarity search. The collection must be associated with an embedding service.
sort={"$vectorize": "Text to vectorize"}Some combinations of arguments impose an implicit upper bound on the number of documents that are returned by the Data API. More specifically:
-
Vector ANN searches cannot return more than a certain number of documents; currently, 1000 per search operation.
-
When using a sort criterion of the ascending/descending type, the Data API returns a smaller number of documents, currently set to 20, and stops there. The returned documents are the top results across the whole collection according to the requested criterion.
Keep in mind these provisions even when subsequently running a command such as
.distinct()on a cursor.When not specifying sorting criteria at all (by vector or otherwise), the cursor can scroll through an arbitrary number of documents as the Data API and the client periodically exchange new chunks of documents.
The behavior of the cursor, in the case that documents have been added/removed after the
find was started, depends on database internals. It is not guaranteed, nor excluded, that such "real-time" changes in the data would be picked up by the cursor.
Example:
from astrapy import DataAPIClient
import astrapy

client = DataAPIClient("TOKEN")
database = client.get_database("DB_API_ENDPOINT")
collection = database.my_collection

filter = {"seq": {"$exists": True}}
for doc in collection.find(filter, projection={"seq": True}, limit=5):
    print(doc["seq"])
...
# will print e.g.:
#   37
#   35
#   10
#   36
#   27

cursor1 = collection.find(
    {},
    limit=4,
    sort={"seq": astrapy.constants.SortDocuments.DESCENDING},
)
[doc["_id"] for doc in cursor1]
# prints: ['97e85f81-...', '1581efe4-...', '...', '...']

cursor2 = collection.find({}, limit=3)
cursor2.distinct("seq")
# prints: [37, 35, 10]

collection.insert_many([
    {"tag": "A", "$vector": [4, 5]},
    {"tag": "B", "$vector": [3, 4]},
    {"tag": "C", "$vector": [3, 2]},
    {"tag": "D", "$vector": [4, 1]},
    {"tag": "E", "$vector": [2, 5]},
])
ann_tags = [
    document["tag"]
    for document in collection.find(
        {},
        limit=3,
        vector=[3, 3],
    )
]
ann_tags
# prints: ['A', 'B', 'C']
# (assuming the collection has metric VectorMetric.COSINE)
-
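To see why the order of keys in a multi-field sort matters, the following sketch emulates an ascending/descending sort on an in-memory list. It does not call the Data API: `sort_documents` is a hypothetical helper, and the plain integers 1 and -1 are assumed stand-ins for the ascending and descending constants.

```python
from functools import cmp_to_key

ASCENDING, DESCENDING = 1, -1  # assumed stand-ins for the client constants


def sort_documents(documents, sort):
    """Order documents by each sort key in turn; dict order sets key priority."""
    def compare(left, right):
        for field, direction in sort.items():
            if left[field] != right[field]:
                # A later key is consulted only when all earlier keys are tied.
                return direction if left[field] > right[field] else -direction
        return 0
    return sorted(documents, key=cmp_to_key(compare))


people = [
    {"name": "Jane", "age": 25},
    {"name": "Dave", "age": 40},
    {"name": "Jack", "age": 40},
]

# Age ascending first, then name descending to break ties.
ordered = sort_documents(people, {"age": ASCENDING, "name": DESCENDING})
names = [person["name"] for person in ordered]
print(names)  # ['Jane', 'Jack', 'Dave']
```

Because the second key only breaks ties left by the first, swapping the two keys in the dictionary can produce a different ordering.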
- TypeScript
-
Sortis very weakly typed by default—seeStrictSort<Schema>for a stronger typed alternative that provides full autocomplete as well.When no particular order is required:
{ sort: {} } // (default when parameter not provided)When sorting by a certain value in ascending/descending order:
{ sort: { field: +1 } } // ascending { sort: { field: -1 } } // descendingWhen sorting first by "field" and then by "subfield" (order matters! ES2015+ guarantees string keys in order of insertion):
{ sort: { field: 1, subfield: 1 } }When running a vector similarity (ANN) search:
{ sort: { $vector: [0.4, 0.15, -0.5] } }Generate a vector to perform a vector similarity search. The collection must be associated with an embedding service.
{ sort: { $vectorize: "Text to vectorize" } }Some combinations of arguments impose an implicit upper bound on the number of documents that are returned by the Data API. More specifically:
-
Vector ANN searches cannot return more than a certain number of documents; currently, 1000 per search operation.
-
When using a sort criterion of the ascending/descending type, the Data API returns a smaller number of documents, currently set to 20, and stops there. The returned documents are the top results across the whole collection according to the requested criterion.
Keep in mind these provisions even when subsequently running a command such as
.distinct(), which uses a cursor underneath.When not specifying sorting criteria at all (by vector or otherwise), the cursor can scroll through an arbitrary number of documents as the Data API and the client periodically exchange new chunks of documents.
The behavior of the cursor — in the case that documents have been added/removed after the
findwas started — depends on database internals. It is not guaranteed, nor excluded, that such "real-time" changes in the data would be picked up by the cursor.Example:
import { DataAPIClient } from '@datastax/astra-db-ts'; // Reference an untyped collection const client = new DataAPIClient('TOKEN'); const db = client.db('DB_API_ENDPOINT', { keyspace: 'DB_KEYSPACE' }); const collection = db.collection('COLLECTION'); (async function () { // Insert some documents await collection.insertMany([ { name: 'Jane', age: 25, $vector: [1.0, 1.0, 1.0, 1.0, 1.0] }, { name: 'Dave', age: 40, $vector: [0.4, 0.5, 0.6, 0.7, 0.8] }, { name: 'Jack', age: 40, $vector: [0.1, 0.9, 0.0, 0.5, 0.7] }, ]); // Sort by age ascending, then by name descending (Jane, Jack, Dave) const sorted1 = await collection.find({}, { sort: { age: 1, name: -1 } }).toArray(); console.log(sorted1.map(d => d.name)); // Sort by vector distance (Jane, Dave, Jack) const sorted2 = await collection.find({}, { vector: [1, 1, 1, 1, 1] }).toArray(); console.log(sorted2.map(d => d.name)); })(); -
- Java
-
Use
sort()operations in different options only if you need them; it is optional.It is important to keep the order when chaining multiple sorts.
Sort s1 = Sorts.ascending("field1"); Sort s2 = Sorts.descending("field2"); FindOptions.Builder.sort(s1, s2);When running a vector similarity (ANN) search:
FindOptions.Builder .sort(new float[] {0.4f, 0.15f, -0.5f});Generate a vector to perform a vector similarity search.
FindOptions.Builder .sort("Text to vectorize");Some combinations of arguments impose an implicit upper bound on the number of documents that are returned by the Data API. More specifically:
-
Vector ANN searches cannot return more than a certain number of documents; currently, 1000 per search operation.
-
When using a sort criterion of the ascending/descending type, the Data API returns a smaller number of documents, currently set to 20, and stops there. The returned documents are the top results across the whole collection according to the requested criterion.
Keep in mind these provisions even when subsequently running a command such as
.distinct()on a cursor.When not specifying sorting criteria at all (by vector or otherwise), the cursor can scroll through an arbitrary number of documents as the Data API and the client periodically exchange new chunks of documents.
The behavior of the cursor, in the case that documents have been added/removed after the
find was started, depends on database internals. It is not guaranteed, nor excluded, that such "real-time" changes in the data would be picked up by the cursor.
Example:
package com.datastax.astra.client.collection; import com.datastax.astra.client.Collection; import com.datastax.astra.client.DataAPIClient; import com.datastax.astra.client.model.Document; import com.datastax.astra.client.model.FindOptions; import com.datastax.astra.client.model.Sort; import com.datastax.astra.client.model.Sorts; import static com.datastax.astra.client.model.Filters.lt; public class WorkingWithSorts { public static void main(String[] args) { // Given an existing collection Collection<Document> collection = new DataAPIClient("TOKEN") .getDatabase("API_ENDPOINT") .getCollection("COLLECTION_NAME"); // Sort Clause for a vector Sorts.vector(new float[] {0.25f, 0.25f, 0.25f,0.25f, 0.25f});; // Sort Clause for other fields Sort s1 = Sorts.ascending("field1"); Sort s2 = Sorts.descending("field2"); // Build the sort clause new FindOptions().sort(s1, s2); // Adding vector new FindOptions().sort(new float[] {0.25f, 0.25f, 0.25f,0.25f, 0.25f}, s1, s2); } } -
Example values for projection operations
Certain document operations — such as finding one or multiple documents, find-and-update,
find-and-replace, and find-and-delete — allow the use of a projection option to control
which part of the document(s) is returned. The projection can generally take one of two
forms: either specifying which fields to include or which fields to exclude.
If no projection, or an empty projection, is specified, a default projection is applied by the Data API.
This default projection includes at least the identifier (_id) of the document
and all its "regular" fields, which are those not starting with a dollar sign.
However, future versions of the Data API might
exclude other fields (such as $vector) from the documents by default.
When a projection is provided, specific, individually overridable
inclusion and exclusion defaults apply for "special" fields,
such as _id, $vector, and $vectorize.
Conversely, for the regular fields the projection must either list included
fields or excluded ones and cannot be a mixture of the two types of specifications.
In order to optimize the response size, a recommended performance improvement is to always provide, when reading, an explicit projection tailored to the needs of the application. If an application relies on the presence of special fields such as $vector in the returned documents, the projection should include them explicitly. A quick, if possibly suboptimal, way to ensure the presence of fields is to use the star-projection {"*": true}.
A projection is expressed as a mapping of field names to boolean values.
To return the document ID, field1, and field2:
{"_id": true, "field1": true, "field2": true}
Specific fields can be excluded, keeping any other field found in the document:
{"field1": false, "field2": false}
Fields specified in the projection but not encountered in the document are simply ignored for that document.
The projection cannot mix include and exclude clauses for regular fields. In other words, it must either have all true or all false values. If a projection has false values, all non-mentioned fields found in the document are included; conversely, if it has true values, all non-mentioned fields in the document are excluded.
Special fields (_id, $vector, and $vectorize)
behave differently, in that they have their own default and their presence
can be controlled in any way within the projection.
For example, the _id field is included by default and can be excluded even in
an include-clause projection ({"_id": false, "field1": true}); conversely,
the $vector field is excluded by default and can be included even in an exclude
projection ({"field1": false, "$vector": true}).
So, the following are all valid projections:
{"_id": true, "field1": true, "field2": true}
{"_id": false, "field1": true, "field2": true}
{"_id": false, "field1": false, "field2": false}
{"_id": true, "field1": false, "field2": false}
{"_id": true, "field1": true, "field2": true, "$vector": true}
{"_id": true, "field1": true, "field2": true, "$vector": false}
{"_id": false, "field1": true, "field2": true, "$vector": true}
{"_id": false, "field1": true, "field2": true, "$vector": false}
{"_id": false, "field1": false, "field2": false, "$vector": true}
{"_id": false, "field1": false, "field2": false, "$vector": false}
{"_id": true, "field1": false, "field2": false, "$vector": true}
{"_id": true, "field1": false, "field2": false, "$vector": false}
However, the following projection is invalid and will result in an API error:
// Invalid:
{"field1": true, "field2": false}
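The include/exclude rule above can be checked on the client side before sending a request. The following is a minimal sketch, not part of any client library: `is_valid_projection` is a hypothetical helper that only covers the mixing rule for regular fields and leaves the star-projection and overlapping paths out of scope.

```python
SPECIAL_FIELDS = {"_id", "$vector", "$vectorize"}


def is_valid_projection(projection):
    """Return True unless regular fields mix inclusion and exclusion."""
    regular_flags = {
        bool(flag)
        for field, flag in projection.items()
        if field not in SPECIAL_FIELDS  # special fields may take any value
    }
    # Zero or one distinct flag means all-include, all-exclude, or no regular fields.
    return len(regular_flags) <= 1


print(is_valid_projection({"_id": False, "field1": True, "$vector": True}))  # True
print(is_valid_projection({"field1": True, "field2": False}))  # False
```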
The special projection path "*" ("star-projection"), which must be the only key in the projection,
represents the whole of the document. With the following projection all of the document
is returned:
{"*": true}
Conversely, with the following any document would return as {}:
{"*": false}
The values in a projection map can be objects, booleans, or numbers (decimal or integer), but are then treated as booleans by the API. The following two examples include and exclude the four fields respectively:
{"field1": true, "field2": 1, "field3": 90.0, "field4": {"keep": "yes!"}}
{"field1": false, "field2": 0, "field3": 0.0, "field4": {}}
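For the two example projections above, the boolean interpretation happens to coincide with Python truthiness. The sketch below relies on that assumption only for these specific values; `as_flags` is a hypothetical helper, and the API's handling of other value types is not covered here.

```python
import json


def as_flags(projection):
    """Map each projection value to the boolean it stands for (assumed truthiness)."""
    return {field: bool(value) for field, value in projection.items()}


include_all = json.loads(
    '{"field1": true, "field2": 1, "field3": 90.0, "field4": {"keep": "yes!"}}'
)
exclude_all = json.loads(
    '{"field1": false, "field2": 0, "field3": 0.0, "field4": {}}'
)

print(as_flags(include_all))  # every field maps to True
print(as_flags(exclude_all))  # every field maps to False
```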
Passing null-like things (such as {}, null or 0) for the whole projection
has the same effect as not passing it altogether.
The projection cannot include the special $similarity key — which is not part
of the document but is rather computed during vector ANN queries and is controlled
through a specific includeSimilarity parameter in the search payload.
However, for array fields, a $slice can be provided to specify which elements of the array
to return. It can be in one of the following formats:
// Return the first two elements
{"arr": {"$slice": 2}}
// Return the last two elements
{"arr": {"$slice": -2}}
// Skip 4 elements (from 0th index), return the next 2
{"arr": {"$slice": [4, 2]}}
// Skip backward 4 elements (from the end), return next 2 elements (forward)
{"arr": {"$slice": [-4, 2]}}
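The four $slice forms listed above can be emulated on a plain Python list to check which elements to expect. This is a local sketch only: `apply_slice` is a hypothetical helper, and behavior for out-of-range values is an assumption rather than documented API behavior.

```python
def apply_slice(array, spec):
    """Emulate the $slice forms on a Python list."""
    if isinstance(spec, int):
        # Positive n: first n elements; negative n: last |n| elements.
        return array[:spec] if spec >= 0 else array[spec:]
    skip, count = spec
    # A negative skip counts backward from the end of the array.
    start = max(len(array) + skip, 0) if skip < 0 else skip
    return array[start:start + count]


arr = list(range(10))  # [0, 1, ..., 9]
print(apply_slice(arr, 2))        # [0, 1]
print(apply_slice(arr, -2))       # [8, 9]
print(apply_slice(arr, [4, 2]))   # [4, 5]
print(apply_slice(arr, [-4, 2]))  # [6, 7]
```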
The projection can also refer to nested fields: in that case, keys in a subdocument will be included/excluded as requested. If all keys of an existing subdocument are excluded, the document will be returned with the subdocument still present, but consisting of an empty object:
Given the following document:
{
  "_id": "z",
  "a": {
    "a1": 10,
    "a2": 20
  }
}
Here the result of different projections can be seen:
| Projection | Result |
|---|---|
|
|
|
|
|
|
|
|
|
|
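The nested-exclusion behavior described above, where a subdocument with all keys excluded is returned as an empty object, can be reproduced locally. The sketch below handles exclusion paths only; `exclude_paths` is a hypothetical helper and does not reflect how the Data API is implemented.

```python
import copy


def exclude_paths(document, paths):
    """Return a copy of the document with the given dotted paths removed."""
    result = copy.deepcopy(document)
    for path in paths:
        *parents, leaf = path.split(".")
        node = result
        for key in parents:
            node = node.get(key) if isinstance(node, dict) else None
            if node is None:
                break  # the path does not exist in this document; ignore it
        if isinstance(node, dict):
            node.pop(leaf, None)  # the parent object is kept even if it becomes empty
    return result


doc = {"_id": "z", "a": {"a1": 10, "a2": 20}}
print(exclude_paths(doc, ["a.a1"]))          # {'_id': 'z', 'a': {'a2': 20}}
print(exclude_paths(doc, ["a.a1", "a.a2"]))  # {'_id': 'z', 'a': {}}
```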
Referencing overlapping (sub/)paths in the projection may lead to (possibly) conflicting clauses. These are rejected, so for instance this would yield an API error:
// Invalid:
{"a.a1": true, "a": true}
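Overlapping paths such as the ones above can be detected before a request is sent. The following sketch uses a hypothetical helper, `find_overlapping_paths`, that flags any key that is a dotted parent of another key in the same projection.

```python
def find_overlapping_paths(projection):
    """Return (parent, child) pairs where one key is a dotted prefix of another."""
    keys = sorted(projection)  # a parent path always sorts before its children
    overlaps = []
    for index, shorter in enumerate(keys):
        for longer in keys[index + 1:]:
            if longer.startswith(shorter + "."):
                overlaps.append((shorter, longer))
    return overlaps


print(find_overlapping_paths({"a.a1": True, "a": True}))     # [('a', 'a.a1')]
print(find_overlapping_paths({"a.a1": True, "a.a2": True}))  # []
```

Matching on the prefix followed by a dot avoids false positives for sibling fields that merely share leading characters, such as "a" and "ab".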
- Python
-
For the Python client, the type of the
projection argument can be not only a Dict[str, Any], in compliance with the general provisions above, but also a list (or other iterable) of key names. In this case, all listed keys are implied to be included in the projection. So, the two following statements are equivalent:
document = collection.find_one(
    {"_id": 101},
    projection={"name": True, "city": True},
)
document = collection.find_one(
    {"_id": 101},
    projection=["name", "city"],
)
- TypeScript
-
The Typescript client simply takes in an untyped Plain Old JavaScript Object (POJO) for the
projectionparameter.However, it offers a
StrictProjection<Schema>type that provides full autocomplete and type checking for your document schema.import { StrictProjection } from '@datastax/astra-db-ts'; const doc = await collection.findOne({}, { projection: { 'name': true, 'address.city': true, }, }); interface MySchema { name: string, address: { city: string, state: string, }, } const doc = await collection.findOne({}, { projection: { 'name': 1, 'address.city': 1, // @ts-expect-error -'address.car'does not exist in typeStrictProjection<MySchema>'address.car': 0, // @ts-expect-error - Type{ $slice: number }is not assignable to typeboolean | 0 | 1 | undefined'address.state': { $slice: 3 } } satisfies StrictProjection<MySchema>, }); - Java
-
To support the projection mechanism, the different
Optionsclasses provide theprojectionmethod in the helpers. This method takes an array ofProjectionclasses providing the field name and a boolean flag to choose between inclusion and exclusion.Projection p1 = new Projection("field1", true); Projection p2 = new Projection("field2", true); FindOptions options1 = FindOptions.Builder.projection(p1, p2);This syntax can be simplified by leveraging the syntactic sugar called
Projections:FindOptions options2 = FindOptions.Builder .projection(Projections.include("field1", "field2")); FindOptions options3 = FindOptions.Builder .projection(Projections.exclude("field1", "field2"));When it comes to support of
$slice for array fields, the Projection class provides a method as well:
// {"arr": {"$slice": 2}}
Projection sliceOnlyStart = Projections.slice("arr", 2, null);
// {"arr": {"$slice": [-4, 2]}}
Projection sliceOnlyRange = Projections.slice("arr", -4, 2);
// You can then use them freely in the different builders
FindOptions options4 = FindOptions.Builder
    .projection(sliceOnlyStart);
Find and update a document
Locate a document matching a filter condition and apply changes to it, returning the document itself.
- Python
-
View this topic in more detail on the API Reference.
collection.find_one_and_update(
    {"Marco": {"$exists": True}},
    {"$set": {"title": "Mr."}},
)
Locate and update a document, returning the document itself, creating a new one if nothing is found.
collection.find_one_and_update(
    {"Marco": {"$exists": True}},
    {"$set": {"title": "Mr."}},
    upsert=True,
)
Returns:
Dict[str, Any] - The document that was found, either before or after the update (or a projection thereof, as requested). If no matches are found, None is returned.
Example response:
{'_id': 999, 'Marco': 'Polo'}
Parameters:
Name | Type | Summary
filter | Dict[str, Any] | A predicate expressed as a dictionary according to the Data API filter syntax. Examples are {}, {"name": "John"}, {"price": {"$lt": 100}}, {"$and": [{"name": "John"}, {"price": {"$lt": 100}}]}. See Data API operators for the full list of operators.
update | Dict[str, Any] | The update prescription to apply to the document, expressed as a dictionary as per Data API syntax. Examples are {"$set": {"field": "value"}}, {"$inc": {"counter": 10}} and {"$unset": {"field": ""}}. See Data API operators for the full syntax.
projection | Optional[Union[Iterable[str], Dict[str, bool]]] | Selects a subset of fields in the returned document. The projection can be an iterable over the included field names, a dictionary {field_name: True} to positively select certain fields, or a dictionary {field_name: False} to exclude specific fields from the response. Special document fields (e.g. _id, $vector) are controlled individually. The default projection does not necessarily include all fields of the document. See the projection examples for more on this parameter.
vector | Optional[Iterable[float]] | A vector (a list of floats of the appropriate dimensionality) used as the sorting criterion in a vector search, that is, an approximate nearest neighbor (ANN) search. The matched document, if any, is the one most similar to the provided vector. This parameter cannot be used together with sort. See the sort examples for more on this parameter.
vectorize | Optional[str] | A string to be vectorized and used as the sorting criterion in a vector search. This parameter cannot be used together with sort. See the sort examples for more on this parameter.
sort | Optional[Dict[str, Any]] | Controls the sorting order of the documents matching the filter, which determines which document comes first and is therefore the one updated. See the sort examples for more on sorting.
upsert | bool = False | Controls the behavior in absence of matches. If True, a new document (resulting from applying the update to an empty document) is inserted if no matches are found on the collection. If False, the operation silently does nothing in case of no matches.
return_document | str | Controls which document is returned: if set to ReturnDocument.BEFORE, or the string "before", the document as found in the database is returned; if set to ReturnDocument.AFTER, or the string "after", the updated document is returned. The default is "before".
max_time_ms | Optional[int] | A timeout, in milliseconds, for the underlying HTTP request. This method uses the collection-level timeout by default.
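The interplay of upsert and return_document described in the parameters above can be illustrated with a toy in-memory model. This is only a sketch, not how the client or the Data API is implemented: find_one_and_update_model is a hypothetical name, it handles only equality filters and $set updates, and it assumes (going by the upsert output in this section's example, where the filter's name field appears in the new document) that equality fields from the filter are carried into an upserted document. Returning None for an upsert combined with "before" is likewise an assumption.

```python
import copy
from typing import Any, Dict, List, Optional


def find_one_and_update_model(
    docs: List[Dict[str, Any]],
    flt: Dict[str, Any],
    update: Dict[str, Any],
    *,
    upsert: bool = False,
    return_document: str = "before",
) -> Optional[Dict[str, Any]]:
    """Toy model: equality-only filters, "$set"-only updates (hypothetical helper)."""
    changes = update.get("$set", {})
    match = next(
        (d for d in docs if all(d.get(k) == v for k, v in flt.items())),
        None,
    )
    if match is None:
        if not upsert:
            return None  # no match and no upsert: nothing happens
        # Assumption: the new document starts from the filter's equality fields.
        new_doc = {**flt, **changes}
        docs.append(new_doc)
        # Assumption: there is no "before" document to return for an upsert.
        return copy.deepcopy(new_doc) if return_document == "after" else None
    before = copy.deepcopy(match)
    match.update(changes)
    return copy.deepcopy(match) if return_document == "after" else before


docs = [{"_id": 1, "Marco": "Polo"}]
# Default return_document="before": the pre-update document comes back.
print(find_one_and_update_model(docs, {"Marco": "Polo"}, {"$set": {"title": "Mr."}}))
# With "after", the updated document comes back instead.
print(find_one_and_update_model(
    docs, {"title": "Mr."}, {"$set": {"rank": 3}}, return_document="after"
))
```

The model returns deep copies so that later updates to the stored documents do not alter a value that was already returned.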
Example:
from astrapy import DataAPIClient
import astrapy

client = DataAPIClient("TOKEN")
database = client.get_database("DB_API_ENDPOINT")
collection = database.my_collection

collection.insert_one({"Marco": "Polo"})

collection.find_one_and_update(
    {"Marco": {"$exists": True}},
    {"$set": {"title": "Mr."}},
)
# prints: {'_id': 'a80106f2-...', 'Marco': 'Polo'}
collection.find_one_and_update(
    {"title": "Mr."},
    {"$inc": {"rank": 3}},
    projection={"title": True, "rank": True},
    return_document=astrapy.constants.ReturnDocument.AFTER,
)
# prints: {'_id': 'a80106f2-...', 'title': 'Mr.', 'rank': 3}
collection.find_one_and_update(
    {"name": "Johnny"},
    {"$set": {"rank": 0}},
    return_document=astrapy.constants.ReturnDocument.AFTER,
)
# (returns None for no matches)
collection.find_one_and_update(
    {"name": "Johnny"},
    {"$set": {"rank": 0}},
    upsert=True,
    return_document=astrapy.constants.ReturnDocument.AFTER,
)
# prints: {'_id': 'cb4ef2ab-...', 'name': 'Johnny', 'rank': 0}
- TypeScript
-
View this topic in more detail on the API Reference.
const docBefore = await collection.findOneAndUpdate( { $and: [{ name: 'Jesse' }, { gender: 'M' }] }, { $set: { title: 'Mr.' } }, { returnDocument: 'before' }, );Locate and update a document, returning the document itself, creating a new one if nothing is found.
const docBefore = await collection.findOneAndUpdate( { $and: [{ name: 'Jesse' }, { gender: 'M' }] }, { $set: { title: 'Mr.' } }, { upsert: true, returnDocument: 'before' }, );Parameters:
Name Type Summary filter
A filter to select the document to update.
update
The update to apply to the selected document.
options
The options for this operation.
Options (
FindOneAndUpdateOptions):Name Type Summary 'before' | 'after'Specifies whether to return the original or updated document.
booleanIf true, creates a new document if no document matches the filter.
Specifies which fields should be included/excluded in the returned documents. Defaults to including all fields.
When specifying a projection, it’s the user’s responsibility to handle the return type carefully. Consider type-casting.
Can only be used when performing a vector search.
Specifies the order in which the documents are returned. Defaults to the order in which the documents are stored on disk.
number[]An optional vector to use to perform a vector search on the collection to find the closest matching document.
Equivalent to setting the
$vectorfield in thesortfield itself—The two are interchangeable, but mutually exclusive.If you really need to use both, you can set the
$vectorfield in the sort object directly.stringA string to be vectorized and used as the sorting criterion in a vector search.
Equivalent to setting the
$vectorizefield in thesortfield itself. The two are interchangeable, but mutually exclusive.If you really need to use both, you can set the
$vectorizefield in the sort object directly.numberThe maximum time in milliseconds that the client should wait for the operation to complete for each single one of the underlying HTTP requests.
booleanWhen true, returns alongside the document, an ok field with a value of 1 if the command executed successfully.
Returns:
Promise<WithId<Schema> | null>- The document before/after the update, depending on the type ofreturnDocument, ornullif no matches are found.Example:
import { DataAPIClient } from '@datastax/astra-db-ts'; // Reference an untyped collection const client = new DataAPIClient('TOKEN'); const db = client.db('DB_API_ENDPOINT', { keyspace: 'DB_KEYSPACE' }); const collection = db.collection('COLLECTION'); (async function () { // Insert a document await collection.insertOne({ 'Marco': 'Polo' }); // Prints 'Mr.' const updated1 = await collection.findOneAndUpdate( { 'Marco': 'Polo' }, { $set: { title: 'Mr.' } }, { returnDocument: 'after' }, ); console.log(updated1?.title); // Prints { _id: ..., title: 'Mr.', rank: 3 } const updated2 = await collection.findOneAndUpdate( { title: 'Mr.' }, { $inc: { rank: 3 } }, { projection: { title: 1, rank: 1 }, returnDocument: 'after' }, ); console.log(updated2); // Prints null const updated3 = await collection.findOneAndUpdate( { name: 'Johnny' }, { $set: { rank: 0 } }, { returnDocument: 'after' }, ); console.log(updated3); // Prints { _id: ..., name: 'Johnny', rank: 0 } const updated4 = await collection.findOneAndUpdate( { name: 'Johnny' }, { $set: { rank: 0 } }, { upsert: true, returnDocument: 'after' }, ); console.log(updated4); })(); - Java
-
Operations on documents are performed at the Collection level. For details on each signature, see the Collection JavaDoc.
Collection is a generic class. The default type is Document, but you can specify your own type, and the object will be serialized by Jackson.
Most methods come in synchronous and asynchronous flavors. The asynchronous version is suffixed with Async and returns a CompletableFuture.
// Synchronous
Optional<T> findOneAndUpdate(Filter filter, Update update);
// Asynchronous
CompletableFuture<Optional<T>> findOneAndUpdateAsync(Filter filter, Update update);
Returns:
Optional<T> - The working document matching the filter, or Optional.empty() if no document is found.
Parameters:
Name | Type | Summary
filter | Filter | Criteria list to filter the document. The filter is a JSON object that can contain any valid Data API filter expression.
update | Update | The update operations to apply to the matched document.
What you need to know:
To build the different parts of a request, a set of helper classes is provided, each named by adding an s to the class it builds, like Filters for Filter. Update is no different: you can use the Updates class.
Update update = Updates
    .set("field1", "value1")
    .inc("field2", 1d)
    .unset("field3");
Example:
package com.datastax.astra.client.collection; import com.datastax.astra.client.Collection; import com.datastax.astra.client.DataAPIClient; import com.datastax.astra.client.model.Document; import com.datastax.astra.client.model.Filter; import com.datastax.astra.client.model.Filters; import com.datastax.astra.client.model.Update; import com.datastax.astra.client.model.Updates; import java.util.Optional; import static com.datastax.astra.client.model.Filters.lt; public class FindOneAndUpdate { public static void main(String[] args) { // Given an existing collection Collection<Document> collection = new DataAPIClient("TOKEN") .getDatabase("API_ENDPOINT") .getCollection("COLLECTION_NAME"); // Building a filter Filter filter = Filters.and( Filters.gt("field2", 10), lt("field3", 20), Filters.eq("field4", "value")); // Building the update Update update = Updates.set("field1", "value1") .inc("field2", 1d) .unset("field3"); Optional<Document> doc = collection.findOneAndUpdate(filter, update); } }
Update a document
Update a single document on the collection as requested.
- Python
-
View this topic in more detail on the API Reference.
update_result = collection.update_one(
    {"_id": 456},
    {"$set": {"name": "John Smith"}},
)
Update a single document on the collection, inserting a new one if no match is found.
update_result = collection.update_one(
    {"_id": 456},
    {"$set": {"name": "John Smith"}},
    upsert=True,
)
Returns:
UpdateResult - An object representing the response from the database after the update operation. It includes information about the operation.
Example response:
UpdateResult(raw_results=[{'data': {'document': {'_id': '1', 'name': 'John Doe'}}, 'status': {'matchedCount': 1, 'modifiedCount': 1}}], update_info={'n': 1, 'updatedExisting': True, 'ok': 1.0, 'nModified': 1})
Parameters:
Name | Type | Summary
filter | Dict[str, Any] | A predicate expressed as a dictionary according to the Data API filter syntax. Examples are {}, {"name": "John"}, {"price": {"$lt": 100}}, {"$and": [{"name": "John"}, {"price": {"$lt": 100}}]}. See Data API operators for the full list of operators.
update | Dict[str, Any] | The update prescription to apply to the document, expressed as a dictionary as per Data API syntax. Examples are {"$set": {"field": "value"}}, {"$inc": {"counter": 10}} and {"$unset": {"field": ""}}. See Data API operators for the full syntax.
vector | Optional[Iterable[float]] | A vector (a list of floats of the appropriate dimensionality) used as the sorting criterion in a vector search, that is, an approximate nearest neighbor (ANN) search. The matched document, if any, is the one most similar to the provided vector. This parameter cannot be used together with sort. See the sort examples for more on this parameter.
vectorize | Optional[str] | A string to be vectorized and used as the sorting criterion in a vector search. This parameter cannot be used together with sort. See the sort examples for more on this parameter.
sort | Optional[Dict[str, Any]] | Controls the sorting order of the documents matching the filter, which determines which document comes first and is therefore the one updated. See the sort examples for more on sorting.
upsert | bool = False | Controls the behavior in absence of matches. If True, a new document (resulting from applying the update to an empty document) is inserted if no matches are found on the collection. If False, the operation silently does nothing in case of no matches.
max_time_ms | Optional[int] | A timeout, in milliseconds, for the underlying HTTP request. This method uses the collection-level timeout by default.
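The update_info dictionaries shown in this section's responses distinguish three outcomes: an existing document was matched, nothing matched, or a new document was upserted. A minimal sketch of how they might be read follows. describe_update is a hypothetical helper, not part of the client, and it relies only on the keys that appear in the example responses here (n, nModified, upserted).

```python
from typing import Any, Dict


def describe_update(update_info: Dict[str, Any]) -> str:
    """Summarize an update_info dictionary shaped like the ones in this section."""
    if "upserted" in update_info:
        # The 'upserted' key carries the ID of the newly inserted document.
        return f"no match; inserted new document {update_info['upserted']!r}"
    if update_info.get("n", 0) == 0:
        return "no match; nothing changed"
    return f"matched {update_info['n']}, modified {update_info.get('nModified', 0)}"


print(describe_update({"n": 1, "updatedExisting": True, "ok": 1.0, "nModified": 1}))
# matched 1, modified 1
print(describe_update({"n": 0, "updatedExisting": False, "ok": 1.0, "nModified": 0}))
# no match; nothing changed
```

Going by the UpdateResult representation shown above, the dictionary would presumably be passed as update_result.update_info.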
Example:
from astrapy import DataAPIClient

client = DataAPIClient("TOKEN")
database = client.get_database("DB_API_ENDPOINT")
collection = database.my_collection

collection.insert_one({"Marco": "Polo"})

collection.update_one({"Marco": {"$exists": True}}, {"$inc": {"rank": 3}})
# prints: UpdateResult(raw_results=..., update_info={'n': 1, 'updatedExisting': True, 'ok': 1.0, 'nModified': 1})
collection.update_one({"Mirko": {"$exists": True}}, {"$inc": {"rank": 3}})
# prints: UpdateResult(raw_results=..., update_info={'n': 0, 'updatedExisting': False, 'ok': 1.0, 'nModified': 0})
collection.update_one(
    {"Mirko": {"$exists": True}},
    {"$inc": {"rank": 3}},
    upsert=True,
)
# prints: UpdateResult(raw_results=..., update_info={'n': 1, 'updatedExisting': False, 'ok': 1.0, 'nModified': 0, 'upserted': '2a45ff60-...'})
- TypeScript
-
View this topic in more detail on the API Reference.
const result = await collection.updateOne( { $and: [{ name: 'Jesse' }, { gender: 'M' }] }, { $set: { title: 'Mr.' } }, );Update a single document on the collection, inserting a new one if no match is found.
const result = await collection.updateOne( { $and: [{ name: 'Jesse' }, { gender: 'M' }] }, { $set: { title: 'Mr.' } }, { upsert: true }, );Parameters:
Name Type Summary filter
A filter to select the document to update.
update
The update to apply to the selected document.
options?
The options for this operation.
Options (
UpdateOneOptions):Name Type Summary booleanIf true, creates a new document if no document matches the filter.
Specifies the order in which the documents are returned. Defaults to the order in which the documents are stored on disk.
number[]An optional vector to use to perform a vector search on the collection to find the closest matching document.
Equivalent to setting the
$vectorfield in thesortfield itself—The two are interchangeable, but mutually exclusive.If you really need to use both, you can set the
$vectorfield in the sort object directly.stringA string to be vectorized and used as the sorting criterion in a vector search.
Equivalent to setting the
$vectorizefield in thesortfield itself—The two are interchangeable, but mutually exclusive.If you really need to use both, you can set the
$vectorizefield in the sort object directly.numberThe maximum time in milliseconds that the client should wait for the operation to complete for each single one of the underlying HTTP requests.
Returns:
Promise<UpdateOneResult<Schema>>- The result of the update operation.Example:
import { DataAPIClient } from '@datastax/astra-db-ts'; // Reference an untyped collection const client = new DataAPIClient('TOKEN'); const db = client.db('DB_API_ENDPOINT', { keyspace: 'DB_KEYSPACE' }); const collection = db.collection('COLLECTION'); (async function () { // Insert a document await collection.insertOne({ 'Marco': 'Polo' }); // Prints 1 const updated1 = await collection.updateOne( { 'Marco': 'Polo' }, { $set: { title: 'Mr.' } }, ); console.log(updated1?.modifiedCount); // Prints 0 0 const updated2 = await collection.updateOne( { name: 'Johnny' }, { $set: { rank: 0 } }, ); console.log(updated2.matchedCount, updated2?.upsertedCount); // Prints 0 1 const updated3 = await collection.updateOne( { name: 'Johnny' }, { $set: { rank: 0 } }, { upsert: true }, ); console.log(updated3.matchedCount, updated3?.upsertedCount); })(); - Java
-
Operations on documents are performed at the Collection level. For details on each signature, see the Collection JavaDoc.
Collection is a generic class. The default type is Document, but you can specify your own type, and the object will be serialized by Jackson.
Most methods come in synchronous and asynchronous flavors. The asynchronous version is suffixed with Async and returns a CompletableFuture.
// Synchronous
UpdateResult updateOne(Filter filter, Update update);
// Asynchronous
CompletableFuture<UpdateResult> updateOneAsync(Filter filter, Update update);
Returns:
UpdateResult - Result of the operation, with the number of documents matched (matchedCount) and updated (modifiedCount).
Parameters:
Name | Type | Summary
filter | Filter | Criteria list to filter the document. The filter is a JSON object that can contain any valid Data API filter expression.
update | Update | The update operations to apply to the matched document.
Example:
package com.datastax.astra.client.collection;

import com.datastax.astra.client.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.client.model.Filter;
import com.datastax.astra.client.model.Filters;
import com.datastax.astra.client.model.Update;
import com.datastax.astra.client.model.UpdateResult;
import com.datastax.astra.client.model.Updates;

import static com.datastax.astra.client.model.Filters.lt;

public class UpdateOne {
    public static void main(String[] args) {
        // Given an existing collection
        Collection<Document> collection = new DataAPIClient("TOKEN")
            .getDatabase("API_ENDPOINT")
            .getCollection("COLLECTION_NAME");

        // Building a filter
        Filter filter = Filters.and(
            Filters.gt("field2", 10),
            lt("field3", 20),
            Filters.eq("field4", "value"));

        // Building the update
        Update update = Updates.set("field1", "value1")
            .inc("field2", 1d)
            .unset("field3");

        UpdateResult result = collection.updateOne(filter, update);
    }
}
Update multiple documents
Update multiple documents in a collection.
- Python
-
View this topic in more detail on the API Reference.
results = collection.update_many(
    {"name": {"$exists": False}},
    {"$set": {"name": "unknown"}},
)
Update multiple documents in a collection, inserting a new one if no matches are found.
results = collection.update_many(
    {"name": {"$exists": False}},
    {"$set": {"name": "unknown"}},
    upsert=True,
)
Returns:
UpdateResult - An object representing the response from the database after the update operation. It includes information about the operation.
Example response:
UpdateResult(raw_results=[{'status': {'matchedCount': 2, 'modifiedCount': 2}}], update_info={'n': 2, 'updatedExisting': True, 'ok': 1.0, 'nModified': 2})
Parameters:
Name | Type | Summary
filter | Dict[str, Any] | A predicate expressed as a dictionary according to the Data API filter syntax. Examples are {}, {"name": "John"}, {"price": {"$lt": 100}}, {"$and": [{"name": "John"}, {"price": {"$lt": 100}}]}. See Data API operators for the full list of operators.
update | Dict[str, Any] | The update prescription to apply to the documents, expressed as a dictionary as per Data API syntax. Examples are {"$set": {"field": "value"}}, {"$inc": {"counter": 10}} and {"$unset": {"field": ""}}. See Data API operators for the full syntax.
upsert | bool | Controls the behavior in absence of matches. If True, a single new document (resulting from applying update to an empty document) is inserted if no matches are found on the collection. If False, the operation silently does nothing in case of no matches.
max_time_ms | Optional[int] | A timeout, in milliseconds, for the operation. This method uses the collection-level timeout by default. You may need to increase the timeout duration when updating a large number of documents, as the update will require multiple HTTP requests in sequence.
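The upsert rule above (every match is updated, and exactly one new document is inserted only when there are no matches at all) can be sketched with a small in-memory model. This is an illustration under stated assumptions rather than the client's implementation: update_many_model is a hypothetical name, filters are equality-only, the update is given directly as the fields to set, and the upserted document is assumed to also carry the filter's equality fields, mirroring the find_one_and_update example output elsewhere on this page.

```python
from typing import Any, Dict, List


def update_many_model(
    docs: List[Dict[str, Any]],
    flt: Dict[str, Any],
    set_fields: Dict[str, Any],
    *,
    upsert: bool = False,
) -> Dict[str, int]:
    """Toy model of update-many with equality-only filters (hypothetical helper)."""
    matched = [d for d in docs if all(d.get(k) == v for k, v in flt.items())]
    for doc in matched:
        doc.update(set_fields)
    upserted = 0
    if not matched and upsert:
        # Assumption: the single new document also carries the filter's fields.
        docs.append({**flt, **set_fields})
        upserted = 1
    return {"matched": len(matched), "upserted": upserted}


colors = [{"c": "red"}, {"c": "green"}, {"c": "blue"}]
print(update_many_model(colors, {"c": "orange"}, {"is_also_fruit": True}))
# {'matched': 0, 'upserted': 0}
print(update_many_model(colors, {"c": "orange"}, {"is_also_fruit": True}, upsert=True))
# {'matched': 0, 'upserted': 1}
```

Running the upsert call a second time would match the document inserted by the first call, so no further document is added.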
Example:
from astrapy import DataAPIClient

client = DataAPIClient("TOKEN")
database = client.get_database("DB_API_ENDPOINT")
collection = database.my_collection

collection.insert_many([{"c": "red"}, {"c": "green"}, {"c": "blue"}])

collection.update_many({"c": {"$ne": "green"}}, {"$set": {"nongreen": True}})
# prints: UpdateResult(raw_results=..., update_info={'n': 2, 'updatedExisting': True, 'ok': 1.0, 'nModified': 2})
collection.update_many({"c": "orange"}, {"$set": {"is_also_fruit": True}})
# prints: UpdateResult(raw_results=..., update_info={'n': 0, 'updatedExisting': False, 'ok': 1.0, 'nModified': 0})
collection.update_many(
    {"c": "orange"},
    {"$set": {"is_also_fruit": True}},
    upsert=True,
)
# prints: UpdateResult(raw_results=..., update_info={'n': 1, 'updatedExisting': False, 'ok': 1.0, 'nModified': 0, 'upserted': '46643050-...'})
- TypeScript
-
View this topic in more detail on the API Reference.
const result = await collection.updateMany( { name: { $exists: false } }, { $set: { title: 'unknown' } }, );Update multiple documents in a collection, inserting a new one if no matches are found.
const result = await collection.updateMany( { name: { $exists: false } }, { $set: { title: 'unknown' } }, { upsert: true }, );Parameters:
Name Type Summary filter
A filter to select the documents to update.
update
The update to apply to the selected documents.
options?
The options for this operation.
Options (
UpdateManyOptions):Name Type Summary booleanIf true, creates a new document if no document matches the filter.
numberThe maximum time in milliseconds that the client should wait for the operation to complete for each single one of the underlying HTTP requests.
Returns:
Promise<UpdateManyResult<Schema>>- The result of the update operation.Example:
import { DataAPIClient } from '@datastax/astra-db-ts'; // Reference an untyped collection const client = new DataAPIClient('TOKEN'); const db = client.db('DB_API_ENDPOINT', { keyspace: 'DB_KEYSPACE' }); const collection = db.collection('COLLECTION'); (async function () { // Insert some documents await collection.insertMany([{ c: 'red' }, { c: 'green' }, { c: 'blue' }]); // { modifiedCount: 2, matchedCount: 2, upsertedCount: 0 } await collection.updateMany({ c: { $ne: 'green' } }, { $set: { nongreen: true } }); // { modifiedCount: 0, matchedCount: 0, upsertedCount: 0 } await collection.updateMany({ c: 'orange' }, { $set: { is_also_fruit: true } }); // { modifiedCount: 0, matchedCount: 0, upsertedCount: 1, upsertedId: '...' } await collection.updateMany({ c: 'orange' }, { $set: { is_also_fruit: true } }, { upsert: true }); })(); - Java
-
Operations on documents are performed at the Collection level. For details on each signature, see the Collection JavaDoc.
Collection is a generic class. The default type is Document, but you can specify your own type, and the object will be serialized by Jackson.
Most methods come in synchronous and asynchronous flavors. The asynchronous version is suffixed with Async and returns a CompletableFuture.
// Synchronous
UpdateResult updateMany(Filter filter, Update update);
UpdateResult updateMany(Filter filter, Update update, UpdateManyOptions options);
// Asynchronous
CompletableFuture<UpdateResult> updateManyAsync(Filter filter, Update update);
CompletableFuture<UpdateResult> updateManyAsync(Filter filter, Update update, UpdateManyOptions options);
Returns:
UpdateResult - Result of the operation, with the number of documents matched (matchedCount) and updated (modifiedCount).
Parameters:
Name | Type | Summary
filter | Filter | Criteria list to filter the documents. The filter is a JSON object that can contain any valid Data API filter expression.
update | Update | The update operations to apply to the matched documents.
options | UpdateManyOptions | The options for the update-many operation. Here you can set the upsert flag.
Example:
package com.datastax.astra.client.collection; import com.datastax.astra.client.Collection; import com.datastax.astra.client.DataAPIClient; import com.datastax.astra.client.model.Document; import com.datastax.astra.client.model.Filter; import com.datastax.astra.client.model.Filters; import com.datastax.astra.client.model.Update; import com.datastax.astra.client.model.UpdateManyOptions; import com.datastax.astra.client.model.UpdateResult; import com.datastax.astra.client.model.Updates; import static com.datastax.astra.client.model.Filters.lt; public class UpdateMany { public static void main(String[] args) { // Given an existing collection Collection<Document> collection = new DataAPIClient("TOKEN") .getDatabase("API_ENDPOINT") .getCollection("COLLECTION_NAME"); // Building a filter Filter filter = Filters.and( Filters.gt("field2", 10), lt("field3", 20), Filters.eq("field4", "value")); Update update = Updates.set("field1", "value1") .inc("field2", 1d) .unset("field3"); UpdateManyOptions options = new UpdateManyOptions().upsert(true); UpdateResult result = collection.updateMany(filter, update, options); } }
Find distinct values across documents
Get a list of the distinct values of a certain key in a collection.
- Python
-
View this topic in more detail on the API Reference.
collection.distinct("category")
Get the distinct values in a subset of documents, with a key defined by a dot-syntax path.
collection.distinct(
    "food.allergies",
    filter={"registered_for_dinner": True},
)
Returns:
List[Any] - A list of the distinct values encountered. Documents that lack the requested key are ignored.
Example response:
['home_appliance', None, 'sports_equipment', {'cat_id': 54, 'cat_name': 'gardening_gear'}]
Parameters:
Name | Type | Summary
key | str | The name of the field whose value is inspected across documents. Keys can use dot-notation to descend to deeper document levels. Examples of acceptable key values: "field", "field.subfield", "field.3", and "field.3.subfield". If lists are encountered and no numeric index is specified, all items in the list are visited.
filter | Optional[Dict[str, Any]] | A predicate expressed as a dictionary according to the Data API filter syntax. Examples are {}, {"name": "John"}, {"price": {"$lt": 100}}, {"$and": [{"name": "John"}, {"price": {"$lt": 100}}]}. See Data API operators for the full list of operators.
max_time_ms | Optional[int] | A timeout, in milliseconds, for the operation. This method uses the collection-level timeout by default.
Keep in mind that distinct is a client-side operation, which effectively browses all required documents using the logic of the find method and collects the unique values found for key. As such, there may be performance, latency, and ultimately billing implications if the number of matching documents is large.
For details on the behavior of "distinct" in conjunction with real-time changes in the collection contents, see the discussion in the Sort examples values section.
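Because distinct runs on the client, its logic amounts to walking each retrieved document along the dot-notation key and collecting the unique values. The following is a minimal sketch of that idea over an in-memory list of documents. distinct_model and _values_at are hypothetical names, not the client's code, and the handling of lists (numeric segments index into a list, other segments visit every item, and a list at the end of the path contributes its items) is inferred from the parameter description and the example outputs in this section.

```python
from typing import Any, Dict, Iterator, List


def _values_at(value: Any, path: List[str]) -> Iterator[Any]:
    """Yield every value reachable from `value` by following `path`."""
    if not path:
        # End of the path: a list contributes its items, anything else itself.
        if isinstance(value, list):
            yield from value
        else:
            yield value
        return
    head, rest = path[0], path[1:]
    if isinstance(value, dict):
        if head in value:  # documents lacking the key are ignored
            yield from _values_at(value[head], rest)
    elif isinstance(value, list):
        if head.isdigit():
            index = int(head)
            if index < len(value):
                yield from _values_at(value[index], rest)
        else:
            # No numeric index given: visit all items in the list.
            for item in value:
                yield from _values_at(item, path)


def distinct_model(docs: List[Dict[str, Any]], key: str) -> List[Any]:
    """Collect unique values for `key`, preserving first-seen order."""
    seen: List[Any] = []
    for doc in docs:
        for found in _values_at(doc, key.split(".")):
            if found not in seen:  # equality check, since dicts are unhashable
                seen.append(found)
    return seen


people = [
    {"name": "Marco", "food": ["apple", "orange"], "city": "Helsinki"},
    {"name": "Emma", "food": {"likes_fruit": True, "allergies": []}},
]
print(distinct_model(people, "food.1"))  # ['orange']
print(distinct_model(people, "food.allergies"))  # []
```

Every matching document has to be fetched before its values can be inspected, which is where the performance and billing considerations noted above come from.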
Example:
from astrapy import DataAPIClient

client = DataAPIClient("TOKEN")
database = client.get_database("DB_API_ENDPOINT")
collection = database.my_collection

collection.insert_many(
    [
        {"name": "Marco", "food": ["apple", "orange"], "city": "Helsinki"},
        {"name": "Emma", "food": {"likes_fruit": True, "allergies": []}},
    ]
)

collection.distinct("name")
# prints: ['Marco', 'Emma']
collection.distinct("city")
# prints: ['Helsinki']
collection.distinct("food")
# prints: ['apple', 'orange', {'likes_fruit': True, 'allergies': []}]
collection.distinct("food.1")
# prints: ['orange']
collection.distinct("food.allergies")
# prints: []
collection.distinct("food.likes_fruit")
# prints: [True]
- TypeScript
-
View this topic in more detail on the API Reference.
const unique = await collection.distinct('category');Get the distinct values in a subset of documents, with a key defined by a dot-syntax path.
const unique = await collection.distinct( 'food.allergies', { registeredForDinner: true }, );Parameters:
Name Type Summary key
stringThe name of the field whose value is inspected across documents. Keys can use dot-notation to descend to deeper document levels. Example of acceptable key values:
'field','field.subfield','field.3', and'field.3.subfield'. If lists are encountered and no numeric index is specified, all items in the list are visited.filter?
A filter to select the documents to use. If not provided, all documents will be used.
Returns:
Promise<Flatten<(SomeDoc & ToDotNotation<FoundDoc<Schema>>)[Key]>[]>- A promise which resolves to the unique distinct values.The return type is mostly accurate, but with complex keys, it may be required to manually cast the return type to the expected type.
Example:
import { DataAPIClient } from '@datastax/astra-db-ts'; // Reference an untyped collection const client = new DataAPIClient('TOKEN'); const db = client.db('DB_API_ENDPOINT', { keyspace: 'DB_KEYSPACE' }); const collection = db.collection('COLLECTION'); (async function () { // Insert some documents await collection.insertOne({ name: 'Marco', food: ['apple', 'orange'], city: 'Helsinki' }); await collection.insertOne({ name: 'Emma', food: { likes_fruit: true, allergies: [] } }); // ['Marco', 'Emma'] await collection.distinct('name') // ['Helsinki'] await collection.distinct('city') // ['apple', 'orange', { likes_fruit: true, allergies: [] }] await collection.distinct('food') // ['orange'] await collection.distinct('food.1') // [] await collection.distinct('food.allergies') // [true] await collection.distinct('food.likes_fruit') })(); - Java
-
Gets the distinct values of the specified field name.
// Synchronous
DistinctIterable<T,F> distinct(String fieldName, Filter filter, Class<F> resultClass);
DistinctIterable<T,F> distinct(String fieldName, Class<F> resultClass);
// Asynchronous
CompletableFuture<DistinctIterable<T,F>> distinctAsync(String fieldName, Filter filter, Class<F> resultClass);
CompletableFuture<DistinctIterable<T,F>> distinctAsync(String fieldName, Class<F> resultClass);
Returns:
DistinctIterable<T,F> - The distinct values of the specified field name.
Parameters:
Name | Type | Summary
fieldName | String | The name of the field whose distinct values are collected.
filter | Filter | Criteria list to filter the documents. The filter is a JSON object that can contain any valid Data API filter expression.
resultClass | Class<F> | The type of the field values.
Keep in mind that distinct is a client-side operation, which effectively browses all required documents using the logic of the find method and collects the unique values found for fieldName. As such, there may be performance, latency, and ultimately billing implications if the number of matching documents is large.
Example:
package com.datastax.astra.client.collection; import com.datastax.astra.client.Collection; import com.datastax.astra.client.DataAPIClient; import com.datastax.astra.client.model.DistinctIterable; import com.datastax.astra.client.model.Document; import com.datastax.astra.client.model.Filter; import com.datastax.astra.client.model.Filters; import com.datastax.astra.client.model.FindIterable; import com.datastax.astra.client.model.FindOptions; import static com.datastax.astra.client.model.Filters.lt; import static com.datastax.astra.client.model.Projections.exclude; import static com.datastax.astra.client.model.Projections.include; public class Distinct { public static void main(String[] args) { // Given an existing collection Collection<Document> collection = new DataAPIClient("TOKEN") .getDatabase("API_ENDPOINT") .getCollection("COLLECTION_NAME"); // Building a filter Filter filter = Filters.and( Filters.gt("field2", 10), lt("field3", 20), Filters.eq("field4", "value")); // Execute a find operation DistinctIterable<Document, String> result = collection .distinct("field", String.class); DistinctIterable<Document, String> result2 = collection .distinct("field", filter, String.class); // Iterate over the result for (String fieldValue : result) { System.out.println(fieldValue); } } }
Count documents in a collection
Get the count of documents in a collection. Count all documents or apply filtering to count a subset of documents.
- Python
-
View this topic in more detail on the API Reference.
collection.count_documents({}, upper_bound=500)
Get the count of the documents in a collection matching a condition.
collection.count_documents({"seq": {"$gt": 15}}, upper_bound=50)
Returns:
int - The exact count of the documents counted as requested, unless it exceeds the caller-provided or API-set upper bound. In case of overflow, an exception is raised.
Parameters:
Name | Type | Summary
filter | Optional[Dict[str, Any]] | A predicate expressed as a dictionary according to the Data API filter syntax. Examples are {}, {"name": "John"}, {"price": {"$lt": 100}}, {"$and": [{"name": "John"}, {"price": {"$lt": 100}}]}. See Data API operators for the full list of operators.
upper_bound | int | A required ceiling on the result of the count operation. If the actual number of documents exceeds this value, an exception is raised. An exception is also raised if the actual number of documents exceeds the maximum count that the Data API can reach, regardless of upper_bound.
max_time_ms | Optional[int] | A timeout, in milliseconds, for the underlying HTTP request. This method uses the collection-level timeout by default.
Example:
from astrapy import DataAPIClient

client = DataAPIClient("TOKEN")
database = client.get_database("DB_API_ENDPOINT")
collection = database.my_collection

collection.insert_many([{"seq": i} for i in range(20)])
collection.count_documents({}, upper_bound=100)
# prints: 20
collection.count_documents({"seq": {"$gt": 15}}, upper_bound=100)
# prints: 4
collection.count_documents({}, upper_bound=10)
# Raises: astrapy.exceptions.TooManyDocumentsToCountException
- TypeScript
-
View this topic in more detail on the API Reference.
const numDocs = await collection.countDocuments({}, 500);Get the count of the documents in a collection matching a filter.
const numDocs = await collection.countDocuments({ seq: { $gt: 15 } }, 50);Parameters:
Name Type Summary filter
A filter to select the documents to count. If not provided, all documents will be counted.
upperBound
numberA required ceiling on the result of the count operation. If the actual number of documents exceeds this value, an exception is raised. An exception is also raised if the actual number of documents exceeds the maximum count that the Data API can reach, regardless of
upperBound.options?
The options (the timeout) for this operation.
Returns:
Promise<number>- A promise that resolves to the exact count of the documents counted as requested, unless it exceeds the caller-provided or API-set upper bound, in which case an exception is raised.Example:
import { DataAPIClient } from '@datastax/astra-db-ts'; // Reference an untyped collection const client = new DataAPIClient('TOKEN'); const db = client.db('DB_API_ENDPOINT', { keyspace: 'DB_KEYSPACE' }); const collection = db.collection('COLLECTION'); (async function () { // Insert some documents await collection.insertMany(Array.from({ length: 20 }, (_, i) => ({ seq: i }))); // Prints 20 await collection.countDocuments({}, 100); // Prints 4 await collection.countDocuments({ seq: { $gt: 15 } }, 100); // Throws TooManyDocumentsToCountError await collection.countDocuments({}, 10); })(); - Java
-
// Synchronous int countDocuments(int upperBound) throws TooManyDocumentsToCountException; int countDocuments(Filter filter, int upperBound) throws TooManyDocumentsToCountException;Get the count of the documents in a collection matching a condition.
Returns:
int- The exact count of the documents counted as requested, unless it exceeds the caller-provided or API-set upper bound. In case of overflow, an exception is raised.Parameters:
Name Type Summary filter (optional)
FilterA predicate expressed as a dictionary according to the Data API filter syntax. Examples are
{},{"name": "John"},{"price": {"$lt": 100}},{"$and": [{"name": "John"}, {"price": {"$lt": 100}}]}. See Data API operators for the full list of operators.upperBound
int A required ceiling on the result of the count operation. If the actual number of documents exceeds this value, an exception is raised. An exception is also raised if the actual number of documents exceeds the maximum count that the Data API can reach, regardless of upperBound.
The checked exception TooManyDocumentsToCountException is raised when the actual number of documents exceeds the upper bound set by the caller or the API. This exception indicates that there are more matching documents beyond the count threshold.
Consider modifying your conditions to count fewer documents at once. If you need to count large numbers of documents, consider using the Data API estimatedDocumentCount command.
Example:
package com.datastax.astra.client.collection; import com.datastax.astra.client.Collection; import com.datastax.astra.client.DataAPIClient; import com.datastax.astra.client.exception.TooManyDocumentsToCountException; import com.datastax.astra.client.model.Document; import com.datastax.astra.client.model.Filter; import com.datastax.astra.client.model.Filters; import static com.datastax.astra.client.model.Filters.lt; public class CountDocuments { public static void main(String[] args) { Collection<Document> collection = new DataAPIClient("TOKEN") .getDatabase("API_ENDPOINT") .getCollection("COLLECTION_NAME"); // Building a filter Filter filter = Filters.and( Filters.gt("field2", 10), lt("field3", 20), Filters.eq("field4", "value")); try { // Count with no filter collection.countDocuments(500); // Count with a filter collection.countDocuments(filter, 500); } catch(TooManyDocumentsToCountException tmde) { // Explicit error if the count is above the upper limit or above the 1000 limit } } }
Estimate document count in a collection
Get an approximate document count for an entire collection. Filtering isn’t supported.
- Python
-
View this topic in more detail on the API Reference.
collection.estimated_document_count()Returns:
int- A server-side estimate of the total number of documents in the collection.Parameters:
Name Type Summary max_time_ms
Optional[int]A timeout, in milliseconds, for the underlying HTTP request.
Example:
from astrapy import DataAPIClient

client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection

collection.estimated_document_count()
# prints: 37500
- TypeScript
-
View this topic in more detail on the API Reference.
const estNumDocs = await collection.estimatedDocumentCount();Parameters:
Name Type Summary options?
The options (the timeout) for this operation.
Returns:
Promise<number>- A promise that resolves to a server-side estimate of the total number of documents in the collection.Example:
import { DataAPIClient } from '@datastax/astra-db-ts'; // Reference an untyped collection const client = new DataAPIClient('TOKEN'); const db = client.db('DB_API_ENDPOINT', { keyspace: 'DB_KEYSPACE' }); const collection = db.collection('COLLECTION'); (async function () { console.log(await collection.estimatedDocumentCount()); })(); - Java
-
View this topic in more detail on the API Reference.
long estimatedDocumentCount(); long estimatedDocumentCount(EstimatedCountDocumentsOptions options);Parameters:
Name Type Summary options?
Set different options for the estimatedDocumentCount operation, such as timeout and httpSettings.
Returns:
long - A server-side estimate of the total number of documents in the collection. This estimate is built from the SSTable files.
Example:
package com.datastax.astra.client.collection;

import com.datastax.astra.client.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.client.model.EstimatedCountDocumentsOptions;
import com.datastax.astra.internal.command.LoggingCommandObserver;

public class EstimateCountDocuments {
    public static void main(String[] args) {
        Collection<Document> collection = new DataAPIClient("TOKEN")
            .getDatabase("API_ENDPOINT")
            .getCollection("COLLECTION_NAME");

        // Estimate the count with default options (filtering isn't supported)
        long estimatedCount = collection.estimatedDocumentCount();

        // Estimate the count with options (adding a logger)
        EstimatedCountDocumentsOptions options = new EstimatedCountDocumentsOptions()
            .registerObserver("logger", new LoggingCommandObserver(DataAPIClient.class));
        long estimateCount2 = collection.estimatedDocumentCount(options);
    }
}
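The exact count and the estimate can be combined. The following is a minimal sketch, not part of any client library: `count_or_estimate` is a hypothetical helper that tries an exact count first and falls back to the estimate only when the count overflows and no filter was given, because estimates don't support filtering. It assumes only the `count_documents` and `estimated_document_count` methods described above. The overflow exception class is passed in by the caller (with astrapy, that would be `astrapy.exceptions.TooManyDocumentsToCountException`), and `FakeCollection` is an in-memory stand-in so that the sketch runs without a database.

```python
from typing import Any, Dict, Optional, Tuple, Type


def count_or_estimate(
    collection: Any,
    overflow_error: Type[Exception],
    filter: Optional[Dict[str, Any]] = None,
    upper_bound: int = 1000,
) -> Tuple[int, bool]:
    """Return (count, is_exact). Hypothetical helper, not a library API."""
    filter = filter or {}
    try:
        exact = collection.count_documents(filter, upper_bound=upper_bound)
        return exact, True
    except overflow_error:
        if filter:
            # Estimates cover the entire collection and ignore filters,
            # so a filtered count has no fallback: re-raise.
            raise
        return collection.estimated_document_count(), False


class FakeCollection:
    """In-memory stand-in (an assumption) so the sketch runs offline."""

    def __init__(self, n_docs: int) -> None:
        self.n_docs = n_docs

    def count_documents(self, filter: Dict[str, Any], *, upper_bound: int) -> int:
        # The filter is ignored here; only the overflow behavior is modeled.
        if self.n_docs > upper_bound:
            raise OverflowError("too many documents to count")
        return self.n_docs

    def estimated_document_count(self) -> int:
        return self.n_docs


small = count_or_estimate(FakeCollection(20), OverflowError, upper_bound=100)
large = count_or_estimate(FakeCollection(5000), OverflowError, upper_bound=100)
print(small, large)
```

The second element of the returned tuple tells the caller whether the number is exact, so an estimate is never mistaken for a precise count.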
Find and replace a document
Locate a document matching a filter condition and replace it with a new document, returning the document itself.
- Python
-
View this topic in more detail on the API Reference.
collection.find_one_and_replace( {"_id": "rule1"}, {"text": "some animals are more equal!"}, )Locate and replace a document, returning the document itself, additionally creating it if nothing is found.
collection.find_one_and_replace( {"_id": "rule1"}, {"text": "some animals are more equal!"}, upsert=True, )Returns:
Dict[str, Any]- The document that was found, either before or after the replacement (or a projection thereof, as requested). If no matches are found,Noneis returned.Example response{'_id': 'rule1', 'text': 'all animals are equal'}Parameters:
Name Type Summary filter
Dict[str, Any]A predicate expressed as a dictionary according to the Data API filter syntax. Examples are
{},{"name": "John"},{"price": {"$lt": 100}},{"$and": [{"name": "John"}, {"price": {"$lt": 100}}]}. See Data API operators for the full list of operators.replacement
Dict[str, Any] The new document to write into the collection.
projection
Optional[Union[Iterable[str], Dict[str, bool]]]Used to select a subset of fields in the document being returned. The projection can be: an iterable over the included field names; a dictionary {field_name: True} to positively select certain fields; or a dictionary {field_name: False} if one wants to exclude specific fields from the response. Special document fields (e.g.
_id,$vector) are controlled individually. The default projection does not necessarily include all fields of the document. See theprojectionexamples for more on this parameter.vector
Optional[Iterable[float]]A suitable vector, meaning a list of float numbers of the appropriate dimensionality, to use vector search. That is, approximate nearest neighbor (ANN) search, as the sorting criterion. In this way, the matched document (if any) will be the one that is most similar to the provided vector. This parameter cannot be used together with
sort. See thesortexamples for more on this parameter.vectorize
Optional[str]A string to be vectorized and used as the sorting criterion in a vector search. This parameter cannot be used together with
sort. See thesortexamples for more on this parameter.sort
Optional[Dict[str, Any]]With this dictionary parameter one can control the sorting order of the documents matching the filter, effectively determining what document will come first and hence be the replaced one. See the
sortexamples for more on sorting.upsert
bool = FalseThis parameter controls the behavior in absence of matches. If True,
replacementis inserted as a new document if no matches are found on the collection. If False, the operation silently does nothing in case of no matches.return_document
strA flag controlling what document is returned: if set to
ReturnDocument.BEFORE, or the string "before", the document found on database is returned; if set toReturnDocument.AFTER, or the string "after", the new document is returned. The default is "before".max_time_ms
Optional[int]A timeout, in milliseconds, for the underlying HTTP request. This method uses the collection-level timeout by default.
Example:
import astrapy
from astrapy import DataAPIClient

client = DataAPIClient("TOKEN")
database = client.get_database("DB_API_ENDPOINT")
collection = database.my_collection

collection.insert_one({"_id": "rule1", "text": "all animals are equal"})

collection.find_one_and_replace(
    {"_id": "rule1"},
    {"text": "some animals are more equal!"},
)
# prints: {'_id': 'rule1', 'text': 'all animals are equal'}
collection.find_one_and_replace(
    {"text": "some animals are more equal!"},
    {"text": "and the pigs are the rulers"},
    return_document=astrapy.constants.ReturnDocument.AFTER,
)
# prints: {'_id': 'rule1', 'text': 'and the pigs are the rulers'}
collection.find_one_and_replace(
    {"_id": "rule2"},
    {"text": "F=ma^2"},
    return_document=astrapy.constants.ReturnDocument.AFTER,
)
# (returns None for no matches)
collection.find_one_and_replace(
    {"_id": "rule2"},
    {"text": "F=ma"},
    upsert=True,
    return_document=astrapy.constants.ReturnDocument.AFTER,
    projection={"_id": False},
)
# prints: {'text': 'F=ma'}
- TypeScript
-
View this topic in more detail on the API Reference.
const docBefore = await collection.findOneAndReplace( { _id: 123 }, { text: 'some animals are more equal!' }, { returnDocument: 'before' }, );Locate and replace a document, returning the document itself, additionally creating it if nothing is found.
const docBefore = await collection.findOneAndReplace( { _id: 123 }, { text: 'some animals are more equal!' }, { returnDocument: 'before', upsert: true }, );Parameters:
Name Type Summary filter
A filter to select the document to replace.
replacement
The replacement document, which contains no _id field.
options
The options for this operation.
Options (FindOneAndReplaceOptions):
Name Type Summary
returnDocument ('before' | 'after'): Specifies whether to return the original or replaced document.
upsert (boolean): If true, creates a new document if no document matches the filter.
projection: Specifies which fields should be included/excluded in the returned documents. Defaults to including all fields. When specifying a projection, it’s the user’s responsibility to handle the return type carefully. Consider type-casting. Can only be used when performing a vector search.
sort: Specifies the order in which the documents are returned. Defaults to the order in which the documents are stored on disk.
vector (number[]): An optional vector to use to perform a vector search on the collection to find the closest matching document. Equivalent to setting the $vector field in the sort field itself. The two are interchangeable, but mutually exclusive. If you really need to use both, you can set the $vector field in the sort object directly.
vectorize (string): A string to be vectorized and used as the sorting criterion in a vector search. Equivalent to setting the $vectorize field in the sort field itself. The two are interchangeable, but mutually exclusive. If you really need to use both, you can set the $vectorize field in the sort object directly.
maxTimeMS (number): The maximum time in milliseconds that the client should wait for the operation to complete for each single one of the underlying HTTP requests.
includeResultMetadata (boolean): When true, returns, alongside the document, an ok field with a value of 1 if the command executed successfully.
Returns:
Promise<WithId<Schema> | null>- The document before/after the update, depending on the type ofreturnDocument, ornullif no matches are found.Example:
import { DataAPIClient } from '@datastax/astra-db-ts'; // Reference an untyped collection const client = new DataAPIClient('TOKEN'); const db = client.db('DB_API_ENDPOINT', { keyspace: 'DB_KEYSPACE' }); const collection = db.collection('COLLECTION'); (async function () { // Insert some document await collection.insertOne({ _id: "rule1", text: "all animals are equal" }); // { _id: 'rule1', text: 'all animals are equal' } await collection.findOneAndReplace( { _id: "rule1" }, { text: "some animals are more equal!" }, { returnDocument: 'before' } ); // { _id: 'rule1', text: 'and the pigs are the rulers' } await collection.findOneAndReplace( { text: "some animals are more equal!" }, { text: "and the pigs are the rulers" }, { returnDocument: 'after' } ); // null await collection.findOneAndReplace( { _id: "rule2" }, { text: "F=ma^2" }, { returnDocument: 'after' } ); // { text: 'F=ma' } await collection.findOneAndReplace( { _id: "rule2" }, { text: "F=ma" }, { upsert: true, returnDocument: 'after', projection: { _id: false } } ); })(); - Java
-
// Synchronous Optional<T> findOneAndReplace(Filter filter, T replacement); Optional<T> findOneAndReplace(Filter filter, T replacement, FindOneAndReplaceOptions options); // Asynchronous CompletableFuture<Optional<T>> findOneAndReplaceAsync(Filter filter, T replacement); CompletableFuture<Optional<T>> findOneAndReplaceAsync(Filter filter, T replacement, FindOneAndReplaceOptions options);Returns:
Optional<T> - The document that matches the filter. Depending on whether returnDocument is set to before or after, the document is returned as it was before or after the replacement.
Parameters:
Name Type Summary filter (optional)
FilterA predicate expressed as a dictionary according to the Data API filter syntax. Examples are
{},{"name": "John"},{"price": {"$lt": 100}},{"$and": [{"name": "John"}, {"price": {"$lt": 100}}]}. See Data API operators for the full list of operators.replacement
T The document that replaces the existing one, if one exists. If the upsert flag is set to true and no document is found, this document is inserted.
options (optional)
The options for the findOneAndReplace operation, such as a Sort clause (sort on a vector or any other field), a Projection clause, the upsert flag, and the returnDocument flag.
Sample definition of FindOneAndReplaceOptions:
FindOneAndReplaceOptions options = FindOneAndReplaceOptions.Builder
    .projection(Projections.include("field1"))
    .sort(Sorts.ascending("field1"))
    .upsert(true)
    .returnDocumentAfter();
Example:
package com.datastax.astra.client.collection;

import com.datastax.astra.client.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.client.model.Filter;
import com.datastax.astra.client.model.Filters;
import com.datastax.astra.client.model.FindOneAndReplaceOptions;
import com.datastax.astra.client.model.Projections;
import com.datastax.astra.client.model.Sorts;

import java.util.Optional;

import static com.datastax.astra.client.model.Filters.lt;

public class FindOneAndReplace {
    public static void main(String[] args) {
        Collection<Document> collection = new DataAPIClient("TOKEN")
            .getDatabase("API_ENDPOINT")
            .getCollection("COLLECTION_NAME");

        // Building a filter
        Filter filter = Filters.and(
            Filters.gt("field2", 10),
            lt("field3", 20),
            Filters.eq("field4", "value"));

        FindOneAndReplaceOptions options = new FindOneAndReplaceOptions()
            .projection(Projections.include("field1"))
            .sort(Sorts.ascending("field1"))
            .upsert(true)
            .returnDocumentAfter();

        Document docForReplacement = new Document()
            .append("field1", "value1")
            .append("field2", 20)
            .append("field3", 30)
            .append("field4", "value4");

        // Returns the document as it is after the replacement,
        // because returnDocumentAfter() is set in the options
        Optional<Document> docAfterReplace = collection
            .findOneAndReplace(filter, docForReplacement, options);
    }
}
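Because the default behavior returns the document as it was before the replacement, the return value can be used to report what the replacement changed. The following is a minimal sketch under stated assumptions: `replace_and_diff` is a hypothetical helper, not a library function, and it relies only on the `find_one_and_replace(filter, replacement, return_document=...)` call described above. `FakeCollection` is an in-memory stand-in that supports equality filters only, so that the sketch runs without a database.

```python
from typing import Any, Dict, List, Optional, Set

_MISSING = object()  # marks a field that is absent from a document


def replace_and_diff(
    collection: Any,
    filter: Dict[str, Any],
    replacement: Dict[str, Any],
) -> Optional[Set[str]]:
    """Replace one document and return the names of the fields that changed.

    Returns None if no document matched. Hypothetical helper.
    """
    before = collection.find_one_and_replace(
        filter, replacement, return_document="before"
    )
    if before is None:
        return None
    # The _id is kept by the replacement, so it is left out of the comparison.
    keys = (set(before) | set(replacement)) - {"_id"}
    return {
        key
        for key in keys
        if before.get(key, _MISSING) != replacement.get(key, _MISSING)
    }


class FakeCollection:
    """In-memory stand-in (an assumption); supports equality filters only."""

    def __init__(self, docs: List[Dict[str, Any]]) -> None:
        self.docs = [dict(doc) for doc in docs]

    def find_one_and_replace(
        self,
        filter: Dict[str, Any],
        replacement: Dict[str, Any],
        *,
        return_document: str = "before",
    ) -> Optional[Dict[str, Any]]:
        for index, doc in enumerate(self.docs):
            if all(doc.get(key) == value for key, value in filter.items()):
                new_doc = {"_id": doc["_id"], **replacement}
                self.docs[index] = new_doc
                return doc if return_document == "before" else new_doc
        return None


rules = FakeCollection([{"_id": "rule1", "text": "all animals are equal"}])
changed = replace_and_diff(
    rules, {"_id": "rule1"}, {"text": "some animals are more equal!"}
)
unchanged = replace_and_diff(
    rules, {"_id": "rule1"}, {"text": "some animals are more equal!"}
)
no_match = replace_and_diff(rules, {"_id": "rule2"}, {"text": "F=ma"})
print(changed, unchanged, no_match)
```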
Replace a document
Replace a document in the collection with a new one.
- Python
-
View this topic in more detail on the API Reference.
replace_result = collection.replace_one( {"Marco": {"$exists": True}}, {"Buda": "Pest"}, )Replace a document in the collection with a new one, creating a new one if no match is found.
replace_result = collection.replace_one( {"Marco": {"$exists": True}}, {"Buda": "Pest"}, upsert=True, )Returns:
UpdateResult- An object representing the response from the database after the replace operation. It includes information about the operation.Example responseUpdateResult(raw_results=[{'data': {'document': {'_id': '1', 'Marco': 'Polo'}}, 'status': {'matchedCount': 1, 'modifiedCount': 1}}], update_info={'n': 1, 'updatedExisting': True, 'ok': 1.0, 'nModified': 1})Parameters:
Name Type Summary filter
Dict[str, Any]A predicate expressed as a dictionary according to the Data API filter syntax. Examples are
{},{"name": "John"},{"price": {"$lt": 100}},{"$and": [{"name": "John"}, {"price": {"$lt": 100}}]}. See Data API operators for the full list of operators.replacement
Dict[str, Any] The new document to write into the collection.
vector
Optional[Iterable[float]]A suitable vector, meaning a list of float numbers of the appropriate dimensionality, to use vector search. That is, approximate nearest neighbor (ANN) search, as the sorting criterion. In this way, the matched document (if any) will be the one that is most similar to the provided vector. This parameter cannot be used together with
sort. See thesortexamples for more on this parameter.vectorize
Optional[str]A string to be vectorized and used as the sorting criterion in a vector search. This parameter cannot be used together with
sort. See thesortexamples for more on this parameter.sort
Optional[Dict[str, Any]]With this dictionary parameter one can control the sorting order of the documents matching the filter, effectively determining what document will come first and hence be the replaced one. See the
sortexamples for more on sorting.upsert
bool = FalseThis parameter controls the behavior in absence of matches. If True,
replacementis inserted as a new document if no matches are found on the collection. If False, the operation silently does nothing in case of no matches.max_time_ms
Optional[int]A timeout, in milliseconds, for the underlying HTTP request. This method uses the collection-level timeout by default.
Example:
from astrapy import DataAPIClient

client = DataAPIClient("TOKEN")
database = client.get_database("DB_API_ENDPOINT")
collection = database.my_collection

collection.insert_one({"Marco": "Polo"})
collection.replace_one({"Marco": {"$exists": True}}, {"Buda": "Pest"})
# prints: UpdateResult(raw_results=..., update_info={'n': 1, 'updatedExisting': True, 'ok': 1.0, 'nModified': 1})
collection.find_one({"Buda": "Pest"})
# prints: {'_id': '8424905a-...', 'Buda': 'Pest'}
collection.replace_one({"Mirco": {"$exists": True}}, {"Oh": "yeah?"})
# prints: UpdateResult(raw_results=..., update_info={'n': 0, 'updatedExisting': False, 'ok': 1.0, 'nModified': 0})
collection.replace_one({"Mirco": {"$exists": True}}, {"Oh": "yeah?"}, upsert=True)
# prints: UpdateResult(raw_results=..., update_info={'n': 1, 'updatedExisting': False, 'ok': 1.0, 'nModified': 0, 'upserted': '931b47d6-...'})
- TypeScript
-
View this topic in more detail on the API Reference.
const result = await collection.replaceOne( { 'Marco': 'Polo' }, { 'Buda': 'Pest' }, );Replace a document in the collection with a new one, creating a new one if no match is found.
const result = await collection.replaceOne( { 'Marco': 'Polo' }, { 'Buda': 'Pest' }, { upsert: true }, );Parameters:
Name Type Summary filter
A filter to select the document to replace.
replacement
The replacement document, which contains no _id field.
options?
The options for this operation.
Options (ReplaceOneOptions):
Name Type Summary
upsert (boolean): If true, creates a new document if no document matches the filter.
sort: Specifies the order in which the documents are returned. Defaults to the order in which the documents are stored on disk.
vector (number[]): An optional vector to use to perform a vector search on the collection to find the closest matching document. Equivalent to setting the $vector field in the sort field itself. The two are interchangeable, but mutually exclusive. If you really need to use both, you can set the $vector field in the sort object directly.
vectorize (string): A string to be vectorized and used as the sorting criterion in a vector search. Equivalent to setting the $vectorize field in the sort field itself. The two are interchangeable, but mutually exclusive. If you really need to use both, you can set the $vectorize field in the sort object directly.
maxTimeMS (number): The maximum time in milliseconds that the client should wait for the operation to complete for each single one of the underlying HTTP requests.
Returns:
Promise<ReplaceOneResult<Schema>>- The result of the replacement operation.Example:
import { DataAPIClient } from '@datastax/astra-db-ts'; // Reference an untyped collection const client = new DataAPIClient('TOKEN'); const db = client.db('DB_API_ENDPOINT', { keyspace: 'DB_KEYSPACE' }); const collection = db.collection('COLLECTION'); (async function () { // Insert some document await collection.insertOne({ 'Marco': 'Polo' }); // { modifiedCount: 1, matchedCount: 1, upsertedCount: 0 } await collection.replaceOne( { 'Marco': { '$exists': true } }, { 'Buda': 'Pest' } ); // { _id: '3756ce75-aaf1-430d-96ce-75aaf1730dd3', Buda: 'Pest' } await collection.findOne({ 'Buda': 'Pest' }); // { modifiedCount: 0, matchedCount: 0, upsertedCount: 0 } await collection.replaceOne( { 'Mirco': { '$exists': true } }, { 'Oh': 'yeah?' } ); // { modifiedCount: 0, matchedCount: 0, upsertedId: '...', upsertedCount: 1 } await collection.replaceOne( { 'Mirco': { '$exists': true } }, { 'Oh': 'yeah?' }, { upsert: true } ); })(); - Java
-
// Synchronous UpdateResult replaceOne(Filter filter, T replacement); UpdateResult replaceOne(Filter filter, T replacement, ReplaceOneOptions options); // Asynchronous CompletableFuture<UpdateResult> replaceOneAsync(Filter filter, T replacement); CompletableFuture<UpdateResult> replaceOneAsync(Filter filter, T replacement, ReplaceOneOptions options);Returns:
UpdateResult - A wrapper object with the result of the operation. It contains the number of documents matched (matchedCount) and updated (modifiedCount).
Parameters:
Name Type Summary filter (optional)
FilterA predicate expressed as a dictionary according to the Data API filter syntax. Examples are
{},{"name": "John"},{"price": {"$lt": 100}},{"$and": [{"name": "John"}, {"price": {"$lt": 100}}]}. See Data API operators for the full list of operators.replacement
T The document that replaces the existing one, if one exists. If the upsert flag is set to true and no document is found, this document is inserted.
options (optional)
The options for the replaceOne() operation, most notably the upsert flag.
Example:
package com.datastax.astra.client.collection;

import com.datastax.astra.client.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.client.model.Filter;
import com.datastax.astra.client.model.Filters;
// UpdateResult is assumed to be in the model package, like Filter and Document
import com.datastax.astra.client.model.UpdateResult;

import static com.datastax.astra.client.model.Filters.lt;

public class ReplaceOne {
    public static void main(String[] args) {
        Collection<Document> collection = new DataAPIClient("TOKEN")
            .getDatabase("API_ENDPOINT")
            .getCollection("COLLECTION_NAME");

        // Building a filter
        Filter filter = Filters.and(
            Filters.gt("field2", 10),
            lt("field3", 20),
            Filters.eq("field4", "value"));

        Document docForReplacement = new Document()
            .append("field1", "value1")
            .append("field2", 20)
            .append("field3", 30)
            .append("field4", "value4");

        // Replace one matching document; the result reports
        // how many documents were matched and modified
        UpdateResult result = collection
            .replaceOne(filter, docForReplacement);
    }
}
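The `update_info` dictionary shown in the Python example output distinguishes three outcomes: an existing document was replaced, nothing matched, or a new document was upserted. The following is a minimal sketch of how to tell them apart. `classify_replace` is a hypothetical helper, and it relies only on the three `update_info` shapes printed in that example. Other shapes, if the API returns any, are not covered.

```python
from typing import Any, Dict


def classify_replace(update_info: Dict[str, Any]) -> str:
    """Map an update_info dictionary to 'upserted', 'replaced' or 'no_match'.

    Hypothetical helper based on the example outputs in this section.
    """
    if "upserted" in update_info:
        # The key holds the _id of the newly inserted document.
        return "upserted"
    if update_info.get("updatedExisting"):
        return "replaced"
    return "no_match"


# The three dictionaries below are copied from the example output above.
replaced = classify_replace(
    {"n": 1, "updatedExisting": True, "ok": 1.0, "nModified": 1}
)
no_match = classify_replace(
    {"n": 0, "updatedExisting": False, "ok": 1.0, "nModified": 0}
)
upserted = classify_replace(
    {
        "n": 1,
        "updatedExisting": False,
        "ok": 1.0,
        "nModified": 0,
        "upserted": "931b47d6-...",
    }
)
print(replaced, no_match, upserted)
```

With astrapy, the dictionary would come from the `update_info` attribute of the `UpdateResult` returned by `replace_one`, as shown in the example response.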
Find and delete a document
Locate a document matching a filter condition and delete it, returning the document itself.
- Python
-
View this topic in more detail on the API Reference.
collection.find_one_and_delete({"status": "stale_entry"})Returns:
Dict[str, Any]- The document that was just deleted (or a projection thereof, as requested). If no matches are found,Noneis returned.Example response{'_id': 199, 'status': 'stale_entry', 'request_id': 'A4431'}Parameters:
Name Type Summary filter
Optional[Dict[str, Any]]A predicate expressed as a dictionary according to the Data API filter syntax. Examples are
{},{"name": "John"},{"price": {"$lt": 100}},{"$and": [{"name": "John"}, {"price": {"$lt": 100}}]}. See Data API operators for the full list of operators.projection
Optional[Union[Iterable[str], Dict[str, bool]]]Used to select a subset of fields in the documents being returned. The projection can be: an iterable over the included field names; a dictionary {field_name: True} to positively select certain fields; or a dictionary {field_name: False} if one wants to exclude specific fields from the response. Special document fields (e.g.
_id,$vector) are controlled individually. The default projection does not necessarily include all fields of the document. See theprojectionexamples for more on this parameter.vector
Optional[Iterable[float]]A suitable vector, meaning a list of float numbers of the appropriate dimensionality, to perform vector search. That is, approximate nearest neighbor (ANN) search, extracting the most similar document in the collection matching the filter. This parameter cannot be used together with
sort. See thesortexamples for more on this parameter.vectorize
Optional[str]A string to be vectorized and used as the sorting criterion in a vector search. This parameter cannot be used together with
sort. See thesortexamples for more on this parameter.sort
Optional[Dict[str, Any]]With this dictionary parameter one can control the sorting order of the documents matching the filter, effectively determining what document will come first and hence be the deleted one. See the
sortexamples for more on sorting.max_time_ms
Optional[int]A timeout, in milliseconds, for the underlying HTTP request. This method uses the collection-level timeout by default.
Example:
from astrapy import DataAPIClient

client = DataAPIClient("TOKEN")
database = client.get_database("DB_API_ENDPOINT")
collection = database.my_collection

collection.insert_many(
    [
        {"species": "swan", "class": "Aves"},
        {"species": "frog", "class": "Amphibia"},
    ],
)
collection.find_one_and_delete(
    {"species": {"$ne": "frog"}},
    projection={"species": True},
)
# prints: {'_id': '5997fb48-...', 'species': 'swan'}
collection.find_one_and_delete({"species": {"$ne": "frog"}})
# (returns None for no matches)
- TypeScript
-
View this topic in more detail on the API Reference.
const deletedDoc = await collection.findOneAndDelete({ status: 'stale_entry' });Parameters:
Name Type Summary filter
A filter to select the document to delete.
options?
The options for this operation.
Options (FindOneAndDeleteOptions):
Name Type Summary
projection: Specifies which fields should be included/excluded in the returned documents. Defaults to including all fields. When specifying a projection, it’s the user’s responsibility to handle the return type carefully. Consider type-casting. Can only be used when performing a vector search.
sort: Specifies the order in which the documents are returned. Defaults to the order in which the documents are stored on disk.
vector (number[]): An optional vector to use to perform a vector search on the collection to find the closest matching document. Equivalent to setting the $vector field in the sort field itself. The two are interchangeable, but mutually exclusive. If you really need to use both, you can set the $vector field in the sort object directly.
vectorize (string): A string to be vectorized and used as the sorting criterion in a vector search. Equivalent to setting the $vectorize field in the sort field itself. The two are interchangeable, but mutually exclusive. If you really need to use both, you can set the $vectorize field in the sort object directly.
maxTimeMS (number): The maximum time in milliseconds that the client should wait for the operation to complete for each single one of the underlying HTTP requests.
includeResultMetadata (boolean): When true, returns, alongside the document, an ok field with a value of 1 if the command executed successfully.
Returns:
Promise<WithId<Schema> | null>- The document that was deleted, ornullif no matches are found.Example:
import { DataAPIClient } from '@datastax/astra-db-ts'; // Reference an untyped collection const client = new DataAPIClient('TOKEN'); const db = client.db('DB_API_ENDPOINT', { keyspace: 'DB_KEYSPACE' }); const collection = db.collection('COLLECTION'); (async function () { // Insert some document await collection.insertMany([ { species: 'swan', class: 'Aves' }, { species: 'frog', class: 'Amphibia' }, ]); // { _id: '...', species: 'swan' } await collection.findOneAndDelete( { species: { $ne: 'frog' } }, { projection: { species: 1 } }, ); // null await collection.findOneAndDelete( { species: { $ne: 'frog' } }, ); })(); - Java
-
// Synchronous Optional<T> findOneAndDelete(Filter filter); Optional<T> findOneAndDelete(Filter filter, FindOneAndDeleteOptions options); // Asynchronous CompletableFuture<Optional<T>> findOneAndDeleteAsync(Filter filter); CompletableFuture<Optional<T>> findOneAndDeleteAsync(Filter filter, FindOneAndDeleteOptions options);Returns:
Optional<T> - The document that was deleted, or an empty Optional if no document matched the filter.
Parameters:
filter (optional) (Filter): A predicate expressed as a dictionary according to the Data API filter syntax. Examples are {}, {"name": "John"}, {"price": {"$lt": 100}}, {"$and": [{"name": "John"}, {"price": {"$lt": 100}}]}. See Data API operators for the full list of operators.
options (optional) (FindOneAndDeleteOptions): A set of options for the operation, such as a Sort clause (sort on a vector or any other field) or a Projection clause.
Example:
package com.datastax.astra.client.collection; import com.datastax.astra.client.Collection; import com.datastax.astra.client.DataAPIClient; import com.datastax.astra.client.model.Document; import com.datastax.astra.client.model.Filter; import com.datastax.astra.client.model.Filters; import java.util.Optional; import static com.datastax.astra.client.model.Filters.lt; public class FindOneAndDelete { public static void main(String[] args) { Collection<Document> collection = new DataAPIClient("TOKEN") .getDatabase("API_ENDPOINT") .getCollection("COLLECTION_NAME"); // Building a filter Filter filter = Filters.and( Filters.gt("field2", 10), lt("field3", 20), Filters.eq("field4", "value")); // It will return the document before deleting it Optional<Document> docBeforeRelease = collection.findOneAndDelete(filter); } }
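The filter examples listed in the parameter tables can be easier to read with a local model of how they select documents. The following is a minimal sketch, not the Data API implementation: the `matches` helper is hypothetical, it covers only a handful of operators, and it treats any nested dictionary as an operator clause.

```python
from typing import Any, Dict

# Comparison operators handled by this sketch (a small, assumed subset).
_COMPARATORS = {
    "$eq": lambda value, target: value == target,
    "$ne": lambda value, target: value != target,
    "$lt": lambda value, target: value is not None and value < target,
    "$lte": lambda value, target: value is not None and value <= target,
    "$gt": lambda value, target: value is not None and value > target,
    "$gte": lambda value, target: value is not None and value >= target,
}


def matches(document: Dict[str, Any], filter_: Dict[str, Any]) -> bool:
    """Return True if `document` satisfies `filter_` (local approximation)."""
    for key, condition in filter_.items():
        if key == "$and":
            # Every sub-filter must match.
            if not all(matches(document, sub) for sub in condition):
                return False
        elif isinstance(condition, dict):
            # Operator clause such as {"$lt": 100} or {"$exists": True}.
            for operator, target in condition.items():
                if operator == "$exists":
                    if (key in document) != target:
                        return False
                elif not _COMPARATORS[operator](document.get(key), target):
                    return False
        elif document.get(key) != condition:
            # Plain equality, as in {"name": "John"}.
            return False
    return True


doc = {"name": "John", "price": 80}
print(matches(doc, {}))                           # True
print(matches(doc, {"price": {"$lt": 100}}))      # True
print(matches(doc, {"$and": [{"name": "John"}, {"price": {"$lt": 50}}]}))  # False
```

An empty filter matches every document, which is why the sort or vector options decide which single document a find-and-delete or delete-one call removes.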
Delete a document
Locate and delete a single document from a collection.
- Python
-
View this topic in more detail on the API Reference.
response = collection.delete_one({"_id": "1"})
Locate and delete a single document from a collection by any attribute (as long as it is covered by the collection's indexing configuration).
result = collection.delete_one({"location": "warehouse_C"})
Locate and delete a single document from a collection by an arbitrary filtering clause.
result = collection.delete_one({"tag": {"$exists": True}})
Delete the most similar document to a given vector.
result = collection.delete_one({}, vector=[.12, .52, .32])
Generate a vector from a string and delete the most similar document.
result = collection.delete_one({}, vectorize="Text to vectorize")
Returns:
DeleteResult - An object representing the response from the database after the delete operation. It includes information about the success of the operation.
Example response:
DeleteResult(raw_results=[{'status': {'deletedCount': 1}}], deleted_count=1)
Parameters:
filter (Optional[Dict[str, Any]]): A predicate expressed as a dictionary according to the Data API filter syntax. Examples are {}, {"name": "John"}, {"price": {"$lt": 100}}, {"$and": [{"name": "John"}, {"price": {"$lt": 100}}]}. See Data API operators for the full list of operators.
vector (Optional[Iterable[float]]): A suitable vector, meaning a list of float numbers of the appropriate dimensionality, to use as the sorting criterion in a vector search (that is, an approximate nearest neighbor, or ANN, search). The matched document, if any, is the one most similar to the provided vector. This parameter cannot be used together with sort. See the sort examples for more on this parameter.
vectorize (Optional[str]): A string to be vectorized and used as the sorting criterion in a vector search. This parameter cannot be used together with sort. See the sort examples for more on this parameter.
sort (Optional[Dict[str, Any]]): A dictionary that controls the sorting order of the documents matching the filter, which determines which document comes first and is therefore the one deleted. See the sort examples for more on sorting.
max_time_ms (Optional[int]): A timeout, in milliseconds, for the underlying HTTP request. This method uses the collection-level timeout by default.
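The effect of the sort parameter on which document gets deleted can be modelled locally. This is a minimal in-memory sketch and makes no database calls: `delete_one_local` and the `ASCENDING`/`DESCENDING` constants are hypothetical names introduced for illustration only.

```python
from typing import Any, Dict, List, Optional

# Assumed local constants; they only drive this sketch.
ASCENDING, DESCENDING = 1, -1


def delete_one_local(
    documents: List[Dict[str, Any]],
    field: str,
    direction: int = ASCENDING,
) -> Optional[Dict[str, Any]]:
    """Remove and return the first document, in sort order, that has `field`."""
    candidates = [doc for doc in documents if field in doc]
    if not candidates:
        return None
    # The sort order decides which candidate comes first.
    candidates.sort(key=lambda doc: doc[field], reverse=(direction == DESCENDING))
    victim = candidates[0]
    documents.remove(victim)
    return victim


docs = [{"seq": 1}, {"seq": 0}, {"seq": 2}]
deleted = delete_one_local(docs, "seq", DESCENDING)
print(deleted)  # {'seq': 2}
print(docs)     # [{'seq': 1}, {'seq': 0}]
```

With a descending sort the document with the highest value comes first, so it is the one removed. An ascending sort would remove `{"seq": 0}` instead.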
Example:
from astrapy import DataAPIClient
import astrapy

client = DataAPIClient("TOKEN")
database = client.get_database("DB_API_ENDPOINT")
collection = database.my_collection

collection.insert_many([{"seq": 1}, {"seq": 0}, {"seq": 2}])

collection.delete_one({"seq": 1})
# prints: DeleteResult(raw_results=..., deleted_count=1)
collection.distinct("seq")
# prints: [0, 2]
collection.delete_one(
    {"seq": {"$exists": True}},
    sort={"seq": astrapy.constants.SortDocuments.DESCENDING},
)
# prints: DeleteResult(raw_results=..., deleted_count=1)
collection.distinct("seq")
# prints: [0]
collection.delete_one({"seq": 2})
# prints: DeleteResult(raw_results=..., deleted_count=0)
- TypeScript
-
View this topic in more detail on the API Reference.
const result = await collection.deleteOne({ _id: '1' });
Locate and delete a single document from a collection.
const result = await collection.deleteOne({ location: 'warehouse_C' });
Locate and delete a single document from a collection by an arbitrary filtering clause.
const result = await collection.deleteOne({ tag: { $exists: true } });
Delete the most similar document to a given vector.
const result = await collection.deleteOne({}, { vector: [.12, .52, .32] });
Generate a vector from a string and delete the most similar document.
const result = await collection.deleteOne({}, { vectorize: 'Text to vectorize' });
Parameters:
filter: A filter to select the document to delete.
options?: The options for this operation.
Options (DeleteOneOptions):
sort: Specifies the order in which the documents are returned. Defaults to the order in which the documents are stored on disk.
vector (number[]): An optional vector to use to perform a vector search on the collection to find the closest matching document. Equivalent to setting the $vector field in the sort field itself. The two are interchangeable, but mutually exclusive. If you really need to use both, you can set the $vector field in the sort object directly.
vectorize (string): A string to be vectorized and used as the sorting criterion in a vector search. Equivalent to setting the $vectorize field in the sort field itself. The two are interchangeable, but mutually exclusive. If you really need to use both, you can set the $vectorize field in the sort object directly.
maxTimeMS (number): The maximum time in milliseconds that the client should wait for each of the underlying HTTP requests to complete.
Returns:
Promise<DeleteOneResult> - The result of the deletion operation.
Example:
import { DataAPIClient } from '@datastax/astra-db-ts'; // Reference an untyped collection const client = new DataAPIClient('TOKEN'); const db = client.db('DB_API_ENDPOINT', { keyspace: 'DB_KEYSPACE' }); const collection = db.collection('COLLECTION'); (async function () { // Insert some document await collection.insertMany([{ seq: 1 }, { seq: 0 }, { seq: 2 }]); // { deletedCount: 1 } await collection.deleteOne({ seq: 1 }); // [0, 2] await collection.distinct('seq'); // { deletedCount: 1 } await collection.deleteOne({ seq: { $exists: true } }, { sort: { seq: -1 } }); // [0] await collection.distinct('seq'); // { deletedCount: 0 } await collection.deleteOne({ seq: 2 }); })(); - Java
-
// Synchronous DeleteResult deleteOne(Filter filter); DeleteResult deleteOne(Filter filter, DeleteOneOptions options); // Asynchronous CompletableFuture<DeleteResult> deleteOneAsync(Filter filter); CompletableFuture<DeleteResult> deleteOneAsync(Filter filter, DeleteOneOptions options);Returns:
DeleteResult - Wrapper that contains the deleted count.
Parameters:
filter (optional) (Filter): A predicate expressed as a dictionary according to the Data API filter syntax. Examples are {}, {"name": "John"}, {"price": {"$lt": 100}}, {"$and": [{"name": "John"}, {"price": {"$lt": 100}}]}. See Data API operators for the full list of operators.
options (optional) (DeleteOneOptions): A set of options for the operation, such as a Sort clause (sort on a vector or any other field).
Example:
package com.datastax.astra.client.collection; import com.datastax.astra.client.Collection; import com.datastax.astra.client.DataAPIClient; import com.datastax.astra.client.model.DeleteOneOptions; import com.datastax.astra.client.model.DeleteResult; import com.datastax.astra.client.model.Document; import com.datastax.astra.client.model.Filter; import com.datastax.astra.client.model.Filters; import com.datastax.astra.client.model.Sorts; import static com.datastax.astra.client.model.Filters.lt; public class DeleteOne { public static void main(String[] args) { Collection<Document> collection = new DataAPIClient("TOKEN") .getDatabase("API_ENDPOINT") .getCollection("COLLECTION_NAME"); // Sample Filter Filter filter = Filters.and( Filters.gt("field2", 10), lt("field3", 20), Filters.eq("field4", "value")); // Delete one options DeleteOneOptions options = new DeleteOneOptions() .sort(Sorts.ascending("field2")); DeleteResult result = collection.deleteOne(filter, options); System.out.println("Deleted Count:" + result.getDeletedCount()); } }
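Deleting "the most similar document to a given vector" means ranking documents by similarity to the query vector and removing the top match. The sketch below shows that ranking step with an exact, brute-force cosine similarity. It is an illustration under stated assumptions: the database performs an approximate (ANN) search, the collection's actual similarity metric may differ, and `cosine_similarity` and `most_similar` are hypothetical helpers.

```python
import math
from typing import Any, Dict, List, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(
    documents: List[Dict[str, Any]],
    query: Sequence[float],
) -> Optional[Dict[str, Any]]:
    """Return the document whose "$vector" is closest to `query`, if any."""
    if not documents:
        return None
    return max(documents, key=lambda doc: cosine_similarity(doc["$vector"], query))


docs = [
    {"_id": "a", "$vector": [1.0, 0.0, 0.0]},
    {"_id": "b", "$vector": [0.1, 0.5, 0.3]},
    {"_id": "c", "$vector": [0.0, 0.0, 1.0]},
]
target = most_similar(docs, [0.12, 0.52, 0.32])
print(target["_id"])  # b
```

The document selected this way is the one a vector-sorted delete would be expected to remove, up to the approximation inherent in ANN search.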
Delete documents
Delete multiple documents from a collection.
- Python
-
View this topic in more detail on the API Reference.
delete_result = collection.delete_many({"status": "processed"})Returns:
DeleteResult - An object representing the response from the database after the delete operation. It includes information about the success of the operation.
Example response:
DeleteResult(raw_results=[{'status': {'deletedCount': 2}}], deleted_count=2)
Parameters:
filter (Optional[Dict[str, Any]]): A predicate expressed as a dictionary according to the Data API filter syntax. Examples are {}, {"name": "John"}, {"price": {"$lt": 100}}, {"$and": [{"name": "John"}, {"price": {"$lt": 100}}]}. See Data API operators for the full list of operators. The delete_many method does not accept an empty filter: see delete_all to completely erase all contents of a collection.
max_time_ms (Optional[int]): A timeout, in milliseconds, for the operation. This method uses the collection-level timeout by default. You may need to increase the timeout duration when deleting a large number of documents, as the operation will require multiple HTTP requests in sequence.
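The reason a large deletion needs several HTTP requests in sequence can be sketched as a loop over fixed-size batches. This is a local simulation, not the client's implementation: the batch size of 20 matches the limit mentioned in the Java notes for this operation, while `delete_batch`, `delete_many_local`, and the stop condition (a batch smaller than the limit) are assumptions made for illustration.

```python
from typing import Any, Callable, Dict, List, Tuple

BATCH_SIZE = 20  # assumed per-request limit, for illustration

Document = Dict[str, Any]


def delete_batch(
    store: List[Document],
    predicate: Callable[[Document], bool],
    limit: int = BATCH_SIZE,
) -> int:
    """Stand-in for one HTTP request: delete up to `limit` matching documents."""
    victims = [doc for doc in store if predicate(doc)][:limit]
    for doc in victims:
        store.remove(doc)
    return len(victims)


def delete_many_local(
    store: List[Document],
    predicate: Callable[[Document], bool],
) -> Tuple[int, int]:
    """Repeat batched deletions; return (documents deleted, requests made)."""
    total = 0
    requests = 0
    while True:
        deleted = delete_batch(store, predicate)
        requests += 1
        total += deleted
        if deleted < BATCH_SIZE:
            # A short batch means nothing is left to delete.
            break
    return total, requests


store = [{"seq": i} for i in range(50)]
total, requests = delete_many_local(store, lambda doc: doc["seq"] < 45)
print(total, requests, len(store))  # 45 3 5
```

Because the number of requests grows with the number of matching documents, a timeout that is adequate for a small deletion may be too short for a large one.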
This method does not accept an empty filter clause; use the delete_all method to delete all documents in the collection.
Example:
from astrapy import DataAPIClient

client = DataAPIClient("TOKEN")
database = client.get_database("DB_API_ENDPOINT")
collection = database.my_collection

collection.insert_many([{"seq": 1}, {"seq": 0}, {"seq": 2}])

collection.delete_many({"seq": {"$lte": 1}})
# prints: DeleteResult(raw_results=..., deleted_count=2)
collection.distinct("seq")
# prints: [2]
collection.delete_many({"seq": {"$lte": 1}})
# prints: DeleteResult(raw_results=..., deleted_count=0)
- TypeScript
-
View this topic in more detail on the API Reference.
const result = await collection.deleteMany({ status: 'processed' });
Parameters:
filter: A filter to select the documents to delete.
options?: The options (the timeout) for this operation.
This method does not accept an empty filter clause; use the deleteAll method to delete all documents in the collection.
Returns:
Promise<DeleteManyResult> - The result of the deletion operation.
Example:
import { DataAPIClient } from '@datastax/astra-db-ts';

// Reference an untyped collection
const client = new DataAPIClient('TOKEN');
const db = client.db('DB_API_ENDPOINT', { keyspace: 'DB_KEYSPACE' });
const collection = db.collection('COLLECTION');

(async function () {
  // Insert some documents
  await collection.insertMany([{ seq: 1 }, { seq: 0 }, { seq: 2 }]);

  // { deletedCount: 2 }
  await collection.deleteMany({ seq: { $lte: 1 } });

  // [2]
  await collection.distinct('seq');

  // { deletedCount: 0 }
  await collection.deleteMany({ seq: { $lte: 1 } });
})();
- Java
-
// Synchronous DeleteResult deleteMany(Filter filter); // Asynchronous CompletableFuture<DeleteResult> deleteManyAsync(Filter filter);Returns:
DeleteResult - Wrapper that contains the deleted count.
Like a few other methods, the delete operation can delete only 20 documents at a time. As a result, deleteMany() can take time: the client iterates until it receives confirmation that no more documents matching the filter are available.
Example:
package com.datastax.astra.client.collection; import com.datastax.astra.client.Collection; import com.datastax.astra.client.DataAPIClient; import com.datastax.astra.client.model.DeleteResult; import com.datastax.astra.client.model.Document; import com.datastax.astra.client.model.Filter; import com.datastax.astra.client.model.Filters; import static com.datastax.astra.client.model.Filters.lt; public class DeleteMany { public static void main(String[] args) { Collection<Document> collection = new DataAPIClient("TOKEN") .getDatabase("API_ENDPOINT") .getCollection("COLLECTION_NAME"); // Sample Filter Filter filter = Filters.and( Filters.gt("field2", 10), lt("field3", 20), Filters.eq("field4", "value")); DeleteResult result = collection.deleteMany(filter); System.out.println("Deleted Count:" + result.getDeletedCount()); } }
Execute multiple write operations
Execute a (reusable) list of write operations on a collection with a single command.
- Python
-
View this topic in more detail on the API Reference.
bw_results = collection.bulk_write( [ InsertMany([{"a": 1}, {"a": 2}]), ReplaceOne( {"z": 9}, replacement={"z": 9, "replaced": True}, upsert=True, ), ], )Returns:
BulkWriteResult - A single object summarizing the whole list of requested operations. The keys in the map attributes of the result (when present) are the integer indices of the corresponding operation in the requests iterable.
Example response:
BulkWriteResult(bulk_api_results={0: ..., 1: ...}, deleted_count=0, inserted_count=3, matched_count=0, modified_count=0, upserted_count=1, upserted_ids={1: '2addd676-...'})
Parameters:
requests (Iterable[BaseOperation]): An iterable over concrete subclasses of BaseOperation, such as InsertMany or ReplaceOne. Each such object represents an operation ready to be executed on a collection, and is instantiated by passing the same parameters as one would pass to the corresponding collection method.
ordered (bool): Whether to launch the requests one after the other or in arbitrary order, possibly concurrently. DataStax recommends False (default) when possible for faster performance.
concurrency (Optional[int]): Maximum number of concurrent operations executing at a given time. It cannot be more than one for ordered bulk writes.
max_time_ms (Optional[int]): A timeout, in milliseconds, for the whole bulk write. This method uses the collection-level timeout by default. You may need to increase the timeout duration depending on the number of operations. If the method call times out, there is no guarantee about how much of the bulk write was completed.
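The interplay between ordered and concurrency can be illustrated with a small local runner. This is a sketch of the general pattern only, not how the client library schedules requests: `run_bulk` is a hypothetical function, operations are modelled as plain callables, and the fallback of 4 workers is an arbitrary assumption.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List, Optional

Operation = Callable[[], Any]


def run_bulk(
    operations: List[Operation],
    ordered: bool = False,
    concurrency: Optional[int] = None,
) -> Dict[int, Any]:
    """Run operations and return their results keyed by operation index."""
    if ordered:
        if concurrency not in (None, 1):
            raise ValueError("ordered bulk writes cannot use concurrency > 1")
        # One after the other; an exception stops the remaining operations.
        return {index: op() for index, op in enumerate(operations)}
    # Unordered: operations may run in parallel and finish in any order.
    with ThreadPoolExecutor(max_workers=concurrency or 4) as pool:
        futures = {index: pool.submit(op) for index, op in enumerate(operations)}
        return {index: future.result() for index, future in futures.items()}


def insert_two() -> str:
    return "inserted 2"


def upsert_one() -> str:
    return "upserted 1"


results = run_bulk([insert_two, upsert_one], ordered=False, concurrency=2)
print(results)  # {0: 'inserted 2', 1: 'upserted 1'}
```

Keying results by the index of each operation, as the returned BulkWriteResult does, keeps them attributable even when unordered execution completes them out of sequence.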
Example:
from astrapy import DataAPIClient
from astrapy.operations import (
    InsertOne,
    InsertMany,
    UpdateOne,
    UpdateMany,
    ReplaceOne,
    DeleteOne,
    DeleteMany,
)

client = DataAPIClient("TOKEN")
database = client.get_database("DB_API_ENDPOINT")
collection = database.my_collection

op1 = InsertMany([{"a": 1}, {"a": 2}])
op2 = ReplaceOne({"z": 9}, replacement={"z": 9, "replaced": True}, upsert=True)
collection.bulk_write([op1, op2])
# prints: BulkWriteResult(bulk_api_results={0: ..., 1: ...}, deleted_count=0, inserted_count=3, matched_count=0, modified_count=0, upserted_count=1, upserted_ids={1: '2addd676-...'})
collection.count_documents({}, upper_bound=100)
# prints: 3
collection.distinct("replaced")
# prints: [True]
- TypeScript
-
View this topic in more detail on the API Reference.
const results = await collection.bulkWrite([
  { insertOne: { document: { a: '1' } } },
  { insertOne: { document: { a: '2' } } },
  { replaceOne: { filter: { z: '9' }, replacement: { z: '9', replaced: true }, upsert: true } },
]);
Parameters:
operations: The operations to perform.
options?: The options for this operation.
Options (BulkWriteOptions):
ordered (boolean): Set the ordered option to true to stop the operation after the first error; otherwise all operations may be parallelized and processed in arbitrary order, which can substantially improve performance.
concurrency (number): Controls how many network requests are made in parallel on unordered operations. Defaults to 8. Not available for ordered operations.
maxTimeMS (number): The maximum time in milliseconds that the client should wait for the operation to complete.
Returns:
Promise<BulkWriteResult<Schema>> - A promise that resolves to a summary of the performed operations.
Example:
import { DataAPIClient } from '@datastax/astra-db-ts'; // Reference an untyped collection const client = new DataAPIClient('TOKEN'); const db = client.db('DB_API_ENDPOINT', { keyspace: 'DB_KEYSPACE' }); const collection = db.collection('COLLECTION'); (async function () { // Insert some document await collection.bulkWrite([ { insertOne: { document: { a: 1 } } }, { insertOne: { document: { a: 2 } } }, { replaceOne: { filter: { z: 9 }, replacement: { z: 9, replaced: true }, upsert: true } }, ]); // 3 await collection.countDocuments({}, 100); // [true] await collection.distinct('replaced'); })(); - Java
-
// Synchronous BulkWriteResult bulkWrite(List<Command> commands); BulkWriteResult bulkWrite(List<Command> commands, BulkWriteOptions options); // Asynchronous CompletableFuture<BulkWriteResult> bulkWriteAsync(List<Command> commands); CompletableFuture<BulkWriteResult> bulkWriteAsync(List<Command> commands, BulkWriteOptions options);Returns:
BulkWriteResult - Wrapper with the list of responses for each command.
Parameters:
commands (List<Command>): The list of generic Command objects to execute.
options (optional) (BulkWriteOptions): A set of options for those commands, such as ordered or concurrency.
Example:
package com.datastax.astra.client.collection; import com.datastax.astra.client.Collection; import com.datastax.astra.client.DataAPIClient; import com.datastax.astra.client.model.BulkWriteOptions; import com.datastax.astra.client.model.BulkWriteResult; import com.datastax.astra.client.model.Command; import com.datastax.astra.client.model.Document; import com.datastax.astra.internal.api.ApiResponse; import java.util.List; public class BulkWrite { public static void main(String[] args) { Collection<Document> collection = new DataAPIClient("TOKEN") .getDatabase("API_ENDPOINT") .getCollection("COLLECTION_NAME"); // Set a couple of Commands Command cmd1 = Command.create("insertOne").withDocument(new Document().id(1).append("name", "hello")); Command cmd2 = Command.create("insertOne").withDocument(new Document().id(2).append("name", "hello")); // Set the options for the bulk write BulkWriteOptions options1 = BulkWriteOptions.Builder.ordered(false).concurrency(1); // Execute the queries BulkWriteResult result = collection.bulkWrite(List.of(cmd1, cmd2), options1); // Retrieve the LIST of responses for(ApiResponse res : result.getResponses()) { System.out.println(res.getData()); } } }
Delete all documents from a collection
Delete all documents in a collection.
- Python
-
View this topic in more detail on the API Reference.
result = collection.delete_all()Returns:
Dict - A dictionary in the form {"ok": 1} if the method succeeds.
Parameters:
max_time_ms (Optional[int]): A timeout, in milliseconds, for the underlying HTTP request. If not passed, the collection-level setting is used instead.
Example:
from astrapy import DataAPIClient

client = DataAPIClient("TOKEN")
database = client.get_database("DB_API_ENDPOINT")
collection = database.my_collection

collection.distinct("seq")
# prints: [2, 1, 0]
collection.count_documents({}, upper_bound=100)
# prints: 4
collection.delete_all()
# prints: {'ok': 1}
collection.count_documents({}, upper_bound=100)
# prints: 0
- TypeScript
-
View this topic in more detail on the API Reference.
await collection.deleteAll();
- Java
-
// Synchronous DeleteResult deleteAll(); // Asynchronous CompletableFuture<DeleteResult> deleteAllAsync();Returns:
DeleteResult - Wrapper that contains the deleted count.
Like a few other methods, the delete operation can delete only 20 documents at a time. A deleteAll() amounts to executing a deleteMany() without any filter, so the operation can take time: the client iterates until it receives confirmation that no more documents are available.
Example:
package com.datastax.astra.client.collection; import com.datastax.astra.client.Collection; import com.datastax.astra.client.DataAPIClient; import com.datastax.astra.client.model.DeleteResult; import com.datastax.astra.client.model.Document; public class DeleteAll { public static void main(String[] args) { Collection<Document> collection = new DataAPIClient("TOKEN") .getDatabase("API_ENDPOINT") .getCollection("COLLECTION_NAME"); // Show the deleted count DeleteResult result = collection.deleteAll(); System.out.println("Deleted Count:" + result.getDeletedCount()); } }