Find documents
Finds documents in a collection using filter and sort clauses.
Method signature
- Python
- TypeScript
- Java
- curl
collection.find(
filter: Dict[str, Any],
*,
projection: Iterable[str] | Dict[str, bool],
skip: int,
limit: int,
include_similarity: bool,
include_sort_vector: bool,
sort: Dict[str, Any],
max_time_ms: int,
) -> Cursor
collection.find(
filter: Filter<Schema>,
options?: {
sort?: Sort,
projection?: Projection,
limit?: number,
skip?: number,
includeSimilarity?: boolean,
includeSortVector?: boolean,
maxTimeMS?: number,
},
): FindCursor<FoundDoc<Schema>, FoundDoc<Schema>>
FindIterable<T> find(Filter filter, FindOptions options)
FindIterable<T> find()
FindIterable<T> find(Filter filter)
FindIterable<T> find(FindOptions options)
curl -sS -L -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_KEYSPACE/COLLECTION_NAME" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"find": {
"filter": FILTER,
"sort": SORT,
"projection": PROJECTION,
"options": {
"includeSimilarity": BOOLEAN,
"includeSortVector": BOOLEAN,
"skip": INTEGER,
"limit": INTEGER
}
}
}'
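The curl form shows how the clauses nest: filter, sort, and projection sit directly under find, while the remaining settings sit under options. The following is a minimal Python sketch of assembling such a request body. The helper name build_find_payload and its parameter names are hypothetical and are not part of any client library.

```python
import json
from typing import Any, Dict, Optional


def build_find_payload(
    filter_clause: Optional[Dict[str, Any]] = None,
    sort: Optional[Dict[str, Any]] = None,
    projection: Optional[Dict[str, Any]] = None,
    **options: Any,
) -> str:
    """Hypothetical helper: build a JSON body shaped like the curl signature."""
    command: Dict[str, Any] = {}
    if filter_clause is not None:
        command["filter"] = filter_clause
    if sort is not None:
        command["sort"] = sort
    if projection is not None:
        command["projection"] = projection
    if options:
        # skip, limit, includeSimilarity, and includeSortVector nest under "options"
        command["options"] = options
    return json.dumps({"find": command})


payload = build_find_payload(
    filter_clause={"isCheckedOut": False},
    sort={"title": 1},
    limit=5,
)
print(payload)
```

Clauses that are not supplied are left out of the body rather than sent as null, which mirrors the fact that every clause is described as optional.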
Result
- Python
- TypeScript
- Java
- curl
Returns a cursor (Cursor) for iterating over documents that match the specified filter and sort clauses.
The fields included in the returned documents depend on the subset of fields that were requested in the projection.
If requested and applicable, each document will also include a $similarity key with a numeric similarity score that represents the closeness of the sort vector and the document’s vector.
If requested when executing a vector search with vectorize, the result will also include the sort vector.
The cursor is compatible with for loops.
You must iterate over the cursor to fetch matching documents.
The cursor transitions through the following statuses:
- initialized: no documents have been consumed
- running: some but not all of the documents have been consumed
- exhausted: all documents have been consumed
If you need a list of all results, you can call list() on the cursor instead of iterating over the cursor. However, the time and memory required for this operation depend on the number of results.
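To make the status transitions concrete, here is a toy model of a lazily consumed cursor. SketchCursor is a hypothetical stand-in written for illustration only. It is not the astrapy Cursor class, and the real cursor may report its state through a different attribute.

```python
from typing import Any, Dict, Iterable, Iterator


class SketchCursor:
    """Toy stand-in for a find cursor (an assumption, not the library class)."""

    def __init__(self, documents: Iterable[Dict[str, Any]]) -> None:
        self._source = iter(documents)
        self.state = "initialized"  # nothing consumed yet

    def __iter__(self) -> Iterator[Dict[str, Any]]:
        return self

    def __next__(self) -> Dict[str, Any]:
        try:
            document = next(self._source)
        except StopIteration:
            # This toy only learns it is exhausted when asked for one more item
            self.state = "exhausted"
            raise
        self.state = "running"
        return document


cursor = SketchCursor([{"_id": 1}, {"_id": 2}])
states = [cursor.state]
for _ in cursor:
    states.append(cursor.state)
states.append(cursor.state)
print(states)

# Materializing everything at once costs memory proportional to the result size
all_documents = list(SketchCursor([{"_id": 1}, {"_id": 2}]))
```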
Returns a cursor (FindCursor<FoundDoc<Schema>>) for iterating over documents that match the specified filter and sort clauses.
The fields included in the returned documents depend on the subset of fields that were requested in the projection.
If requested and applicable, each document will also include a $similarity key with a numeric similarity score that represents the closeness of the sort vector and the document’s vector.
If requested when executing a vector search with vectorize, the result will also include the sort vector.
The cursor is compatible with for loops.
You must iterate over the cursor to fetch matching documents.
The cursor transitions through the following statuses:
- initialized: no documents have been consumed
- running: some but not all of the documents have been consumed
- exhausted: all documents have been consumed
If you need a list of all results, you can call list() on the cursor instead of iterating over the cursor. However, the time and memory required for this operation depend on the number of results.
Returns a cursor (FindIterable<T>) for iterating over documents that match the specified filter and sort clauses.
The fields included in the returned documents depend on the subset of fields that were requested in the projection.
If requested and applicable, each document will also include a $similarity key with a numeric similarity score that represents the closeness of the sort vector and the document’s vector.
If requested when executing a vector search with vectorize, the result will also include the sort vector.
The cursor is an Iterable and is compatible with for loops.
You must iterate over the cursor to fetch matching documents.
If you need a list of all results, you can use .all() to exhaust the cursor. However, the time and memory required for this operation depend on the number of results.
The response includes a data.documents property, which is an array of objects representing documents that match the specified filter and sort clauses.
The fields included in the returned documents depend on the subset of fields that were requested in the projection.
If requested and applicable, each document will also include a $similarity key with a numeric similarity score that represents the closeness of the sort vector and the document’s vector.
If the query supports pagination, the response also includes a data.nextPageState property, which indicates the ID of the next page of results, if any.
For non-vector searches, the results will be paginated if more than 20 documents match the specified filter and sort clauses.
For vector search (with $vector or $vectorize), the response is a single page of up to 1000 documents (or fewer, if you specify a lower limit) rather than multiple pages.
If requested when executing a vector search with vectorize, the result also includes a status.sortVector property, which is the sort vector used for the search.
Example response:
{
"data": {
"documents":[
{
"_id":"85a54382-9227-4075-a543-829227407556",
"title":"Within Silence of the Past",
"isCheckedOut":false
},
{
"_id":"aa762475-4fc1-4477-b624-754fc1f477c7",
"title":"Beyond Dreams and Forgotten Worlds",
"isCheckedOut":false
}
],
"nextPageState":"LQAAAAEBAAAAJGQ2OTk5NzY2LTgyODQtNDc3Mi05OTk3LTY2ODI4NGU3NzJjYQDwf///6wA="
}
}
Example response if no documents were found:
{
"data": {
"documents": [],
"nextPageState": null
}
}
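The pagination behavior described above can be sketched as a loop that keeps requesting pages until nextPageState is null. In this sketch, fetch_page is a hypothetical in-memory stand-in for one HTTP find call, and the way the page state is passed back to the server is an assumption. Only the response shape (data.documents, data.nextPageState) and the page size of 20 come from the text.

```python
from typing import Any, Dict, List, Optional

DOCS = [{"_id": i} for i in range(45)]  # pretend collection contents
PAGE_SIZE = 20  # page size stated for non-vector searches


def fetch_page(page_state: Optional[str]) -> Dict[str, Any]:
    """Hypothetical stand-in for one find request; returns the documented response shape."""
    start = int(page_state) if page_state else 0
    end = start + PAGE_SIZE
    next_state = str(end) if end < len(DOCS) else None
    return {"data": {"documents": DOCS[start:end], "nextPageState": next_state}}


def fetch_all() -> List[Dict[str, Any]]:
    """Follow nextPageState until the server reports no further page."""
    documents: List[Dict[str, Any]] = []
    page_state: Optional[str] = None
    while True:
        data = fetch_page(page_state)["data"]
        documents.extend(data["documents"])
        page_state = data.get("nextPageState")
        if page_state is None:
            break
    return documents


all_documents = fetch_all()
print(len(all_documents))
```

Treat the page state as an opaque token: the loop only checks whether it is null and never inspects its contents.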
Parameters
- Python
- TypeScript
- Java
- curl
Name | Type | Summary |
---|---|---|
filter | Dict[str, Any] | Optional. An object that defines filter criteria using the Data API filter syntax. The method only finds documents that match the filter criteria. For a list of available filter operators and more examples, see Data API operators. Filters can use only indexed fields. If you apply selective indexing when you create a collection, you can’t reference non-indexed fields in a filter. |
projection | Iterable[str] or Dict[str, bool] | Optional. Controls which fields are included or excluded in the returned document. For more information, see Projection operations. Default: The default projection for the collection. Certain fields, like $vector and $vectorize, are excluded by default and will only be returned if you specify that they should be included. Certain fields, like _id, are included by default. |
skip | int | Optional. The number of documents to bypass (skip) before returning documents. This parameter only applies if you also explicitly specify an ascending or descending sort criterion. This parameter is not valid with vector search. |
limit | int | Optional. The maximum number of documents to fetch. This limits the total number of documents returned. For vector search, a lower limit reduces the accuracy of the search and the time required for the search. |
include_similarity | bool | Optional. Whether the response should include a $similarity property for each document. This parameter only applies if you use a vector search. Default: False |
include_sort_vector | bool | Optional. Whether the result should include the sort vector. This can be useful if you do a vector search with $vectorize, since you don’t know the sort vector in advance. This parameter only applies if you use a vector search. Default: False |
sort | Dict[str, Any] | Optional. Sorts documents by one or more fields, or performs a vector search. For more information, see Sort clauses. Sort clauses can use only indexed fields. If you apply selective indexing when you create a collection, you can’t reference non-indexed fields in sort queries. For vector searches, this parameter can use either $vector or $vectorize. |
max_time_ms | int | Optional. The maximum time, in milliseconds, that the client should wait for the underlying HTTP request. Default: The default value for the collection. This default is 30 seconds unless you specified a different default at initialization. |
Name | Type | Summary |
---|---|---|
filter | Filter<Schema> | An object that defines filter criteria using the Data API filter syntax. The method only finds documents that match the filter criteria. For a list of available filter operators and more examples, see Data API operators. Filters can use only indexed fields. If you apply selective indexing when you create a collection, you can’t reference non-indexed fields in a filter. |
options | object | Optional. The options for this operation. See the properties in the following table. |
Name | Type | Summary |
---|---|---|
projection | Projection | Optional. Controls which fields are included or excluded in the returned document. For more information, see Projection operations. Default: The default projection for the collection. Certain fields, like $vector and $vectorize, are excluded by default and will only be returned if you specify that they should be included. Certain fields, like _id, are included by default. |
includeSimilarity | boolean | Optional. Whether the response should include a $similarity property for each document. This parameter only applies if you use a vector search. Default: False |
includeSortVector | boolean | Optional. Whether the result should include the sort vector. This can be useful if you do a vector search with $vectorize, since you don’t know the sort vector in advance. This parameter only applies if you use a vector search. Default: False |
sort | Sort | Optional. Sorts documents by one or more fields, or performs a vector search. For more information, see Sort clauses. Sort clauses can use only indexed fields. If you apply selective indexing when you create a collection, you can’t reference non-indexed fields in sort queries. For vector searches, this parameter can use either $vector or $vectorize. |
skip | number | Optional. The number of documents to bypass (skip) before returning documents. This parameter only applies if you also explicitly specify an ascending or descending sort criterion. This parameter is not valid with vector search. |
limit | number | Optional. The maximum number of documents to fetch. For vector search, a lower limit reduces the accuracy of the search and the time required for the search. |
maxTimeMS | number | Optional. The maximum time, in milliseconds, that the client should wait for the underlying HTTP request. Default: The default value for the collection. This default is 30 seconds unless you specified a different default at initialization. |
Name | Type | Summary |
---|---|---|
filter | Filter | Optional. An object that defines filter criteria using the Data API filter syntax. The method only finds documents that match the filter criteria. For a list of available filter operators and more examples, see Data API operators. Filters can use only indexed fields. If you apply selective indexing when you create a collection, you can’t reference non-indexed fields in a filter. |
options | FindOptions | Optional. The options for this operation. See the methods of the FindOptions class in the following table. |
Method | Parameters | Summary |
---|---|---|
sort() | A float[] vector, a String to vectorize, or sort criteria such as Sorts.ascending("rating") | Optional. Sorts documents by one or more fields, or performs a vector search. For more information, see Sort clauses. Sort clauses can use only indexed fields. If you apply selective indexing when you create a collection, you can’t reference non-indexed fields in sort queries. For vector searches, this parameter can use either $vector or $vectorize. |
projection() | Projection criteria such as Projections.include("title") | Optional. Controls which fields are included or excluded in the returned document. For more information, see Projection operations. Default: The default projection for the collection. Certain fields, like $vector and $vectorize, are excluded by default and will only be returned if you specify that they should be included. Certain fields, like _id, are included by default. |
includeSimilarity() | None | Optional. Whether the response should include a $similarity property for each document. This parameter only applies if you use a vector search. Default: False |
includeSortVector() | None | Optional. Whether the result should include the sort vector. This can be useful if you do a vector search with $vectorize, since you don’t know the sort vector in advance. This parameter only applies if you use a vector search. Default: False |
skip() | int | Optional. The number of documents to bypass (skip) before returning documents. This parameter only applies if you also explicitly specify an ascending or descending sort criterion. This parameter is not valid with vector search. |
limit() | int | Optional. The maximum number of documents to fetch. For vector search, a lower limit reduces the accuracy of the search and the time required for the search. |
Use the find command with these parameters:
Name | Type | Summary |
---|---|---|
filter | object | Optional. An object that defines filter criteria using the Data API filter syntax. The method only finds documents that match the filter criteria. For a list of available filter operators and more examples, see Data API operators. Filters can use only indexed fields. If you apply selective indexing when you create a collection, you can’t reference non-indexed fields in a filter. |
sort | object | Optional. Sorts documents by one or more fields, or performs a vector search. For more information, see Sort clauses. Sort clauses can use only indexed fields. If you apply selective indexing when you create a collection, you can’t reference non-indexed fields in sort queries. For vector searches, this parameter can use either $vector or $vectorize. |
projection | object | Optional. Controls which fields are included or excluded in the returned document. For more information, see Projection operations. Default: The default projection for the collection. Certain fields, like $vector and $vectorize, are excluded by default and will only be returned if you specify that they should be included. Certain fields, like _id, are included by default. |
options | object | Optional. The options for this operation. See the properties in the following table. |
Name | Type | Summary |
---|---|---|
options.includeSimilarity | boolean | Optional. Whether the response should include a $similarity property for each document. This parameter only applies if you use a vector search. Default: False |
options.includeSortVector | boolean | Optional. Whether the result should include the sort vector. This can be useful if you do a vector search with $vectorize, since you don’t know the sort vector in advance. This parameter only applies if you use a vector search. Default: False |
options.skip | integer | Optional. The number of documents to bypass (skip) before returning documents. This parameter only applies if you also explicitly specify an ascending or descending sort criterion. This parameter is not valid with vector search. |
options.limit | integer | Optional. The maximum number of documents to fetch. For vector search, a lower limit reduces the accuracy of the search and the time required for the search. |
|
|
Optional.
The value of |
Examples
The following examples demonstrate how to find documents in a collection.
Use filters to find documents
You can use a filter to find documents that match specific criteria.
For example, you can find documents with an isCheckedOut value of false and a numberOfPages value less than 300.
For a list of available filter operators and more examples, see Data API operators.
Filters can use only indexed fields. If you apply selective indexing when you create a collection, you can’t reference non-indexed fields in a filter.
- Python
- TypeScript
- Java
- curl
from astrapy import DataAPIClient
# Get an existing collection
client = DataAPIClient("ASTRA_DB_APPLICATION_TOKEN")
database = client.get_database("ASTRA_DB_API_ENDPOINT")
collection = database.get_collection("COLLECTION_NAME")
# Find documents
cursor = collection.find(
    {
        "$and": [
            {"isCheckedOut": False},
            {"numberOfPages": {"$lt": 300}},
        ]
    }
)
# Iterate over the found documents
for document in cursor:
    print(document)
import { DataAPIClient } from '@datastax/astra-db-ts';
// Get an existing collection
const client = new DataAPIClient('ASTRA_DB_APPLICATION_TOKEN');
const database = client.db('ASTRA_DB_API_ENDPOINT');
const collection = database.collection('COLLECTION_NAME');
(async function () {
// Find documents
const cursor = collection.find({
$and: [{ isCheckedOut: false }, { numberOfPages: { $lt: 300 } }],
});
// Iterate over the found documents
for await (const document of cursor) {
console.log(document);
}
})();
package com.datastax.astra.client.collection;
import com.datastax.astra.client.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.client.model.Filter;
import com.datastax.astra.client.model.Filters;
import com.datastax.astra.client.model.FindOptions;
import com.datastax.astra.client.model.FindIterable;
public class Find {
public static void main(String[] args) {
// Get an existing collection
Collection<Document> collection = new DataAPIClient("ASTRA_DB_APPLICATION_TOKEN")
.getDatabase("ASTRA_DB_API_ENDPOINT")
.getCollection("COLLECTION_NAME");
// Find documents
Filter filter = Filters.and(
Filters.eq("isCheckedOut", false),
Filters.lt("numberOfPages", 300));
FindIterable<Document> cursor = collection.find(filter);
// Iterate over the found documents
for (Document document : cursor) {
System.out.println(document);
}
}
}
curl -sS -L -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_KEYSPACE/ASTRA_DB_COLLECTION" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"find": {
"filter": {"$and": [
{"isCheckedOut": false},
{"numberOfPages": {"$lt": 300}}
]}
}
}'
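To illustrate what the filter in these examples selects, the following sketch evaluates the same $and clause against in-memory documents. The matches function is a hypothetical, local approximation that covers only equality, $lt, and $and on top-level fields. The server-side filter syntax supports more operators and may differ in edge cases.

```python
from typing import Any, Dict, List


def matches(document: Dict[str, Any], filter_clause: Dict[str, Any]) -> bool:
    """Approximate a small subset of the filter syntax: equality, $lt, and $and."""
    for key, condition in filter_clause.items():
        if key == "$and":
            # Every sub-clause must hold
            if not all(matches(document, clause) for clause in condition):
                return False
        elif isinstance(condition, dict) and "$lt" in condition:
            value = document.get(key)
            if value is None or not value < condition["$lt"]:
                return False
        elif document.get(key) != condition:
            return False
    return True


books: List[Dict[str, Any]] = [
    {"title": "Short and available", "isCheckedOut": False, "numberOfPages": 120},
    {"title": "Short but checked out", "isCheckedOut": True, "numberOfPages": 150},
    {"title": "Long and available", "isCheckedOut": False, "numberOfPages": 716},
]
filter_clause = {
    "$and": [
        {"isCheckedOut": False},
        {"numberOfPages": {"$lt": 300}},
    ]
}
found = [book["title"] for book in books if matches(book, filter_clause)]
print(found)
```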
Use vector search to find documents
To find the documents whose $vector value is most similar to a given vector, use a sort with the vector embeddings that you want to match. For more information, see Perform a vector search.
Vector search is only available for vector-enabled collections. For more information, see Vector and vectorize.
- Python
- TypeScript
- Java
- curl
from astrapy import DataAPIClient
# Get an existing collection
client = DataAPIClient("ASTRA_DB_APPLICATION_TOKEN")
database = client.get_database("ASTRA_DB_API_ENDPOINT")
collection = database.get_collection("COLLECTION_NAME")
# Find documents
cursor = collection.find(
    {},
    sort={"$vector": [.12, .52, .32]}
)
# Iterate over the found documents
for document in cursor:
    print(document)
import { DataAPIClient } from '@datastax/astra-db-ts';
// Get an existing collection
const client = new DataAPIClient('ASTRA_DB_APPLICATION_TOKEN');
const database = client.db('ASTRA_DB_API_ENDPOINT');
const collection = database.collection('COLLECTION_NAME');
// Find documents
(async function () {
const cursor = collection.find(
{},
{ sort: { $vector: [.12, .52, .32] } }
);
// Iterate over the found documents
for await (const document of cursor) {
console.log(document);
}
})();
package com.datastax.astra.client.collection;
import com.datastax.astra.client.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.client.model.Filter;
import com.datastax.astra.client.model.Filters;
import com.datastax.astra.client.model.FindOptions;
import com.datastax.astra.client.model.FindIterable;
public class Find {
public static void main(String[] args) {
// Get an existing collection
Collection<Document> collection = new DataAPIClient("ASTRA_DB_APPLICATION_TOKEN")
.getDatabase("ASTRA_DB_API_ENDPOINT")
.getCollection("COLLECTION_NAME");
// Find documents
FindOptions options = new FindOptions()
.sort(new float[] {0.12f, 0.52f, 0.32f});
FindIterable<Document> cursor = collection.find(options);
// Iterate over the found documents
for (Document document : cursor) {
System.out.println(document);
}
}
}
curl -sS -L -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_KEYSPACE/ASTRA_DB_COLLECTION" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"find": {
"sort": { "$vector": [0.12, 0.52, 0.32] }
}
}'
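As a rough picture of what "most similar" means, the following sketch ranks a few in-memory documents by cosine similarity to a query vector. Cosine is an assumption here. The metric actually used depends on how the collection was configured, and the $similarity score returned by the service may be scaled differently from the raw cosine value.

```python
import math
from typing import Any, Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


query = [0.12, 0.52, 0.32]
documents: List[Dict[str, Any]] = [
    {"_id": "a", "$vector": [0.12, 0.52, 0.32]},
    {"_id": "b", "$vector": [0.90, 0.10, 0.05]},
    {"_id": "c", "$vector": [0.10, 0.50, 0.40]},
]

# Most similar first, which is the order a vector sort is meant to produce
ranked = sorted(
    documents,
    key=lambda doc: cosine_similarity(query, doc["$vector"]),
    reverse=True,
)
ranked_ids = [doc["_id"] for doc in ranked]
print(ranked_ids)
```

A real vector search uses an index rather than scoring every document, so this brute-force ranking only illustrates the ordering, not the mechanism.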
Use vector search and vectorize to find documents
To find the documents whose $vector value is most similar to the $vector value of a given search string, use a sort with the search string that you want to vectorize and match. For more information, see Perform a vector search.
Vector search with vectorize is only available for collections that have vectorize enabled. For more information, see Vector and vectorize.
- Python
- TypeScript
- Java
- curl
from astrapy import DataAPIClient
# Get an existing collection
client = DataAPIClient("ASTRA_DB_APPLICATION_TOKEN")
database = client.get_database("ASTRA_DB_API_ENDPOINT")
collection = database.get_collection("COLLECTION_NAME")
# Find documents
cursor = collection.find(
    {},
    sort={"$vectorize": "Text to vectorize"}
)
# Iterate over the found documents
for document in cursor:
    print(document)
import { DataAPIClient } from '@datastax/astra-db-ts';
// Get an existing collection
const client = new DataAPIClient('ASTRA_DB_APPLICATION_TOKEN');
const database = client.db('ASTRA_DB_API_ENDPOINT');
const collection = database.collection('COLLECTION_NAME');
// Find documents
(async function () {
const cursor = collection.find(
{},
{ sort: { $vectorize: "Text to vectorize" } }
);
// Iterate over the found documents
for await (const document of cursor) {
console.log(document);
}
})();
package com.datastax.astra.client.collection;
import com.datastax.astra.client.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.client.model.Filter;
import com.datastax.astra.client.model.Filters;
import com.datastax.astra.client.model.FindOptions;
import com.datastax.astra.client.model.FindIterable;
public class Find {
public static void main(String[] args) {
// Get an existing collection
Collection<Document> collection = new DataAPIClient("ASTRA_DB_APPLICATION_TOKEN")
.getDatabase("ASTRA_DB_API_ENDPOINT")
.getCollection("COLLECTION_NAME");
// Find documents
FindOptions options = new FindOptions()
.sort("Text to vectorize");
FindIterable<Document> cursor = collection.find(options);
// Iterate over the found documents
for (Document document : cursor) {
System.out.println(document);
}
}
}
curl -sS -L -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_KEYSPACE/ASTRA_DB_COLLECTION" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"find": {
"sort": { "$vectorize": "Text to vectorize" }
}
}'
Use sorting to find documents
You can use a sort clause to sort documents by one or more fields.
For more information, see Sort clauses.
Sort clauses can use only indexed fields. If you apply selective indexing when you create a collection, you can’t reference non-indexed fields in sort queries.
- Python
- TypeScript
- Java
- curl
from astrapy import DataAPIClient
from astrapy.constants import SortDocuments
# Get an existing collection
client = DataAPIClient("ASTRA_DB_APPLICATION_TOKEN")
database = client.get_database("ASTRA_DB_API_ENDPOINT")
collection = database.get_collection("COLLECTION_NAME")
# Find documents
cursor = collection.find(
    {"metadata.language": "English"},
    sort={
        "rating": SortDocuments.ASCENDING,
        "title": SortDocuments.DESCENDING,
    }
)
# Iterate over the found documents
for document in cursor:
    print(document)
import { DataAPIClient } from '@datastax/astra-db-ts';
// Get an existing collection
const client = new DataAPIClient('ASTRA_DB_APPLICATION_TOKEN');
const database = client.db('ASTRA_DB_API_ENDPOINT');
const collection = database.collection('COLLECTION_NAME');
// Find documents
(async function () {
const cursor = collection.find(
{ "metadata.language": "English" },
{ sort: {
rating: 1, // ascending
title: -1 // descending
} }
);
// Iterate over the found documents
for await (const document of cursor) {
console.log(document);
}
})();
package com.datastax.astra.client.collection;
import com.datastax.astra.client.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.client.model.Filter;
import com.datastax.astra.client.model.Filters;
import com.datastax.astra.client.model.FindOptions;
import com.datastax.astra.client.model.FindIterable;
import com.datastax.astra.client.model.Sorts;
public class Find {
public static void main(String[] args) {
// Get an existing collection
Collection<Document> collection = new DataAPIClient("ASTRA_DB_APPLICATION_TOKEN")
.getDatabase("ASTRA_DB_API_ENDPOINT")
.getCollection("COLLECTION_NAME");
// Find documents
Filter filter = Filters.eq("metadata.language", "English");
FindOptions options = new FindOptions()
.sort(Sorts.ascending("rating"))
.sort(Sorts.descending("title"));
FindIterable<Document> cursor = collection.find(filter, options);
// Iterate over the found documents
for (Document document : cursor) {
System.out.println(document);
}
}
}
curl -sS -L -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_KEYSPACE/ASTRA_DB_COLLECTION" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"find": {
"filter": { "metadata.language": "English" },
"sort": {
"rating": 1,
"title": -1
}
}
}'
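The two-field sort in these examples (rating ascending, then title descending) can be mimicked locally to see the resulting order. This sketch assumes the first sort key takes priority and the second breaks ties, which is the usual reading of a multi-field sort clause. The sample documents are invented for illustration.

```python
from typing import Any, Dict, List

books: List[Dict[str, Any]] = [
    {"title": "Beyond Dreams", "rating": 4},
    {"title": "Within Silence", "rating": 4},
    {"title": "Hidden Shadows", "rating": 3},
]

# Python's sort is stable, so sort by the secondary key first (title, descending),
# then by the primary key (rating, ascending)
by_title_desc = sorted(books, key=lambda book: book["title"], reverse=True)
ordered = sorted(by_title_desc, key=lambda book: book["rating"])

titles = [book["title"] for book in ordered]
print(titles)
```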
Use an empty filter to find all documents
To find all documents, use an empty filter.
You should avoid this if you have a large number of documents.
- Python
- TypeScript
- Java
- curl
from astrapy import DataAPIClient
# Get an existing collection
client = DataAPIClient("ASTRA_DB_APPLICATION_TOKEN")
database = client.get_database("ASTRA_DB_API_ENDPOINT")
collection = database.get_collection("COLLECTION_NAME")
# Find documents
cursor = collection.find({})
# Iterate over the found documents
for document in cursor:
    print(document)
import { DataAPIClient } from '@datastax/astra-db-ts';
// Get an existing collection
const client = new DataAPIClient('ASTRA_DB_APPLICATION_TOKEN');
const database = client.db('ASTRA_DB_API_ENDPOINT');
const collection = database.collection('COLLECTION_NAME');
(async function () {
// Find documents
const cursor = collection.find({});
// Iterate over the found documents
for await (const document of cursor) {
console.log(document);
}
})();
package com.datastax.astra.client.collection;
import com.datastax.astra.client.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.client.model.Filter;
import com.datastax.astra.client.model.Filters;
import com.datastax.astra.client.model.FindOptions;
import com.datastax.astra.client.model.FindIterable;
public class Find {
public static void main(String[] args) {
// Get an existing collection
Collection<Document> collection = new DataAPIClient("ASTRA_DB_APPLICATION_TOKEN")
.getDatabase("ASTRA_DB_API_ENDPOINT")
.getCollection("COLLECTION_NAME");
// Find documents
FindIterable<Document> cursor = collection.find();
// Iterate over the found documents
for (Document document : cursor) {
System.out.println(document);
}
}
}
curl -sS -L -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_KEYSPACE/ASTRA_DB_COLLECTION" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"find": {
"filter": {}
}
}'
Include the similarity score with the result
If you use a vector search to find documents, you can also include a $similarity property for each document in the result. The $similarity value represents the closeness of the sort vector and the document’s vector.
- Python
- TypeScript
- Java
- curl
from astrapy import DataAPIClient
# Get an existing collection
client = DataAPIClient("ASTRA_DB_APPLICATION_TOKEN")
database = client.get_database("ASTRA_DB_API_ENDPOINT")
collection = database.get_collection("COLLECTION_NAME")
# Find documents
cursor = collection.find(
    {},
    sort={"$vectorize": "Text to vectorize"},
    include_similarity=True
)
# Iterate over the found documents
for document in cursor:
    print(document)
import { DataAPIClient } from '@datastax/astra-db-ts';
// Get an existing collection
const client = new DataAPIClient('ASTRA_DB_APPLICATION_TOKEN');
const database = client.db('ASTRA_DB_API_ENDPOINT');
const collection = database.collection('COLLECTION_NAME');
// Find documents
(async function () {
const cursor = collection.find(
{},
{
sort: { $vectorize: "Text to vectorize" },
includeSimilarity: true
},
);
// Iterate over the found documents
for await (const document of cursor) {
console.log(document);
}
})();
package com.datastax.astra.client.collection;
import com.datastax.astra.client.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.client.model.Filter;
import com.datastax.astra.client.model.Filters;
import com.datastax.astra.client.model.FindOptions;
import com.datastax.astra.client.model.FindIterable;
public class Find {
public static void main(String[] args) {
// Get an existing collection
Collection<Document> collection = new DataAPIClient("ASTRA_DB_APPLICATION_TOKEN")
.getDatabase("ASTRA_DB_API_ENDPOINT")
.getCollection("COLLECTION_NAME");
// Find documents
FindOptions options = new FindOptions()
.sort("Text to vectorize")
.includeSimilarity();
FindIterable<Document> cursor = collection.find(options);
// Iterate over the found documents
for (Document document : cursor) {
System.out.println(document);
}
}
}
curl -sS -L -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_KEYSPACE/ASTRA_DB_COLLECTION" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"find": {
"sort": { "$vectorize": "Text to vectorize" },
"options": { "includeSimilarity": true }
}
}'
Include the sort vector with the result
If you use a vector search to find documents, you can also include the sort vector in the result. This can be useful if you do a vector search with $vectorize, since you don’t know the sort vector in advance.
- Python
- TypeScript
- Java
- curl
from astrapy import DataAPIClient
# Get an existing collection
client = DataAPIClient("ASTRA_DB_APPLICATION_TOKEN")
database = client.get_database("ASTRA_DB_API_ENDPOINT")
collection = database.get_collection("COLLECTION_NAME")
# Find documents
cursor = collection.find(
{},
sort={"$vectorize": "Text to vectorize"},
include_sort_vector=True
)
# Get the sort vector from the result
vector = cursor.get_sort_vector()
import { DataAPIClient } from '@datastax/astra-db-ts';
// Get an existing collection
const client = new DataAPIClient('ASTRA_DB_APPLICATION_TOKEN');
const database = client.db('ASTRA_DB_API_ENDPOINT');
const collection = database.collection('COLLECTION_NAME');
// Find documents
(async function () {
const cursor = collection.find(
{},
{
sort: { $vectorize: "Text to vectorize" },
includeSortVector: true
},
);
// Get the sort vector from the result
const vector = await cursor.getSortVector();
console.log(vector);
})();
package com.datastax.astra.client.collection;
import com.datastax.astra.client.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.client.model.Filter;
import com.datastax.astra.client.model.Filters;
import com.datastax.astra.client.model.FindOptions;
import com.datastax.astra.client.model.FindIterable;
public class Find {
public static void main(String[] args) {
// Get an existing collection
Collection<Document> collection = new DataAPIClient("ASTRA_DB_APPLICATION_TOKEN")
.getDatabase("ASTRA_DB_API_ENDPOINT")
.getCollection("COLLECTION_NAME");
// Find documents
FindOptions options = new FindOptions()
.sort("Text to vectorize")
.includeSortVector();
FindIterable<Document> cursor = collection.find(options);
// Get the sort vector from the result
System.out.println(cursor.getSortVector());
}
}
curl -sS -L -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_KEYSPACE/ASTRA_DB_COLLECTION" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"find": {
"sort": { "$vectorize": "Text to vectorize" },
"options": { "includeSortVector": true }
}
}'
Returns an object with a status.sortVector property:
{
"data": {
"documents": [
{
"_id":"cdb92916-1f6b-413b-b929-161f6b313b96",
"author":"Rachel Jacobson",
"numberOfPages":223
},{
"_id":"582e9ed9-913c-40a6-ae9e-d9913ce0a6b0",
"author":"Jon Hill",
"numberOfPages":716
}
],
"nextPageState":null
},
"status":{
"sortVector": [0.28, 0.36, 0.45, ...]
}
}
Include only specific fields in the response
To specify which fields to include or exclude in the returned document, use a projection.
Certain fields, like $vector and $vectorize, are excluded by default and will only be returned if you specify that they should be included. Certain fields, like _id, are included by default.
- Python
- TypeScript
- Java
- curl
from astrapy import DataAPIClient
# Get an existing collection
client = DataAPIClient("ASTRA_DB_APPLICATION_TOKEN")
database = client.get_database("ASTRA_DB_API_ENDPOINT")
collection = database.get_collection("COLLECTION_NAME")
# Find documents
cursor = collection.find(
    {"metadata.language": "English"},
    projection={"isCheckedOut": True, "title": True}
)
# Iterate over the found documents
for document in cursor:
    print(document)
import { DataAPIClient } from '@datastax/astra-db-ts';
// Get an existing collection
const client = new DataAPIClient('ASTRA_DB_APPLICATION_TOKEN');
const database = client.db('ASTRA_DB_API_ENDPOINT');
const collection = database.collection('COLLECTION_NAME');
(async function () {
// Find documents
const cursor = collection.find(
{ "metadata.language": "English" },
{ projection: { isCheckedOut: true, title: true} },
);
// Iterate over the found documents
for await (const document of cursor) {
console.log(document);
}
})();
package com.datastax.astra.client.collection;
import com.datastax.astra.client.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.client.model.Filter;
import com.datastax.astra.client.model.Filters;
import com.datastax.astra.client.model.FindOptions;
import com.datastax.astra.client.model.FindIterable;
import com.datastax.astra.client.model.Projections;
public class Find {
public static void main(String[] args) {
// Get an existing collection
Collection<Document> collection = new DataAPIClient("ASTRA_DB_APPLICATION_TOKEN")
.getDatabase("ASTRA_DB_API_ENDPOINT")
.getCollection("COLLECTION_NAME");
// Find documents
Filter filter = Filters.eq("metadata.language", "English");
FindOptions options = new FindOptions()
.projection(Projections.include("isCheckedOut", "title"));
FindIterable<Document> cursor = collection.find(filter, options);
// Iterate over the found documents
for (Document document : cursor) {
System.out.println(document);
}
}
}
curl -sS -L -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_KEYSPACE/ASTRA_DB_COLLECTION" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"find": {
"filter": {"metadata.language": "English"},
"projection": {"isCheckedOut": true, "title": true}
}
}'
Exclude specific fields from the response
To specify which fields to include or exclude in the returned document, use a projection.
Certain fields, like $vector and $vectorize, are excluded by default and are returned only if you explicitly include them. Certain fields, like _id, are included by default.
-
Python
-
TypeScript
-
Java
-
curl
from astrapy import DataAPIClient
# Get an existing collection
client = DataAPIClient("ASTRA_DB_APPLICATION_TOKEN")
database = client.get_database("ASTRA_DB_API_ENDPOINT")
collection = database.get_collection("COLLECTION_NAME")
# Find documents
cursor = collection.find(
{"metadata.language": "English"},
projection={"isCheckedOut": False, "title": False}
)
# Iterate over the found documents
for document in cursor:
print(document)
import { DataAPIClient } from '@datastax/astra-db-ts';
// Get an existing collection
const client = new DataAPIClient('ASTRA_DB_APPLICATION_TOKEN');
const database = client.db('ASTRA_DB_API_ENDPOINT');
const collection = database.collection('COLLECTION_NAME');
(async function () {
// Find documents
const cursor = collection.find(
{ "metadata.language": "English" },
{ projection: { isCheckedOut: false, title: false} },
);
// Iterate over the found documents
for await (const document of cursor) {
console.log(document);
}
})();
package com.datastax.astra.client.collection;
import com.datastax.astra.client.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.client.model.Filter;
import com.datastax.astra.client.model.Filters;
import com.datastax.astra.client.model.FindOptions;
import com.datastax.astra.client.model.FindIterable;
import com.datastax.astra.client.model.Projections;
public class Find {
public static void main(String[] args) {
// Get an existing collection
Collection<Document> collection = new DataAPIClient("ASTRA_DB_APPLICATION_TOKEN")
.getDatabase("ASTRA_DB_API_ENDPOINT")
.getCollection("COLLECTION_NAME");
// Find documents
Filter filter = Filters.eq("metadata.language", "English");
FindOptions options = new FindOptions()
.projection(Projections.exclude("isCheckedOut", "title"));
FindIterable<Document> cursor = collection.find(filter, options);
// Iterate over the found documents
for (Document document : cursor) {
System.out.println(document);
}
}
}
curl -sS -L -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_KEYSPACE/ASTRA_DB_COLLECTION" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"find": {
"filter": {"metadata.language": "English"},
"projection": {"isCheckedOut": false, "title": false}
}
}'
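The inclusion and exclusion rules described in the two sections above can be modeled locally. The following pure-Python sketch is an illustration under stated assumptions, not the Data API's implementation: apply_projection is a hypothetical helper, it handles only top-level fields, it rejects projections that mix true and false values, and it does not model the special handling of $vector and $vectorize. The sample document is invented.

```python
from typing import Any, Dict


def apply_projection(document: Dict[str, Any], projection: Dict[str, bool]) -> Dict[str, Any]:
    """Model a top-level include-only or exclude-only projection on one document."""
    included = {field for field, keep in projection.items() if keep}
    excluded = {field for field, keep in projection.items() if not keep}
    if included and excluded:
        raise ValueError("this sketch does not model mixed projections")
    if included:
        # Inclusion: keep the listed fields plus _id, which is included by default.
        included.add("_id")
        return {key: value for key, value in document.items() if key in included}
    # Exclusion (or empty projection): keep everything except the listed fields.
    return {key: value for key, value in document.items() if key not in excluded}


# Invented sample document for illustration.
book = {"_id": "book-1", "title": "Example Title", "isCheckedOut": False, "numberOfPages": 223}

print(apply_projection(book, {"isCheckedOut": True, "title": True}))
print(apply_projection(book, {"isCheckedOut": False, "title": False}))
```

The first call keeps _id, title, and isCheckedOut. The second keeps every field except the two that were excluded.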
Limit the number of documents returned
Specify a limit to only fetch up to a certain number of documents.
-
Python
-
TypeScript
-
Java
-
curl
from astrapy import DataAPIClient
# Get an existing collection
client = DataAPIClient("ASTRA_DB_APPLICATION_TOKEN")
database = client.get_database("ASTRA_DB_API_ENDPOINT")
collection = database.get_collection("COLLECTION_NAME")
# Find documents
cursor = collection.find(
{"metadata.language": "English"},
limit=10
)
# Iterate over the found documents
for document in cursor:
print(document)
import { DataAPIClient } from '@datastax/astra-db-ts';
// Get an existing collection
const client = new DataAPIClient('ASTRA_DB_APPLICATION_TOKEN');
const database = client.db('ASTRA_DB_API_ENDPOINT');
const collection = database.collection('COLLECTION_NAME');
(async function () {
// Find documents
const cursor = collection.find(
{ "metadata.language": "English" },
{ limit: 10 },
);
// Iterate over the found documents
for await (const document of cursor) {
console.log(document);
}
})();
package com.datastax.astra.client.collection;
import com.datastax.astra.client.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.client.model.Filter;
import com.datastax.astra.client.model.Filters;
import com.datastax.astra.client.model.FindOptions;
import com.datastax.astra.client.model.FindIterable;
public class Find {
public static void main(String[] args) {
// Get an existing collection
Collection<Document> collection = new DataAPIClient("ASTRA_DB_APPLICATION_TOKEN")
.getDatabase("ASTRA_DB_API_ENDPOINT")
.getCollection("COLLECTION_NAME");
// Find documents
Filter filter = Filters.eq("metadata.language", "English");
FindOptions options = new FindOptions()
.limit(10);
FindIterable<Document> cursor = collection.find(filter, options);
// Iterate over the found documents
for (Document document : cursor) {
System.out.println(document);
}
}
}
curl -sS -L -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_KEYSPACE/ASTRA_DB_COLLECTION" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"find": {
"filter": {"metadata.language": "English"},
"options": {
"limit": 10
}
}
}'
Skip documents
You can specify a number of documents to skip (bypass) before returning documents.
You can only do this if your find explicitly includes an ascending or descending sort criterion.
You cannot do this in conjunction with vector search.
-
Python
-
TypeScript
-
Java
-
curl
from astrapy import DataAPIClient
from astrapy.constants import SortDocuments
# Get an existing collection
client = DataAPIClient("ASTRA_DB_APPLICATION_TOKEN")
database = client.get_database("ASTRA_DB_API_ENDPOINT")
collection = database.get_collection("COLLECTION_NAME")
# Find documents
cursor = collection.find(
{"metadata.language": "English"},
sort={
"rating": SortDocuments.ASCENDING,
"title": SortDocuments.DESCENDING,
},
skip=5
)
# Iterate over the found documents
for document in cursor:
print(document)
import { DataAPIClient } from '@datastax/astra-db-ts';
// Get an existing collection
const client = new DataAPIClient('ASTRA_DB_APPLICATION_TOKEN');
const database = client.db('ASTRA_DB_API_ENDPOINT');
const collection = database.collection('COLLECTION_NAME');
(async function () {
// Find documents
const cursor = collection.find(
{ "metadata.language": "English" },
{
sort: {
rating: 1, // ascending
title: -1 // descending
},
skip: 5
}
);
// Iterate over the found documents
for await (const document of cursor) {
console.log(document);
}
})();
package com.datastax.astra.client.collection;
import com.datastax.astra.client.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.client.model.Filter;
import com.datastax.astra.client.model.Filters;
import com.datastax.astra.client.model.FindOptions;
import com.datastax.astra.client.model.FindIterable;
import com.datastax.astra.client.model.Sorts;
public class Find {
public static void main(String[] args) {
// Get an existing collection
Collection<Document> collection = new DataAPIClient("ASTRA_DB_APPLICATION_TOKEN")
.getDatabase("ASTRA_DB_API_ENDPOINT")
.getCollection("COLLECTION_NAME");
// Find documents
Filter filter = Filters.eq("metadata.language", "English");
FindOptions options = new FindOptions()
.sort(Sorts.ascending("rating"))
.sort(Sorts.descending("title"))
.skip(5);
FindIterable<Document> cursor = collection.find(filter, options);
// Iterate over the found documents
for (Document document : cursor) {
System.out.println(document);
}
}
}
curl -sS -L -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_KEYSPACE/ASTRA_DB_COLLECTION" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"find": {
"filter": { "metadata.language": "English" },
"sort": {
"rating": 1,
"title": -1
},
"options": {
"skip": 5
}
}
}'
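To make the interaction of sort, skip, and limit concrete, the following Python sketch applies the same steps to an in-memory list. It is only a local model: it assumes skip is applied after sorting and before limit, sort_skip_limit is a hypothetical helper, and the sample documents are invented.

```python
from typing import Any, Dict, List, Optional


def sort_skip_limit(
    documents: List[Dict[str, Any]],
    sort: Dict[str, int],
    skip: int = 0,
    limit: Optional[int] = None,
) -> List[Dict[str, Any]]:
    """Sort by each field (1 = ascending, -1 = descending), then skip, then limit."""
    ordered = list(documents)
    # Apply sort keys from last to first. Python's sort is stable,
    # so the first key in the sort clause ends up dominant.
    for field, direction in reversed(list(sort.items())):
        ordered.sort(key=lambda doc: doc[field], reverse=(direction == -1))
    end = None if limit is None else skip + limit
    return ordered[skip:end]


# Invented sample data for illustration.
books = [
    {"title": "A", "rating": 3},
    {"title": "B", "rating": 1},
    {"title": "C", "rating": 3},
    {"title": "D", "rating": 2},
]

page = sort_skip_limit(books, {"rating": 1, "title": -1}, skip=1, limit=2)
print([book["title"] for book in page])
```

Without an explicit sort, the order of results is not defined, so skipping a fixed number of documents would not select a predictable subset. This is consistent with the requirement above that skip be combined with a sort clause.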
Use filter, sort, and projection together
-
Python
-
TypeScript
-
Java
-
curl
from astrapy import DataAPIClient
from astrapy.constants import SortDocuments
# Get an existing collection
client = DataAPIClient("ASTRA_DB_APPLICATION_TOKEN")
database = client.get_database("ASTRA_DB_API_ENDPOINT")
collection = database.get_collection("COLLECTION_NAME")
# Find documents
cursor = collection.find(
{
"$and": [
{"isCheckedOut": False},
{"numberOfPages": {"$lt": 300}},
]
},
sort={
"rating": SortDocuments.ASCENDING,
"title": SortDocuments.DESCENDING,
},
projection={"isCheckedOut": True, "title": True}
)
# Iterate over the found documents
for document in cursor:
print(document)
import { DataAPIClient } from '@datastax/astra-db-ts';
// Get an existing collection
const client = new DataAPIClient('ASTRA_DB_APPLICATION_TOKEN');
const database = client.db('ASTRA_DB_API_ENDPOINT');
const collection = database.collection('COLLECTION_NAME');
(async function () {
// Find documents
const cursor = collection.find(
{
$and: [{ isCheckedOut: false }, { numberOfPages: { $lt: 300 } }],
},
{
sort: {
rating: 1, // ascending
title: -1, // descending
},
projection: {
isCheckedOut: true,
title: true,
},
},
);
// Iterate over the found documents
for await (const document of cursor) {
console.log(document);
}
})();
package com.datastax.astra.client.collection;
import com.datastax.astra.client.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.client.model.Filter;
import com.datastax.astra.client.model.Filters;
import com.datastax.astra.client.model.FindOptions;
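import com.datastax.astra.client.model.FindIterable;
import com.datastax.astra.client.model.Projections;
import com.datastax.astra.client.model.Sorts;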
public class Find {
public static void main(String[] args) {
// Get an existing collection
Collection<Document> collection = new DataAPIClient("ASTRA_DB_APPLICATION_TOKEN")
.getDatabase("ASTRA_DB_API_ENDPOINT")
.getCollection("COLLECTION_NAME");
// Find documents
Filter filter = Filters.and(
Filters.eq("isCheckedOut", false),
Filters.lt("numberOfPages", 300));
FindOptions options = new FindOptions()
.sort(Sorts.ascending("rating"))
.sort(Sorts.descending("title"))
.projection(Projections.include("isCheckedOut", "title"));
FindIterable<Document> cursor = collection.find(filter, options);
// Iterate over the found documents
for (Document document : cursor) {
System.out.println(document);
}
}
}
curl -sS -L -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_KEYSPACE/ASTRA_DB_COLLECTION" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"find": {
"filter": {"$and": [
{"isCheckedOut": false},
{"numberOfPages": {"$lt": 300}}
]},
"sort": {
"rating": 1,
"title": -1
},
"projection": {"isCheckedOut": true, "title": true}
}
}'
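If you call the HTTP endpoint from code rather than from curl, you can assemble the request body programmatically. The following Python sketch mirrors the JSON shape used in the curl examples. The helper build_find_payload is hypothetical, and it simply omits any clause that is not provided.

```python
import json
from typing import Any, Dict, Optional


def build_find_payload(
    filter: Optional[Dict[str, Any]] = None,
    sort: Optional[Dict[str, Any]] = None,
    projection: Optional[Dict[str, bool]] = None,
    options: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build the JSON body for a find command, leaving out unset clauses."""
    find: Dict[str, Any] = {}
    if filter is not None:
        find["filter"] = filter
    if sort is not None:
        find["sort"] = sort
    if projection is not None:
        find["projection"] = projection
    if options is not None:
        find["options"] = options
    return {"find": find}


payload = build_find_payload(
    filter={"$and": [{"isCheckedOut": False}, {"numberOfPages": {"$lt": 300}}]},
    sort={"rating": 1, "title": -1},
    projection={"isCheckedOut": True, "title": True},
)

# The serialized string is what would be sent as the --data argument.
print(json.dumps(payload, indent=2))
```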
Iterate over found documents
-
Python
-
TypeScript
-
Java
-
curl
Use a for loop to iterate over the cursor. The client will periodically fetch more documents until no matching documents remain.
from astrapy import DataAPIClient
# Get an existing collection
client = DataAPIClient("ASTRA_DB_APPLICATION_TOKEN")
database = client.get_database("ASTRA_DB_API_ENDPOINT")
collection = database.get_collection("COLLECTION_NAME")
# Find documents
cursor = collection.find({
"$and": [
{"isCheckedOut": False},
{"numberOfPages": {"$lt": 300}},
]
})
# Iterate over the found documents
for document in cursor:
print(document)
The cursor returned by find() is compatible with for loops and next(). The client will periodically fetch more documents until no matching documents remain.
import { DataAPIClient } from '@datastax/astra-db-ts';
// Get an existing collection
const client = new DataAPIClient('ASTRA_DB_APPLICATION_TOKEN');
const database = client.db('ASTRA_DB_API_ENDPOINT');
const collection = database.collection('COLLECTION_NAME');
(async function () {
// Find documents
const cursor = collection.find({
$and: [{ isCheckedOut: false }, { numberOfPages: { $lt: 300 } }],
});
// Get the next item in the cursor
console.log(await cursor.next());
// Iterate over the found documents
for await (const document of cursor) {
console.log(document);
}
})();
The cursor returned by find() is an Iterable and is compatible with for loops. The client will periodically fetch more documents until no matching documents remain.
package com.datastax.astra.client.collection;
import com.datastax.astra.client.Collection;
import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.model.Document;
import com.datastax.astra.client.model.Filter;
import com.datastax.astra.client.model.Filters;
import com.datastax.astra.client.model.FindOptions;
import com.datastax.astra.client.model.FindIterable;
public class Find {
public static void main(String[] args) {
// Get an existing collection
Collection<Document> collection = new DataAPIClient("ASTRA_DB_APPLICATION_TOKEN")
.getDatabase("ASTRA_DB_API_ENDPOINT")
.getCollection("COLLECTION_NAME");
// Find documents
Filter filter = Filters.and(
Filters.eq("isCheckedOut", false),
Filters.lt("numberOfPages", 300));
FindIterable<Document> cursor = collection.find(filter);
// Iterate over the found documents
for (Document document : cursor) {
System.out.println(document);
}
}
}
If the response includes a non-null nextPageState, then the specified sort or filter operation supports pagination, and more matching documents exist beyond the ones already returned.
To fetch additional documents, you must send another request that includes the nextPageState value from your previous response. For example:
-
Send an initial request:
curl -sS -L -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_KEYSPACE/ASTRA_DB_COLLECTION" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"find": {
"filter": {"isCheckedOut": false}
}
}'
-
Get the data.nextPageState value from the response:
{
"data": {
"documents": [
{
"_id": { "$uuid": "018e65c9-df45-7913-89f8-175f28bd7f74" }
},
{
"_id": { "$uuid": "018e65c9-e33d-749b-9386-e848739582f0" }
}
],
"nextPageState": "NEXT_PAGE_STATE"
}
}
-
Use the data.nextPageState value from the previous response to request the next page of results:
curl -sS -L -X POST "ASTRA_DB_API_ENDPOINT/api/json/v1/ASTRA_DB_KEYSPACE/ASTRA_DB_COLLECTION" \
--header "Token: ASTRA_DB_APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"find": {
"filter": {"isCheckedOut": false},
"options": {
"pageState": "NEXT_PAGE_STATE_FROM_PRIOR_RESPONSE"
}
}
}'
-
Once nextPageState is null, you have fetched all matching documents:
{
"data": {
"documents": [
{
"_id": { "$uuid": "018e65c9-df45-7913-89f8-175f28bd7f74" }
},
{
"_id": { "$uuid": "018e65c9-e33d-749b-9386-e848739582f0" }
}
],
"nextPageState": null
}
}
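The pagination steps above can be sketched as a loop. This Python example is a minimal illustration, not an official client: iterate_all_pages and make_fake_sender are hypothetical names, and the sender is an in-memory stand-in so that the loop can run without a database. In real use, the send callable would POST the payload to the endpoint shown in the curl examples and return the parsed JSON response.

```python
from typing import Any, Callable, Dict, Iterator, List

Sender = Callable[[Dict[str, Any]], Dict[str, Any]]


def iterate_all_pages(send: Sender, filter: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield every matching document, following nextPageState until it is null."""
    page_state = None
    while True:
        find: Dict[str, Any] = {"filter": filter}
        if page_state is not None:
            # Pass the previous response's nextPageState as options.pageState.
            find["options"] = {"pageState": page_state}
        data = send({"find": find})["data"]
        yield from data["documents"]
        page_state = data.get("nextPageState")
        if page_state is None:
            break


def make_fake_sender(pages: List[List[Dict[str, Any]]]) -> Sender:
    """Hypothetical in-memory stand-in for the HTTP call, used only to exercise the loop."""
    def send(payload: Dict[str, Any]) -> Dict[str, Any]:
        state = payload["find"].get("options", {}).get("pageState")
        index = 0 if state is None else int(state)
        next_state = str(index + 1) if index + 1 < len(pages) else None
        return {"data": {"documents": pages[index], "nextPageState": next_state}}
    return send


fake_send = make_fake_sender([[{"_id": "a"}, {"_id": "b"}], [{"_id": "c"}]])
all_ids = [doc["_id"] for doc in iterate_all_pages(fake_send, {"isCheckedOut": False})]
print(all_ids)
```

The page state is treated as an opaque value: the loop only copies it from one response into the next request and never inspects it. Only the fake sender interprets it, as a page index.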
Client reference
-
Python
-
TypeScript
-
Java
-
curl
For more information, see the client reference.
For more information, see the client reference.
For more information, see the client reference.
Client reference documentation is not applicable for HTTP.