Find documents

Finds documents in a collection using filter and sort clauses, including vector search.

To find documents with hybrid search, see Find and rerank documents.

If you add or remove documents after starting the operation, the result might not reflect real-time changes in the data.

Ready to write code? See the examples for this method to get started. If you are new to the Data API, check out the quickstart.

Result

Python
TypeScript
Java
curl

Returns a cursor (CollectionFindCursor) for iterating over documents that match the specified filter and sort clauses.

The fields included in the returned documents depend on the subset of fields that were requested in the projection.

If requested and applicable, each document will also include a $similarity key with a numeric similarity score that represents the closeness of the sort vector and the document’s vector.

If requested when executing a vector search, the result will also include the sort vector.
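As a rough illustration of how the similarity score surfaces in returned documents, the sketch below wraps a vector search in a small helper. It relies only on the `find` parameters described on this page (`sort`, `limit`, `include_similarity`, `projection`). The helper name `top_matches` and the `title` field are assumptions made for illustration.

```python
from typing import Any, Dict, List, Sequence


def top_matches(
    collection: Any, query_vector: Sequence[float], k: int = 5
) -> List[Dict[str, Any]]:
    """Return up to k nearest documents with their similarity scores.

    Hypothetical helper: `collection` is expected to expose the `find`
    method described on this page.
    """
    cursor = collection.find(
        {},
        sort={"$vector": list(query_vector)},
        limit=k,
        include_similarity=True,
        projection={"title": True},  # "title" is an assumed field name
    )
    # Each returned document carries a "$similarity" key when requested.
    return [
        {"title": doc.get("title"), "similarity": doc["$similarity"]}
        for doc in cursor
    ]
```

Passing an existing collection object and a query vector yields a plain list that is easy to log or rank further.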

Cursors are lazy iterators, meant to be consumed with for loops (or equivalent constructs). You must iterate over the cursor to fetch matching documents.

If you need a list of all results, you can call the to_list method on the cursor. However, the time and memory required for this operation depend on the number of results.

For more information about the operations available on cursors, see FindCursor.
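Because the cursor is consumed lazily, you can stop early instead of materializing every result. The following is a minimal sketch, assuming the cursor behaves like any other Python iterator. `first_n` is a hypothetical helper and not part of the client library.

```python
from itertools import islice
from typing import Any, Dict, Iterable, List


def first_n(cursor: Iterable[Dict[str, Any]], n: int) -> List[Dict[str, Any]]:
    """Consume at most n documents from a cursor-like iterable.

    Hypothetical helper: iteration stops after n items, so later
    documents are never pulled from the cursor.
    """
    return list(islice(cursor, n))
```

This keeps memory use bounded by `n`, unlike collecting the full result set into a list.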

Returns a cursor (CollectionFindCursor<Schema, Schema>) for iterating over documents that match the specified filter and sort clauses.

The fields included in the returned documents depend on the subset of fields that were requested in the projection.

If requested and applicable, each document will also include a $similarity key with a numeric similarity score that represents the closeness of the sort vector and the document’s vector.

If requested when executing a vector search, the result will also include the sort vector.

The cursor is compatible with for await loops. You must iterate over the cursor to fetch matching documents.

The cursor transitions through the following statuses:

- initialized: no documents have been consumed
- running: some but not all of the documents have been consumed
- exhausted: all documents have been consumed

If you need a list of all results, you can call list() on the cursor instead of iterating over the cursor. However, the time and memory required for this operation depend on the number of results.

Returns a cursor (CollectionFindCursor<T, T>) for iterating over documents that match the specified filter and sort clauses.

The fields included in the returned documents depend on the subset of fields that were requested in the projection.

If requested and applicable, each document will also include a $similarity key with a numeric similarity score that represents the closeness of the sort vector and the document’s vector.

If requested when executing a vector search, the result will also include the sort vector.

The cursor is an Iterable and is compatible with for loops. You must iterate over the cursor to fetch matching documents.

If you need a list of all results, you can use .all() to exhaust the cursor. However, the time and memory required for this operation depend on the number of results.

The response includes a data.documents property, which is an array of objects representing documents that match the specified filter and sort clauses.

The fields included in the returned documents depend on the subset of fields that were requested in the projection. If requested and applicable, each document will also include a $similarity key with a numeric similarity score that represents the closeness of the sort vector and the document’s vector.

If the query supports pagination, the response also includes a data.nextPageState property, which indicates the ID of the next page of results, if any. For non-vector searches, the results will be paginated if more than 20 documents match the specified filter and sort clauses.

For vector search, returns a single page of up to 1000 documents (or a lower amount if specified) instead of a cursor.

If requested when executing a vector search, the result also includes a status.sortVector property, which is the sort vector used for the search.

Example response:

{
  "data": {
    "documents":[
      {
        "_id":"85a54382-9227-4075-a543-829227407556",
        "title":"Within Silence of the Past",
        "is_checked_out":false
      },
      {
        "_id":"aa762475-4fc1-4477-b624-754fc1f477c7",
        "title":"Beyond Dreams and Forgotten Worlds",
        "is_checked_out":false
      }
    ],
    "nextPageState":"LQAAAAEBAAAAJGQ2OTk5NzY2LTgyODQtNDc3Mi05OTk3LTY2ODI4NGU3NzJjYQDwf///6wA="
  }
}

Example response if no documents were found:

{
  "data": {
    "documents": [],
    "nextPageState": null
  }
}
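When you call the HTTP API directly, you follow `data.nextPageState` yourself. The sketch below shows one way to do that in Python, passing the previous state back as `options.pageState`. The `send` callable is a hypothetical stand-in for whatever code posts the payload to the collection endpoint and returns the parsed JSON response.

```python
from typing import Any, Callable, Dict, Iterator, Optional


def iter_documents(
    send: Callable[[Dict[str, Any]], Dict[str, Any]],
    filter: Optional[Dict[str, Any]] = None,
) -> Iterator[Dict[str, Any]]:
    """Yield documents page by page until nextPageState is null.

    `send` is an assumed callable: it POSTs the payload and returns
    the decoded response body.
    """
    page_state: Optional[str] = None
    while True:
        # Only send pageState once a previous response has provided one.
        options = {"pageState": page_state} if page_state else {}
        payload = {"find": {"filter": filter or {}, "options": options}}
        data = send(payload).get("data", {})
        yield from data.get("documents", [])
        page_state = data.get("nextPageState")
        if not page_state:
            break
```

Because the function is a generator, a caller that stops iterating early also stops requesting further pages.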

Parameters

Python
TypeScript
Java
curl

Use the find method, which belongs to the astrapy.Collection class.

Method signature

find(
  filter: Dict[str, Any],
  *,
  projection: Dict[str, bool],
  document_type: type,
  skip: int,
  limit: int,
  include_similarity: bool,
  include_sort_vector: bool,
  sort: Dict[str, Any],
  request_timeout_ms: int,
  timeout_ms: int,
) -> CollectionFindCursor

Name	Type	Summary
`filter`	`Dict[str, Any]`	Optional. An object that defines filter criteria using the Data API filter syntax. The method only finds documents that match the filter criteria. You must use `&` to escape any `.` or `&` in field names in the filter clause. You cannot use `&` to escape any other characters. For a list of available filter operators and more examples, see Filter operators for collections. Filters can use only indexed fields. If you apply selective indexing when you create a collection, you cannot reference non-indexed fields in a filter.
`projection`	`Dict[str, bool]`	Optional. Controls which fields are included or excluded in the returned document. You must use `&` to escape any `.` or `&` in field names in the projection clause. You cannot use `&` to escape any other characters. For more information, see Projections for collections. Default: The default projection for the collection. All fields prefixed with `$` are excluded by default and will only be returned if you include them in the projection. `_id` is included by default and will always be returned unless you exclude it from the projection.
`document_type`	`type`	A formal specifier for the type checker. Use this parameter if your code is strictly typed, especially if you specify a projection. For more information, see Typing support. Default: `CollectionFindCursor[DOC, DOC]`. (Maintains the same type for the returned documents as that of the documents in the collection.)
`skip`	`int`	Optional. The number of documents to bypass (skip) before returning documents. The API excludes the first `n` documents matching the query, and the results begin at the `n+1` document. This parameter only applies if you also explicitly specify an ascending or descending sort criterion. This parameter is not valid with vector search.
`limit`	`int`	Optional. Limits the total number of documents returned. Once `limit` is reached, or the cursor is exhausted due to lack of matching documents, nothing more is returned. For vector search, a lower limit reduces the accuracy of the search and the time required for the search.
`include_similarity`	`bool`	Optional. Whether to include a `$similarity` property in the response. The `$similarity` value represents the closeness of the sort vector and the document’s vector. This parameter only applies if you use a vector search. Default: False
`include_sort_vector`	`bool`	Optional. Whether to include the sort vector in the response. This can be useful if you do a vector search with `$vectorize`, since you don’t know the sort vector in advance. This parameter only applies if you use a vector search. Default: False
`sort`	`Dict[str, Any]`	Optional. Sorts documents by one or more fields, or performs a vector search. You must use `&` to escape any `.` or `&` in field names in the sort clause. You cannot use `&` to escape any other characters. For more information, see Sort clauses for collections. Sort clauses can use only indexed fields. If you apply selective indexing when you create a collection, you cannot reference non-indexed fields in sort queries. For vector searches, this parameter can use either `$vector` or `$vectorize`.
`request_timeout_ms`	`int`	Optional. The maximum time, in milliseconds, that the client should wait for each underlying HTTP request. Default: The default value for the collection. This default is 10 seconds unless you specified a different default when you initialized the `Collection` or `DataAPIClient` object. For more information, see Timeout options.
`timeout_ms`	`int`	Optional. An alias for `request_timeout_ms`.
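The `&` escaping rule for field names can be applied mechanically. The following is a small sketch of that rule. `escape_field_name` is a hypothetical helper, and `price.usd` is an assumed field whose name literally contains a dot.

```python
def escape_field_name(name: str) -> str:
    """Escape literal '&' and '.' characters in a field name with '&'.

    Hypothetical helper illustrating the escaping rule described above.
    """
    # Escape '&' first so the '&' added in front of '.' is not escaped again.
    return name.replace("&", "&&").replace(".", "&.")


# A field literally named "price.usd" (not a nested path) used in a filter:
query_filter = {escape_field_name("price.usd"): {"$lt": 20}}
```

The same escaped name can be reused in projection and sort clauses, which follow the same rule.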

Use the find method, which belongs to the Collection class.

Method signature

find(
  filter: CollectionFilter<Schema>,
  options?: {
    sort?: Sort,
    projection?: Projection,
    limit?: number,
    skip?: number,
    includeSimilarity?: boolean,
    includeSortVector?: boolean,
    timeout?: number | TimeoutDescriptor,
  },
): CollectionFindCursor<Schema, Schema>

Name	Type	Summary
`filter`	`CollectionFilter<Schema>`	An object that defines filter criteria using the Data API filter syntax. The method only finds documents that match the filter criteria. You must use `&` to escape any `.` or `&` in field names in the filter clause. You cannot use `&` to escape any other characters. For a list of available filter operators and more examples, see Filter operators for collections. Filters can use only indexed fields. If you apply selective indexing when you create a collection, you cannot reference non-indexed fields in a filter.
`options`	`CollectionFindOptions`	Optional. The options for this operation. See Properties of `options` for more details.

Properties of `options`
Name	Type	Summary
`projection`	`Projection`	Optional. Controls which fields are included or excluded in the returned document. You must use `&` to escape any `.` or `&` in field names in the projection clause. You cannot use `&` to escape any other characters. For more information, see Projections for collections. Default: The default projection for the collection. All fields prefixed with `$` are excluded by default and will only be returned if you include them in the projection. `_id` is included by default and will always be returned unless you exclude it from the projection.
`includeSimilarity`	`boolean`	Optional. Whether to include a `$similarity` property in the response. The `$similarity` value represents the closeness of the sort vector and the document’s vector. This parameter only applies if you use a vector search. Default: `false`
`includeSortVector`	`boolean`	Optional. Whether to include the sort vector in the response. This can be useful if you do a vector search with `$vectorize`, since you don’t know the sort vector in advance. This parameter only applies if you use a vector search. Default: `false`
`sort`	`Sort`	Optional. Sorts documents by one or more fields, or performs a vector search. You must use `&` to escape any `.` or `&` in field names in the sort clause. You cannot use `&` to escape any other characters. For more information, see Sort clauses for collections. Sort clauses can use only indexed fields. If you apply selective indexing when you create a collection, you cannot reference non-indexed fields in sort queries. For vector searches, this parameter can use either `$vector` or `$vectorize`.
`skip`	`number`	Optional. The number of documents to bypass (skip) before returning documents. The API excludes the first `n` documents matching the query, and the results begin at the `n+1` document. This parameter only applies if you also explicitly specify an ascending or descending sort criterion. This parameter is not valid with vector search.
`limit`	`number`	Optional. The maximum number of documents to fetch. For vector search, a lower limit reduces the accuracy of the search and the time required for the search.
`timeout`	`number` \| `TimeoutDescriptor`	Optional. The timeout(s) to apply to this method. You can specify `requestTimeoutMs` and `generalMethodTimeoutMs`. Since this method issues a single HTTP request, these timeouts are equivalent. The `TimeoutDescriptor` object can contain these properties: `requestTimeoutMs` (`number`): The maximum time, in milliseconds, that the client should wait for each underlying HTTP request. Default: The default value for the collection. This default is 10 seconds unless you specified a different default when you initialized the `Collection` or `DataAPIClient` object. `generalMethodTimeoutMs` (`number`): The maximum time, in milliseconds, that the whole operation can take. If you specify both properties, the minimum of the two will be used. Default: The default value for the collection. This default is 30 seconds unless you specified a different default when you initialized the `Collection` or `DataAPIClient` object. If you specify a number instead of a `TimeoutDescriptor` object, that number will be applied to both `requestTimeoutMs` and `generalMethodTimeoutMs`.

Use the find method, which belongs to the com.datastax.astra.client.collections.Collection class.

Method signature

CollectionFindCursor<T, T> find(Filter filter, CollectionFindOptions options)

CollectionFindCursor<T, T> find(Filter filter)

CollectionFindCursor<T, T> find(CollectionFindOptions options)

Name	Type	Summary
`filter`	`Filter`	Optional. An object that defines filter criteria using the Data API filter syntax. The method only finds documents that match the filter criteria. You must use `&` to escape any `.` or `&` in field names in the filter clause. You cannot use `&` to escape any other characters. For a list of available filter operators and more examples, see Filter operators for collections. Filters can use only indexed fields. If you apply selective indexing when you create a collection, you cannot reference non-indexed fields in a filter.
`options`	`CollectionFindOptions`	Optional. The options for this operation. See Methods of the `CollectionFindOptions` class for more details.

Methods of the `CollectionFindOptions` class
Method	Parameters	Summary
`sort()`	`float[] \| String \| Sort \| Map<String, Object>`	Optional. Sorts documents by one or more fields, or performs a vector search. You must use `&` to escape any `.` or `&` in field names in the sort clause. You cannot use `&` to escape any other characters. For more information, see Sort clauses for collections. Sort clauses can use only indexed fields. If you apply selective indexing when you create a collection, you cannot reference non-indexed fields in sort queries. For vector searches, this parameter can use either `$vector` or `$vectorize`.
`projection()`	`Projection`	Optional. Controls which fields are included or excluded in the returned document. You must use `&` to escape any `.` or `&` in field names in the projection clause. You cannot use `&` to escape any other characters. For more information, see Projections for collections. Default: The default projection for the collection. All fields prefixed with `$` are excluded by default and will only be returned if you include them in the projection. `_id` is included by default and will always be returned unless you exclude it from the projection.
`includeSimilarity()`	`boolean`	Optional. Whether to include a `$similarity` property in the response. The `$similarity` value represents the closeness of the sort vector and the document’s vector. This parameter only applies if you use a vector search. Default: `false`
`includeSortVector()`	`boolean`	Optional. Whether to include the sort vector in the response. This can be useful if you do a vector search with `$vectorize`, since you don’t know the sort vector in advance. This parameter only applies if you use a vector search. Default: `false`
`skip()`	`int`	Optional. The number of documents to bypass (skip) before returning documents. The API excludes the first `n` documents matching the query, and the results begin at the `n+1` document. This parameter only applies if you also explicitly specify an ascending or descending sort criterion. This parameter is not valid with vector search.
`limit()`	`int`	Optional. The maximum number of documents to fetch. For vector search, a lower limit reduces the accuracy of the search and the time required for the search.

Use the find command.

Command signature

curl -sS -L -X POST "API_ENDPOINT/api/json/v1/KEYSPACE_NAME/COLLECTION_NAME" \
--header "Token: APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
  "find": {
    "filter": FILTER,
    "sort": SORT,
    "projection": PROJECTION,
    "options": {
      "includeSimilarity": BOOLEAN,
      "includeSortVector": BOOLEAN,
      "skip": INTEGER,
      "limit": INTEGER
    }
  }
}'

Name	Type	Summary
`filter`	`object`	Optional. An object that defines filter criteria using the Data API filter syntax. The method only finds documents that match the filter criteria. You must use `&` to escape any `.` or `&` in field names in the filter clause. You cannot use `&` to escape any other characters. For a list of available filter operators and more examples, see Filter operators for collections. Filters can use only indexed fields. If you apply selective indexing when you create a collection, you cannot reference non-indexed fields in a filter.
`sort`	`object`	Optional. Sorts documents by one or more fields, or performs a vector search. You must use `&` to escape any `.` or `&` in field names in the sort clause. You cannot use `&` to escape any other characters. For more information, see Sort clauses for collections. Sort clauses can use only indexed fields. If you apply selective indexing when you create a collection, you cannot reference non-indexed fields in sort queries. For vector searches, this parameter can use either `$vector` or `$vectorize`.
`projection`	`object`	Optional. Controls which fields are included or excluded in the returned document. You must use `&` to escape any `.` or `&` in field names in the projection clause. You cannot use `&` to escape any other characters. For more information, see Projections for collections. Default: The default projection for the collection. All fields prefixed with `$` are excluded by default and will only be returned if you include them in the projection. `_id` is included by default and will always be returned unless you exclude it from the projection.
`options`	`object`	Optional. The options for this operation. See Properties of `options` for more details.

Properties of `options`
Name	Type	Summary
`includeSimilarity`	`boolean`	Optional. Whether to include a `$similarity` property in the response. The `$similarity` value represents the closeness of the sort vector and the document’s vector. This parameter only applies if you use a vector search. Default: `false`
`includeSortVector`	`boolean`	Optional. Whether to include the sort vector in the response. This can be useful if you do a vector search with `$vectorize`, since you don’t know the sort vector in advance. This parameter only applies if you use a vector search. Default: `false`
`skip`	`integer`	Optional. The number of documents to bypass (skip) before returning documents. The API excludes the first `n` documents matching the query, and the results begin at the `n+1` document. This parameter only applies if you also explicitly specify an ascending or descending sort criterion. This parameter is not valid with vector search.
`limit`	`integer`	Optional. The maximum number of documents to fetch. For vector search, a lower limit reduces the accuracy of the search and the time required for the search.
`pageState`	`string`	Optional. The value of `data.nextPageState` from the previous response. Used to request the next page of results.
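To make the `skip` and `limit` arithmetic concrete, the sketch below builds a request body for a given page of sorted results. `page_payload` is a hypothetical helper, and the ascending sort value `1` on an assumed `title` field is used for illustration only.

```python
import json
from typing import Any, Dict


def page_payload(sort: Dict[str, int], page: int, page_size: int) -> Dict[str, Any]:
    """Build a find payload for a 1-based page of sorted results.

    Hypothetical helper: skip drops the documents on earlier pages,
    and limit caps the page at page_size documents.
    """
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    return {
        "find": {
            "sort": sort,  # skip requires an explicit ascending or descending sort
            "options": {"skip": (page - 1) * page_size, "limit": page_size},
        }
    }


# Page 3 with 10 documents per page skips the first 20 matches.
print(json.dumps(page_payload({"title": 1}, page=3, page_size=10)))
```

The resulting JSON can be passed as the `--data` body of the command shown above.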

Examples

The following examples demonstrate how to find documents in a collection.

Use filters to find documents

You can use a filter to find documents that match specific criteria. For example, you can find documents with an is_checked_out value of false and a number_of_pages value less than 300.

For a list of available filter operators and more examples, see Filter operators for collections.

Filters can use only indexed fields. If you apply selective indexing when you create a collection, you cannot reference non-indexed fields in a filter.

Python
TypeScript
Java
curl

from astrapy import DataAPIClient

# Get an existing collection
client = DataAPIClient()
database = client.get_database(
    "API_ENDPOINT",
    token="APPLICATION_TOKEN",
)
collection = database.get_collection("COLLECTION_NAME")

# Find documents
cursor = collection.find(
    {
        "$and": [
            {"is_checked_out": False},
            {"number_of_pages": {"$lt": 300}},
        ]
    }
)

# Iterate over the found documents
for document in cursor:
    print(document)

import { DataAPIClient } from "@datastax/astra-db-ts";

// Get an existing collection
const client = new DataAPIClient("APPLICATION_TOKEN");
const database = client.db("API_ENDPOINT");
const collection = database.collection("COLLECTION_NAME");

(async function () {
  // Find documents
  const cursor = collection.find({
    $and: [{ is_checked_out: false }, { number_of_pages: { $lt: 300 } }],
  });

  // Iterate over the found documents
  for await (const document of cursor) {
    console.log(document);
  }
})();

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.collections.Collection;
import com.datastax.astra.client.collections.commands.cursor.CollectionFindCursor;
import com.datastax.astra.client.collections.definition.documents.Document;
import com.datastax.astra.client.core.query.Filter;
import com.datastax.astra.client.core.query.Filters;

public class Example {

  public static void main(String[] args) {
    // Get an existing collection
    Collection<Document> collection =
        new DataAPIClient("APPLICATION_TOKEN")
            .getDatabase("API_ENDPOINT")
            .getCollection("COLLECTION_NAME");

    // Find documents
    Filter filter =
        Filters.and(Filters.eq("is_checked_out", false), Filters.lt("number_of_pages", 300));
    CollectionFindCursor<Document, Document> cursor = collection.find(filter);

    // Iterate over the found documents
    for (Document document : cursor) {
      System.out.println(document);
    }
  }
}

curl -sS -L -X POST "API_ENDPOINT/api/json/v1/KEYSPACE_NAME/COLLECTION_NAME" \
  --header "Token: APPLICATION_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "find": {
    "filter": {"$and": [
      {"is_checked_out": false},
      {"number_of_pages": {"$lt": 300}}
    ]}
  }
}'

Use vector search to find documents

To find the documents whose $vector value is most similar to a given vector, use a sort with the vector embeddings that you want to match. For more information, see Find data with vector search.

Vector search is only available for vector-enabled collections. For more information, see Create a collection that can store vector embeddings and $vector in collections.

Python
TypeScript
Java
curl

from astrapy import DataAPIClient

# Get an existing collection
client = DataAPIClient()
database = client.get_database(
    "API_ENDPOINT",
    token="APPLICATION_TOKEN",
)
collection = database.get_collection("COLLECTION_NAME")

# Find documents
cursor = collection.find({}, sort={"$vector": [0.12, 0.52, 0.32]})

# Iterate over the found documents
for document in cursor:
    print(document)

import { DataAPIClient } from "@datastax/astra-db-ts";

// Get an existing collection
const client = new DataAPIClient("APPLICATION_TOKEN");
const database = client.db("API_ENDPOINT");
const collection = database.collection("COLLECTION_NAME");

// Find documents
(async function () {
  const cursor = collection.find({}, { sort: { $vector: [0.12, 0.52, 0.32] } });

  // Iterate over the found documents
  for await (const document of cursor) {
    console.log(document);
  }
})();

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.collections.Collection;
import com.datastax.astra.client.collections.commands.cursor.CollectionFindCursor;
import com.datastax.astra.client.collections.commands.options.CollectionFindOptions;
import com.datastax.astra.client.collections.definition.documents.Document;
import com.datastax.astra.client.core.query.Sort;

public class Example {

  public static void main(String[] args) {
    // Get an existing collection
    Collection<Document> collection =
        new DataAPIClient("APPLICATION_TOKEN")
            .getDatabase("API_ENDPOINT")
            .getCollection("COLLECTION_NAME");

    // Find documents
    CollectionFindOptions options =
        new CollectionFindOptions().sort(Sort.vector(new float[] {0.12f, 0.52f, 0.32f}));
    CollectionFindCursor<Document, Document> cursor = collection.find(options);
    // Iterate over the found documents
    for (Document document : cursor) {
      System.out.println(document);
    }
  }
}

You can provide the search vector as an array of floats, or you can use $binary to provide the search vector as a Base64-encoded string. $binary can be more performant.

Array of floats
$binary

curl -sS -L -X POST "API_ENDPOINT/api/json/v1/KEYSPACE_NAME/COLLECTION_NAME" \
  --header "Token: APPLICATION_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "find": {
    "sort": { "$vector": [0.12, 0.52, 0.32] }
  }
}'

curl -sS -L -X POST "API_ENDPOINT/api/json/v1/KEYSPACE_NAME/COLLECTION_NAME" \
  --header "Token: APPLICATION_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "find": {
    "sort": { "$vector": {"$binary": "PfXCjz8FHrg+o9cK"} }
  }
}'
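
The $binary string in the example above can be reproduced locally. The following Python sketch assumes that the vector is packed as consecutive big-endian 32-bit floats before Base64 encoding, which is the layout that yields the example string for [0.12, 0.52, 0.32]. The helper name encode_vector_as_binary is hypothetical and is not part of any client library.

```python
import base64
import struct


def encode_vector_as_binary(vector):
    """Pack floats as big-endian float32 values and Base64-encode them.

    Hypothetical helper for illustration. The byte layout is an assumption
    inferred from the example string in the curl request.
    """
    packed = struct.pack(f">{len(vector)}f", *vector)
    return base64.b64encode(packed).decode("ascii")


# Build a sort clause equivalent to the array-of-floats form
sort_clause = {"$vector": {"$binary": encode_vector_as_binary([0.12, 0.52, 0.32])}}
print(sort_clause)
```

Under that assumption, the encoded value matches the string used in the $binary request, so both requests describe the same search vector.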

Use vector search and vectorize to find documents

To find the documents whose $vector values are most similar to the vector generated from a given search string, use a sort clause with the search string that you want to vectorize and match. For more information, see Find data with vector search.

Vector search with vectorize is only available for collections that have vectorize enabled. For more information, see Create a collection that can automatically generate vector embeddings and $vectorize in collections.

Python
TypeScript
Java
curl

from astrapy import DataAPIClient

# Get an existing collection
client = DataAPIClient()
database = client.get_database(
    "API_ENDPOINT",
    token="APPLICATION_TOKEN",
)
collection = database.get_collection("COLLECTION_NAME")

# Find documents
cursor = collection.find({}, sort={"$vectorize": "Text to vectorize"})

# Iterate over the found documents
for document in cursor:
    print(document)

import { DataAPIClient } from "@datastax/astra-db-ts";

// Get an existing collection
const client = new DataAPIClient("APPLICATION_TOKEN");
const database = client.db("API_ENDPOINT");
const collection = database.collection("COLLECTION_NAME");

// Find documents
(async function () {
  const cursor = collection.find(
    {},
    { sort: { $vectorize: "Text to vectorize" } },
  );

  // Iterate over the found documents
  for await (const document of cursor) {
    console.log(document);
  }
})();

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.collections.Collection;
import com.datastax.astra.client.collections.commands.cursor.CollectionFindCursor;
import com.datastax.astra.client.collections.commands.options.CollectionFindOptions;
import com.datastax.astra.client.collections.definition.documents.Document;
import com.datastax.astra.client.core.query.Sort;

public class Example {

  public static void main(String[] args) {
    // Get an existing collection
    Collection<Document> collection =
        new DataAPIClient("APPLICATION_TOKEN")
            .getDatabase("API_ENDPOINT")
            .getCollection("COLLECTION_NAME");

    // Find documents
    CollectionFindOptions options =
        new CollectionFindOptions().sort(Sort.vectorize("Text to vectorize"));
    CollectionFindCursor<Document, Document> cursor = collection.find(options);

    // Iterate over the found documents
    for (Document document : cursor) {
      System.out.println(document);
    }
  }
}

curl -sS -L -X POST "API_ENDPOINT/api/json/v1/KEYSPACE_NAME/COLLECTION_NAME" \
  --header "Token: APPLICATION_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "find": {
    "sort": { "$vectorize": "Text to vectorize" }
  }
}'

Use sorting to find documents

You can use a sort clause to sort documents by one or more fields.

For more information, see Sort clauses for collections.

Sort clauses can use only indexed fields. If you apply selective indexing when you create a collection, you cannot reference non-indexed fields in sort queries.

Python
TypeScript
Java
curl

from astrapy import DataAPIClient
from astrapy.constants import SortMode

# Get an existing collection
client = DataAPIClient()
database = client.get_database(
    "API_ENDPOINT",
    token="APPLICATION_TOKEN",
)
collection = database.get_collection("COLLECTION_NAME")

# Find documents
cursor = collection.find(
    {"metadata.language": "English"},
    sort={
        "rating": SortMode.ASCENDING,
        "title": SortMode.DESCENDING,
    },
)

# Iterate over the found documents
for document in cursor:
    print(document)

import { DataAPIClient } from "@datastax/astra-db-ts";

// Get an existing collection
const client = new DataAPIClient("APPLICATION_TOKEN");
const database = client.db("API_ENDPOINT");
const collection = database.collection("COLLECTION_NAME");

// Find documents
(async function () {
  const cursor = collection.find(
    { "metadata.language": "English" },
    {
      sort: {
        rating: 1, // ascending
        title: -1, // descending
      },
    },
  );

  // Iterate over the found documents
  for await (const document of cursor) {
    console.log(document);
  }
})();

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.collections.Collection;
import com.datastax.astra.client.collections.commands.cursor.CollectionFindCursor;
import com.datastax.astra.client.collections.commands.options.CollectionFindOptions;
import com.datastax.astra.client.collections.definition.documents.Document;
import com.datastax.astra.client.core.query.Filter;
import com.datastax.astra.client.core.query.Filters;
import com.datastax.astra.client.core.query.Sort;

public class Example {

  public static void main(String[] args) {
    // Get an existing collection
    Collection<Document> collection =
        new DataAPIClient("APPLICATION_TOKEN")
            .getDatabase("API_ENDPOINT")
            .getCollection("COLLECTION_NAME");

    // Find documents
    Filter filter = Filters.eq("metadata.language", "English");
    CollectionFindOptions options =
        new CollectionFindOptions().sort(Sort.ascending("rating"), Sort.descending("title"));
    CollectionFindCursor<Document, Document> cursor = collection.find(filter, options);

    // Iterate over the found documents
    for (Document document : cursor) {
      System.out.println(document);
    }
  }
}

curl -sS -L -X POST "API_ENDPOINT/api/json/v1/KEYSPACE_NAME/COLLECTION_NAME" \
  --header "Token: APPLICATION_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "find": {
    "filter": { "metadata.language": "English" },
    "sort": {
      "rating": 1,
      "title": -1
    }
  }
}'
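
As a rough local illustration of what a two-field sort clause expresses, the following Python sketch orders a few in-memory documents by rating (ascending) and then by title (descending). It assumes that the first field in the sort clause is the primary key and that later fields only break ties. It does not call the Data API, and the sample documents are invented for the illustration.

```python
# Hypothetical sample documents, used only to illustrate the ordering
documents = [
    {"title": "Alpha", "rating": 4},
    {"title": "Beta", "rating": 4},
    {"title": "Gamma", "rating": 2},
]

# Python's sort is stable, so sort by the tie-breaking key first
# (title, descending) and then by the primary key (rating, ascending).
by_title_descending = sorted(documents, key=lambda doc: doc["title"], reverse=True)
ordered = sorted(by_title_descending, key=lambda doc: doc["rating"])

for document in ordered:
    print(document["rating"], document["title"])
```

Here the two documents with the same rating appear in reverse alphabetical order by title, after the document with the lower rating.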

Use an empty filter to find all documents

To find all documents, use an empty filter.

Avoid this approach if the collection contains a large number of documents.

Python
TypeScript
Java
curl

from astrapy import DataAPIClient

# Get an existing collection
client = DataAPIClient()
database = client.get_database(
    "API_ENDPOINT",
    token="APPLICATION_TOKEN",
)
collection = database.get_collection("COLLECTION_NAME")

# Find documents
cursor = collection.find({})

# Iterate over the found documents
for document in cursor:
    print(document)

import { DataAPIClient } from "@datastax/astra-db-ts";

// Get an existing collection
const client = new DataAPIClient("APPLICATION_TOKEN");
const database = client.db("API_ENDPOINT");
const collection = database.collection("COLLECTION_NAME");

(async function () {
  // Find documents
  const cursor = collection.find({});

  // Iterate over the found documents
  for await (const document of cursor) {
    console.log(document);
  }
})();

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.collections.Collection;
import com.datastax.astra.client.collections.commands.cursor.CollectionFindCursor;
import com.datastax.astra.client.collections.definition.documents.Document;
import com.datastax.astra.client.core.query.Filter;

public class Example {

  public static void main(String[] args) {
    // Get an existing collection
    Collection<Document> collection =
        new DataAPIClient("APPLICATION_TOKEN")
            .getDatabase("API_ENDPOINT")
            .getCollection("COLLECTION_NAME");

    // Find documents
    CollectionFindCursor<Document, Document> cursor = collection.find((Filter) null);

    // Iterate over the found documents
    for (Document document : cursor) {
      System.out.println(document);
    }
  }
}

curl -sS -L -X POST "API_ENDPOINT/api/json/v1/KEYSPACE_NAME/COLLECTION_NAME" \
  --header "Token: APPLICATION_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "find": {
    "filter": {}
  }
}'

Include the similarity score with the result

If you use a vector search to find documents, you can also include a $similarity property for each document in the result. The $similarity value represents the closeness of the sort vector and the document’s vector.

Python
TypeScript
Java
curl

from astrapy import DataAPIClient

# Get an existing collection
client = DataAPIClient()
database = client.get_database(
    "API_ENDPOINT",
    token="APPLICATION_TOKEN",
)
collection = database.get_collection("COLLECTION_NAME")

# Find documents
cursor = collection.find(
    {}, sort={"$vectorize": "Text to vectorize"}, include_similarity=True
)

# Iterate over the found documents
for document in cursor:
    print(document)

import { DataAPIClient } from "@datastax/astra-db-ts";

// Get an existing collection
const client = new DataAPIClient("APPLICATION_TOKEN");
const database = client.db("API_ENDPOINT");
const collection = database.collection("COLLECTION_NAME");

// Find documents
(async function () {
  const cursor = collection.find(
    {},
    {
      sort: { $vectorize: "Text to vectorize" },
      includeSimilarity: true,
    },
  );

  // Iterate over the found documents
  for await (const document of cursor) {
    console.log(document);
  }
})();

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.collections.Collection;
import com.datastax.astra.client.collections.commands.cursor.CollectionFindCursor;
import com.datastax.astra.client.collections.commands.options.CollectionFindOptions;
import com.datastax.astra.client.collections.definition.documents.Document;
import com.datastax.astra.client.core.query.Sort;

public class Example {

  public static void main(String[] args) {
    // Get an existing collection
    Collection<Document> collection =
        new DataAPIClient("APPLICATION_TOKEN")
            .getDatabase("API_ENDPOINT")
            .getCollection("COLLECTION_NAME");

    // Find documents
    CollectionFindOptions options =
        new CollectionFindOptions()
            .sort(Sort.vectorize("Text to vectorize"))
            .includeSimilarity(true);
    CollectionFindCursor<Document, Document> cursor = collection.find(options);

    // Iterate over the found documents
    for (Document document : cursor) {
      System.out.println(document);
    }
  }
}

curl -sS -L -X POST "API_ENDPOINT/api/json/v1/KEYSPACE_NAME/COLLECTION_NAME" \
  --header "Token: APPLICATION_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "find": {
    "sort": { "$vectorize": "Text to vectorize" },
    "options": { "includeSimilarity": true }
  }
}'
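
Once $similarity is included, each returned document can be post-processed like any other dictionary. The following Python sketch keeps only the documents at or above a score threshold. It assumes that a higher $similarity value means a closer match. The sample documents, the threshold, and the helper name filter_by_similarity are hypothetical.

```python
def filter_by_similarity(documents, min_score):
    """Return (_id, score) pairs for documents scoring at least min_score.

    Hypothetical helper: documents without a $similarity key are skipped.
    """
    return [
        (doc["_id"], doc["$similarity"])
        for doc in documents
        if "$similarity" in doc and doc["$similarity"] >= min_score
    ]


# Invented sample data standing in for documents yielded by a cursor
sample_documents = [
    {"_id": "a", "$similarity": 0.91},
    {"_id": "b", "$similarity": 0.42},
    {"_id": "c"},
]
print(filter_by_similarity(sample_documents, 0.8))
```

Because the helper only iterates over its input, any iterable of documents can be passed in place of the sample list.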

Include the sort vector with the result

If you use a vector search to find documents, you can also include the sort vector in the result. This can be useful if you do a vector search with $vectorize, since you don’t know the sort vector in advance.

Python
TypeScript
Java
curl

from astrapy import DataAPIClient

# Get an existing collection
client = DataAPIClient()
database = client.get_database(
    "API_ENDPOINT",
    token="APPLICATION_TOKEN",
)
collection = database.get_collection("COLLECTION_NAME")

# Find documents
cursor = collection.find(
    {}, sort={"$vectorize": "Text to vectorize"}, include_sort_vector=True
)

# Get the sort vector from the result
vector = cursor.get_sort_vector()

import { DataAPIClient } from "@datastax/astra-db-ts";

// Get an existing collection
const client = new DataAPIClient("APPLICATION_TOKEN");
const database = client.db("API_ENDPOINT");
const collection = database.collection("COLLECTION_NAME");

// Find documents
(async function () {
  const cursor = collection.find(
    {},
    {
      sort: { $vectorize: "Text to vectorize" },
      includeSortVector: true,
    },
  );

  // Get the sort vector from the result
  const vector = await cursor.getSortVector();
  console.log(vector);
})();

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.collections.Collection;
import com.datastax.astra.client.collections.commands.cursor.CollectionFindCursor;
import com.datastax.astra.client.collections.commands.options.CollectionFindOptions;
import com.datastax.astra.client.collections.definition.documents.Document;
import com.datastax.astra.client.core.query.Sort;

public class Example {

  public static void main(String[] args) {
    // Get an existing collection
    Collection<Document> collection =
        new DataAPIClient("APPLICATION_TOKEN")
            .getDatabase("API_ENDPOINT")
            .getCollection("COLLECTION_NAME");

    // Find documents
    CollectionFindOptions options =
        new CollectionFindOptions()
            .sort(Sort.vectorize("Text to vectorize"))
            .includeSortVector(true);
    CollectionFindCursor<Document, Document> cursor = collection.find(options);

    // Get the sort vector from the result
    System.out.println(cursor.getSortVector());
  }
}

curl -sS -L -X POST "API_ENDPOINT/api/json/v1/KEYSPACE_NAME/COLLECTION_NAME" \
  --header "Token: APPLICATION_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "find": {
    "sort": { "$vectorize": "Text to vectorize" },
    "options": { "includeSortVector": true }
  }
}'

Returns an object with a status.sortVector property:

{
  "data": {
    "documents": [
      {
        "_id":"cdb92916-1f6b-413b-b929-161f6b313b96",
        "author":"Rachel Jacobson",
        "number_of_pages":223
      },{
        "_id":"582e9ed9-913c-40a6-ae9e-d9913ce0a6b0",
        "author":"Jon Hill",
        "number_of_pages":716
      }
    ],
    "nextPageState":null
  },
  "status":{
    "sortVector": [0.28, 0.36, 0.45, ...]
  }
}

Include only specific fields in the response

To specify which fields to include or exclude in the returned documents, use a projection.

All fields prefixed with $ are excluded by default and are only returned if you include them in the projection. _id is included by default and is always returned unless you exclude it from the projection.

Python
TypeScript
Java
curl

from astrapy import DataAPIClient

# Get an existing collection
client = DataAPIClient()
database = client.get_database(
    "API_ENDPOINT",
    token="APPLICATION_TOKEN",
)
collection = database.get_collection("COLLECTION_NAME")

# Find documents
cursor = collection.find(
    {"metadata.language": "English"},
    projection={"is_checked_out": True, "title": True},
)

# Iterate over the found documents
for document in cursor:
    print(document)

import { DataAPIClient } from "@datastax/astra-db-ts";

// Get an existing collection
const client = new DataAPIClient("APPLICATION_TOKEN");
const database = client.db("API_ENDPOINT");
const collection = database.collection("COLLECTION_NAME");

(async function () {
  // Find documents
  const cursor = collection.find(
    { "metadata.language": "English" },
    { projection: { is_checked_out: true, title: true } },
  );

  // Iterate over the found documents
  for await (const document of cursor) {
    console.log(document);
  }
})();

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.collections.Collection;
import com.datastax.astra.client.collections.commands.cursor.CollectionFindCursor;
import com.datastax.astra.client.collections.commands.options.CollectionFindOptions;
import com.datastax.astra.client.collections.definition.documents.Document;
import com.datastax.astra.client.core.query.Filter;
import com.datastax.astra.client.core.query.Filters;
import com.datastax.astra.client.core.query.Projection;

public class Example {

  public static void main(String[] args) {
    // Get an existing collection
    Collection<Document> collection =
        new DataAPIClient("APPLICATION_TOKEN")
            .getDatabase("API_ENDPOINT")
            .getCollection("COLLECTION_NAME");

    // Find documents
    Filter filter = Filters.eq("metadata.language", "English");
    CollectionFindOptions options =
        new CollectionFindOptions().projection(Projection.include("is_checked_out", "title"));
    CollectionFindCursor<Document, Document> cursor = collection.find(filter, options);

    // Iterate over the found documents
    for (Document document : cursor) {
      System.out.println(document);
    }
  }
}

curl -sS -L -X POST "API_ENDPOINT/api/json/v1/KEYSPACE_NAME/COLLECTION_NAME" \
  --header "Token: APPLICATION_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "find": {
    "filter": {"metadata.language": "English"},
    "projection": {"is_checked_out": true, "title": true}
  }
}'

Exclude specific fields from the response

To specify which fields to include or exclude in the returned documents, use a projection.

Python
TypeScript
Java
curl

from astrapy import DataAPIClient

# Get an existing collection
client = DataAPIClient()
database = client.get_database(
    "API_ENDPOINT",
    token="APPLICATION_TOKEN",
)
collection = database.get_collection("COLLECTION_NAME")

# Find documents
cursor = collection.find(
    {"metadata.language": "English"},
    projection={"is_checked_out": False, "title": False},
)

# Iterate over the found documents
for document in cursor:
    print(document)

import { DataAPIClient } from "@datastax/astra-db-ts";

// Get an existing collection
const client = new DataAPIClient("APPLICATION_TOKEN");
const database = client.db("API_ENDPOINT");
const collection = database.collection("COLLECTION_NAME");

(async function () {
  // Find documents
  const cursor = collection.find(
    { "metadata.language": "English" },
    { projection: { is_checked_out: false, title: false } },
  );

  // Iterate over the found documents
  for await (const document of cursor) {
    console.log(document);
  }
})();

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.collections.Collection;
import com.datastax.astra.client.collections.commands.cursor.CollectionFindCursor;
import com.datastax.astra.client.collections.commands.options.CollectionFindOptions;
import com.datastax.astra.client.collections.definition.documents.Document;
import com.datastax.astra.client.core.query.Filter;
import com.datastax.astra.client.core.query.Filters;
import com.datastax.astra.client.core.query.Projection;

public class Example {

  public static void main(String[] args) {
    // Get an existing collection
    Collection<Document> collection =
        new DataAPIClient("APPLICATION_TOKEN")
            .getDatabase("API_ENDPOINT")
            .getCollection("COLLECTION_NAME");

    // Find documents
    Filter filter = Filters.eq("metadata.language", "English");
    CollectionFindOptions options =
        new CollectionFindOptions().projection(Projection.exclude("is_checked_out", "title"));
    CollectionFindCursor<Document, Document> cursor = collection.find(filter, options);

    // Iterate over the found documents
    for (Document document : cursor) {
      System.out.println(document);
    }
  }
}

curl -sS -L -X POST "API_ENDPOINT/api/json/v1/KEYSPACE_NAME/COLLECTION_NAME" \
  --header "Token: APPLICATION_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "find": {
    "filter": {"metadata.language": "English"},
    "projection": {"is_checked_out": false, "title": false}
  }
}'

Limit the number of documents returned

Specify a limit to only fetch up to a certain number of documents.

Python
TypeScript
Java
curl

from astrapy import DataAPIClient

# Get an existing collection
client = DataAPIClient()
database = client.get_database(
    "API_ENDPOINT",
    token="APPLICATION_TOKEN",
)
collection = database.get_collection("COLLECTION_NAME")

# Find documents
cursor = collection.find({"metadata.language": "English"}, limit=10)

# Iterate over the found documents
for document in cursor:
    print(document)

import { DataAPIClient } from "@datastax/astra-db-ts";

// Get an existing collection
const client = new DataAPIClient("APPLICATION_TOKEN");
const database = client.db("API_ENDPOINT");
const collection = database.collection("COLLECTION_NAME");

(async function () {
  // Find documents
  const cursor = collection.find(
    { "metadata.language": "English" },
    { limit: 10 },
  );

  // Iterate over the found documents
  for await (const document of cursor) {
    console.log(document);
  }
})();

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.collections.Collection;
import com.datastax.astra.client.collections.commands.cursor.CollectionFindCursor;
import com.datastax.astra.client.collections.commands.options.CollectionFindOptions;
import com.datastax.astra.client.collections.definition.documents.Document;
import com.datastax.astra.client.core.query.Filter;
import com.datastax.astra.client.core.query.Filters;

public class Example {

  public static void main(String[] args) {
    // Get an existing collection
    Collection<Document> collection =
        new DataAPIClient("APPLICATION_TOKEN")
            .getDatabase("API_ENDPOINT")
            .getCollection("COLLECTION_NAME");

    // Find documents
    Filter filter = Filters.eq("metadata.language", "English");
    CollectionFindOptions options = new CollectionFindOptions().limit(10);
    CollectionFindCursor<Document, Document> cursor = collection.find(filter, options);

    // Iterate over the found documents
    for (Document document : cursor) {
      System.out.println(document);
    }
  }
}

curl -sS -L -X POST "API_ENDPOINT/api/json/v1/KEYSPACE_NAME/COLLECTION_NAME" \
  --header "Token: APPLICATION_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "find": {
    "filter": {"metadata.language": "English"},
    "options": {
      "limit": 10
    }
  }
}'

Skip documents

You can specify a number of documents to skip (bypass) before returning documents.

Skipping is only supported if your find explicitly includes an ascending or descending sort criterion. It cannot be used in conjunction with vector search.

Python
TypeScript
Java
curl

from astrapy import DataAPIClient
from astrapy.constants import SortMode

# Get an existing collection
client = DataAPIClient()
database = client.get_database(
    "API_ENDPOINT",
    token="APPLICATION_TOKEN",
)
collection = database.get_collection("COLLECTION_NAME")

# Find documents
cursor = collection.find(
    {"metadata.language": "English"},
    sort={
        "rating": SortMode.ASCENDING,
        "title": SortMode.DESCENDING,
    },
    skip=5,
)

# Iterate over the found documents
for document in cursor:
    print(document)

import { DataAPIClient } from "@datastax/astra-db-ts";

// Get an existing collection
const client = new DataAPIClient("APPLICATION_TOKEN");
const database = client.db("API_ENDPOINT");
const collection = database.collection("COLLECTION_NAME");

// Find documents
(async function () {
  const cursor = collection.find(
    { "metadata.language": "English" },
    {
      sort: {
        rating: 1, // ascending
        title: -1, // descending
      },
      skip: 5,
    },
  );

  // Iterate over the found documents
  for await (const document of cursor) {
    console.log(document);
  }
})();

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.collections.Collection;
import com.datastax.astra.client.collections.commands.cursor.CollectionFindCursor;
import com.datastax.astra.client.collections.commands.options.CollectionFindOptions;
import com.datastax.astra.client.collections.definition.documents.Document;
import com.datastax.astra.client.core.query.Filter;
import com.datastax.astra.client.core.query.Filters;
import com.datastax.astra.client.core.query.Sort;

public class Example {

  public static void main(String[] args) {
    // Get an existing collection
    Collection<Document> collection =
        new DataAPIClient("APPLICATION_TOKEN")
            .getDatabase("API_ENDPOINT")
            .getCollection("COLLECTION_NAME");

    // Find documents
    Filter filter = Filters.eq("metadata.language", "English");
    CollectionFindOptions options =
        new CollectionFindOptions()
            .sort(Sort.ascending("rating"), Sort.descending("title"))
            .skip(5);
    CollectionFindCursor<Document, Document> cursor = collection.find(filter, options);

    // Iterate over the found documents
    for (Document document : cursor) {
      System.out.println(document);
    }
  }
}

curl -sS -L -X POST "API_ENDPOINT/api/json/v1/KEYSPACE_NAME/COLLECTION_NAME" \
  --header "Token: APPLICATION_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "find": {
    "filter": { "metadata.language": "English" },
    "sort": {
      "rating": 1,
      "title": -1
    },
    "options": {
      "skip": 5
    }
  }
}'

Use filter, sort, and projection together

Python
TypeScript
Java
curl

from astrapy import DataAPIClient
from astrapy.constants import SortMode

# Get an existing collection
client = DataAPIClient()
database = client.get_database(
    "API_ENDPOINT",
    token="APPLICATION_TOKEN",
)
collection = database.get_collection("COLLECTION_NAME")

# Find documents
cursor = collection.find(
    {
        "$and": [
            {"is_checked_out": False},
            {"number_of_pages": {"$lt": 300}},
        ]
    },
    sort={
        "rating": SortMode.ASCENDING,
        "title": SortMode.DESCENDING,
    },
    projection={"is_checked_out": True, "title": True},
)

# Iterate over the found documents
for document in cursor:
    print(document)

import { DataAPIClient } from "@datastax/astra-db-ts";

// Get an existing collection
const client = new DataAPIClient("APPLICATION_TOKEN");
const database = client.db("API_ENDPOINT");
const collection = database.collection("COLLECTION_NAME");

(async function () {
  // Find documents
  const cursor = collection.find(
    {
      $and: [{ is_checked_out: false }, { number_of_pages: { $lt: 300 } }],
    },
    {
      sort: {
        rating: 1, // ascending
        title: -1, // descending
      },
      projection: {
        is_checked_out: true,
        title: true,
      },
    },
  );

  // Iterate over the found documents
  for await (const document of cursor) {
    console.log(document);
  }
})();

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.collections.Collection;
import com.datastax.astra.client.collections.commands.cursor.CollectionFindCursor;
import com.datastax.astra.client.collections.commands.options.CollectionFindOptions;
import com.datastax.astra.client.collections.definition.documents.Document;
import com.datastax.astra.client.core.query.Filter;
import com.datastax.astra.client.core.query.Filters;
import com.datastax.astra.client.core.query.Projection;
import com.datastax.astra.client.core.query.Sort;

public class Example {

  public static void main(String[] args) {
    // Get an existing collection
    Collection<Document> collection =
        new DataAPIClient("APPLICATION_TOKEN")
            .getDatabase("API_ENDPOINT")
            .getCollection("COLLECTION_NAME");

    // Find documents
    Filter filter =
        Filters.and(Filters.eq("is_checked_out", false), Filters.lt("number_of_pages", 300));
    CollectionFindOptions options =
        new CollectionFindOptions()
            .sort(Sort.ascending("rating"), Sort.descending("title"))
            .projection(Projection.include("is_checked_out", "title"));
    CollectionFindCursor<Document, Document> cursor = collection.find(filter, options);

    // Iterate over the found documents
    for (Document document : cursor) {
      System.out.println(document);
    }
  }
}

curl -sS -L -X POST "API_ENDPOINT/api/json/v1/KEYSPACE_NAME/COLLECTION_NAME" \
  --header "Token: APPLICATION_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "find": {
    "filter": {"$and": [
      {"is_checked_out": false},
      {"number_of_pages": {"$lt": 300}}
    ]},
    "sort": {
      "rating": 1,
      "title": -1
    },
    "projection": {"is_checked_out": true, "title": true}
  }
}'

Iterate over found documents

Python
TypeScript
Java
curl

Use a for loop to iterate over the cursor. The client will periodically fetch more documents until no matching documents remain.

from astrapy import DataAPIClient

# Get an existing collection
client = DataAPIClient()
database = client.get_database(
    "API_ENDPOINT",
    token="APPLICATION_TOKEN",
)
collection = database.get_collection("COLLECTION_NAME")

# Find documents
cursor = collection.find(
    {
        "$and": [
            {"is_checked_out": False},
            {"number_of_pages": {"$lt": 300}},
        ]
    }
)

# Iterate over the found documents
for document in cursor:
    print(document)

The cursor returned by find() is compatible with for loops and next(). The client will periodically fetch more documents until no matching documents remain.

import { DataAPIClient } from "@datastax/astra-db-ts";

// Get an existing collection
const client = new DataAPIClient("APPLICATION_TOKEN");
const database = client.db("API_ENDPOINT");
const collection = database.collection("COLLECTION_NAME");

(async function () {
  // Find documents
  const cursor = collection.find({
    $and: [{ is_checked_out: false }, { number_of_pages: { $lt: 300 } }],
  });

  // Get the next item in the cursor
  console.log(await cursor.next());

  // Iterate over the found documents
  for await (const document of cursor) {
    console.log(document);
  }
})();

The cursor returned by find() is an Iterable and is compatible with for loops. The client will periodically fetch more documents until no matching documents remain.

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.collections.Collection;
import com.datastax.astra.client.collections.commands.cursor.CollectionFindCursor;
import com.datastax.astra.client.collections.definition.documents.Document;
import com.datastax.astra.client.core.query.Filter;
import com.datastax.astra.client.core.query.Filters;

public class Example {

  public static void main(String[] args) {
    // Get an existing collection
    Collection<Document> collection =
        new DataAPIClient("APPLICATION_TOKEN")
            .getDatabase("API_ENDPOINT")
            .getCollection("COLLECTION_NAME");

    // Find documents
    Filter filter =
        Filters.and(Filters.eq("is_checked_out", false), Filters.lt("number_of_pages", 300));
    CollectionFindCursor<Document, Document> cursor = collection.find(filter);

    // Iterate over the found documents
    for (Document document : cursor) {
      System.out.println(document);
    }
  }
}

If the response includes a non-null nextPageState, then the specified sort or filter operation supports pagination, and more matching documents exist beyond those already returned.

To fetch additional documents, send another request that includes the nextPageState value from the previous response. For example:

Send an initial request

curl -sS -L -X POST "API_ENDPOINT/api/json/v1/KEYSPACE_NAME/COLLECTION_NAME" \
  --header "Token: APPLICATION_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "find": {
    "filter": {"is_checked_out": false}
  }
}'

Get the data.nextPageState value from the response

{
  "data": {
    "documents": [
      {
        "_id": { "$uuid": "018e65c9-df45-7913-89f8-175f28bd7f74" }
      },
      {
        "_id": { "$uuid": "018e65c9-e33d-749b-9386-e848739582f0" }
      }
    ],
    "nextPageState": "NEXT_PAGE_STATE"
  }
}

Use the data.nextPageState value from the previous response to request the next page of results.

curl -sS -L -X POST "API_ENDPOINT/api/json/v1/KEYSPACE_NAME/COLLECTION_NAME" \
  --header "Token: APPLICATION_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "find": {
    "filter": {"is_checked_out": false},
    "options": {
      "pageState": "NEXT_PAGE_STATE_FROM_PRIOR_RESPONSE"
    }
  }
}'

Once nextPageState is null, you have fetched all matching documents.

{
  "data": {
    "documents": [
      {
        "_id": { "$uuid": "018e65c9-df45-7913-89f8-175f28bd7f74" }
      },
      {
        "_id": { "$uuid": "018e65c9-e33d-749b-9386-e848739582f0" }
      }
    ],
    "nextPageState": null
  }
}
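
The request-and-resume steps above can be sketched as a loop. This is a minimal illustration rather than a client implementation: build_find_payload, iterate_documents, and the send_request callable are hypothetical names introduced here, and send_request is assumed to POST a JSON body to the collection endpoint and return the parsed response as a dictionary.

```python
from typing import Any, Callable, Dict, Iterator, Optional


def build_find_payload(
    filter_: Dict[str, Any], page_state: Optional[str] = None
) -> Dict[str, Any]:
    """Build a find request body; add options.pageState only when resuming."""
    find: Dict[str, Any] = {"filter": filter_}
    if page_state is not None:
        find["options"] = {"pageState": page_state}
    return {"find": find}


def iterate_documents(
    send_request: Callable[[Dict[str, Any]], Dict[str, Any]],
    filter_: Dict[str, Any],
) -> Iterator[Dict[str, Any]]:
    """Yield documents page by page until nextPageState is null.

    send_request is a hypothetical callable (an assumption of this sketch)
    that sends the payload and returns the parsed JSON response.
    """
    page_state: Optional[str] = None
    while True:
        response = send_request(build_find_payload(filter_, page_state))
        data = response["data"]
        yield from data["documents"]
        # A null (None) nextPageState means there are no more pages.
        page_state = data.get("nextPageState")
        if page_state is None:
            break
```

In practice, send_request could wrap any HTTP client that posts to API_ENDPOINT/api/json/v1/KEYSPACE_NAME/COLLECTION_NAME with the Token header shown in the curl examples. Passing it in as a parameter keeps the paging logic independent of the HTTP library.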

Work with `.` and `&` in field names

You must use & to escape any . or & in field names when the field is used in a filter, sort, or projection clause. Dot notation, which is used to reference nested fields, should not be escaped.

For example, to reference the fields in the following document, use the escaped paths areas.r&&d, costs.price&.usd, and costs.price&.cad.

{
  "areas": {
    "r&d": true,
    "design": false
  },
  "costs": {
    "price.usd": 100,
    "price.cad": 90
  }
}

Python
TypeScript
Java
curl

from astrapy import DataAPIClient
from astrapy.constants import SortMode

# Get an existing collection
client = DataAPIClient("APPLICATION_TOKEN")
database = client.get_database("API_ENDPOINT")
collection = database.get_collection("COLLECTION_NAME")

# Find documents, escaping "." and "&" in field names
cursor = collection.find(
    {
        "$and": [
            {"areas.r&&d": False},
            {"costs.price&.usd": {"$lt": 300}},
        ]
    },
    sort={"costs.price&.usd": SortMode.ASCENDING},
    projection={"areas.r&&d": True, "costs.price&.cad": True},
)

# Iterate over the found documents
for document in cursor:
    print(document)

import { DataAPIClient } from "@datastax/astra-db-ts";

// Get an existing collection
const client = new DataAPIClient("APPLICATION_TOKEN");
const database = client.db("API_ENDPOINT");
const collection = database.collection("COLLECTION_NAME");

// Find documents, escaping "." and "&" in field names
(async function () {
  const cursor = collection.find(
    {
      $and: [{ "areas.r&&d": false }, { "costs.price&.usd": { $lt: 300 } }],
    },
    {
      sort: {
        "costs.price&.usd": 1, // ascending
      },
      projection: {
        "areas.r&&d": true,
        "costs.price&.cad": true,
      },
    },
  );

  // Iterate over the found documents
  for await (const document of cursor) {
    console.log(document);
  }
})();

import com.datastax.astra.client.DataAPIClient;
import com.datastax.astra.client.collections.Collection;
import com.datastax.astra.client.collections.commands.cursor.CollectionFindCursor;
import com.datastax.astra.client.collections.commands.options.CollectionFindOptions;
import com.datastax.astra.client.collections.definition.documents.Document;
import com.datastax.astra.client.core.query.Filter;
import com.datastax.astra.client.core.query.Filters;
import com.datastax.astra.client.core.query.Projection;
import com.datastax.astra.client.core.query.Sort;

public class Example {

  public static void main(String[] args) {
    // Get an existing collection
    Collection<Document> collection =
        new DataAPIClient("APPLICATION_TOKEN")
            .getDatabase("API_ENDPOINT")
            .getCollection("COLLECTION_NAME");

    // Find documents, escaping "." and "&" in field names
    Filter filter =
        Filters.and(Filters.eq("areas.r&&d", false), Filters.lt("costs.price&.usd", 300));
    CollectionFindOptions options =
        new CollectionFindOptions()
            .sort(Sort.ascending("costs.price&.usd"))
            .projection(Projection.include("areas.r&&d", "costs.price&.cad"));
    CollectionFindCursor<Document, Document> cursor = collection.find(filter, options);

    // Iterate over the found documents
    for (Document document : cursor) {
      System.out.println(document);
    }
  }
}

curl -sS -L -X POST "API_ENDPOINT/api/json/v1/KEYSPACE_NAME/COLLECTION_NAME" \
  --header "Token: APPLICATION_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "find": {
    "filter": {"$and": [
      {"areas.r&&d": false},
      {"costs.price&.usd": {"$lt": 300}}
    ]},
    "sort": {
      "costs.price&.usd": 1
    },
    "projection": {"areas.r&&d": true, "costs.price&.cad": true}
  }
}'
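
If field names are only known at runtime, the escaping rule described in this section can be applied programmatically. The following is a minimal sketch under that assumption: escape_field_name and build_field_path are hypothetical helpers written for illustration, not functions taken from a client library.

```python
def escape_field_name(name: str) -> str:
    """Escape a single field name: "&" becomes "&&" and "." becomes "&."."""
    # Escape "&" first so the "&" added when escaping "." is not doubled.
    return name.replace("&", "&&").replace(".", "&.")


def build_field_path(*segments: str) -> str:
    """Join escaped field names with unescaped dots for nested access."""
    return ".".join(escape_field_name(segment) for segment in segments)


# Hypothetical usage with the field names from the example document.
filter_clause = {build_field_path("costs", "price.usd"): {"$lt": 300}}
projection = {build_field_path("areas", "r&d"): True}
```

Each path segment is escaped separately, so the dots that separate nested fields stay unescaped while dots and ampersands inside a field name are escaped.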

Client reference

Python
TypeScript
Java
curl

For more information, see the client reference.

Client reference documentation is not applicable for HTTP.
