Find and rerank documents

Hybrid search, lexical search, and reranking are currently in public preview. Development is ongoing, and the features and functionality are subject to change. Astra DB Serverless, and the use of such, is subject to the DataStax Preview Terms.

Finds documents in a collection through a retrieval process that uses a reranker model to combine results from a vector search and a lexical search. This process is called hybrid search. For more information about hybrid search mechanics and best practices, see Find data with hybrid search.

For other ways to find documents, including standalone vector search and exact-value filters, see Find documents.

This method requires the following:

A Serverless (Vector) database in the AWS us-east-2 region.
A collection with vector, lexical, and rerank enabled. For more information, see Create a collection that supports hybrid search.
Documents with the $lexical and $vector fields populated. Documents without both of these fields are excluded from hybrid search.
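
The last requirement can be checked on documents you have read back from the collection with a projection that includes both fields. The following is a minimal sketch, not part of any client library: `is_hybrid_ready` is a hypothetical helper, and it assumes that "populated" means the field is present with a non-empty value.

```python
from typing import Any


def is_hybrid_ready(document: dict[str, Any]) -> bool:
    """Return True if both $lexical and $vector are populated.

    Assumption: "populated" means the key exists and the value is
    neither None nor empty.
    """
    lexical = document.get("$lexical")
    vector = document.get("$vector")
    return bool(lexical) and vector is not None and len(vector) > 0


docs = [
    {"_id": "doc_a", "$lexical": "the house on the hill", "$vector": [0.1, 0.2, 0.3]},
    {"_id": "doc_b", "$lexical": "the tree in the woods"},  # no $vector
    {"_id": "doc_c", "$vector": [0.3, 0.2, 0.1]},  # no $lexical
]

# Documents that hybrid search would exclude under the rule above.
skipped = [d["_id"] for d in docs if not is_hybrid_ready(d)]
print(skipped)
```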

Ready to write code? See the examples for this method to get started. If you are new to the Data API, check out the quickstart.

Result

Python
TypeScript
Java
curl

Python

Returns a cursor (CollectionFindAndRerankCursor) for iterating over the documents returned by the reranker.

Iterating over the cursor yields RerankedResult objects, which represent the returned documents. The fields included in the returned documents depend on the subset of fields that were requested in the projection.

Each RerankedResult object also includes a dictionary of the scores from the retrieval process. If scores were not requested, the dictionary is empty.

If requested, the result will also include the sort vector used for the underlying vector search. Calling .get_sort_vector() on the cursor reads the sort vector.

Cursors are lazy iterators, meant to be consumed with for loops or equivalent constructs. You must iterate over the cursor to fetch matching documents and their scores. If you need a list of all results, you can call the to_list method on the cursor.
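
The consumption pattern can be sketched without a live database. In this sketch, `FakeResult` stands in for `RerankedResult` and `fake_cursor` stands in for the real cursor; the `document` attribute name and the `first_n` helper are assumptions made for illustration, and only the `scores` attribute is named in this reference.

```python
from dataclasses import dataclass, field
from itertools import islice
from typing import Any, Iterable, Iterator, Optional


@dataclass
class FakeResult:
    """Stand-in for RerankedResult; the attribute name `document` is assumed."""
    document: dict[str, Any]
    scores: dict[str, float] = field(default_factory=dict)


def fake_cursor() -> Iterator[FakeResult]:
    """Yield results one at a time, as a lazy cursor would."""
    yield FakeResult({"_id": "doc_a"}, {"$rerank": -9.1015625, "$vector": 0.96291006})
    yield FakeResult({"_id": "doc_b"}, {"$rerank": -12.515625, "$vector": 0.19139329})


def first_n(cursor: Iterable[FakeResult], n: int) -> list[tuple[Any, Optional[float]]]:
    """Consume at most n results; the score is None if scores were not requested."""
    return [(r.document["_id"], r.scores.get("$rerank")) for r in islice(cursor, n)]


print(first_n(fake_cursor(), 1))
```

Because the cursor is consumed lazily, `islice` stops after the first result instead of materializing the full list.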

For more information about the operations available on cursors, see FindCursor.

TypeScript

Returns a cursor (CollectionFindAndRerankCursor<Schema, Schema>) for iterating over the documents returned by the reranker.

Iterating over the cursor yields RerankedResult<TRaw> objects, which represent the returned documents. The fields included in the returned documents depend on the subset of fields that were requested in the projection.

Each RerankedResult object also includes a dictionary of the scores from the retrieval process. If scores were not requested, the dictionary is empty.

If requested, the result will also include the sort vector used for the underlying vector search. Calling .getSortVector() on the cursor reads the sort vector.

Cursors are lazy iterators, meant to be consumed with for await loops or equivalent constructs. You must iterate over the cursor to fetch matching documents and their scores. If you need a list of all results, you can call the toArray() method on the cursor.

For more information about the operations available on cursors, see FindCursor.

Java

Returns a cursor (CollectionFindAndRerankCursor) for iterating over the documents returned by the reranker.

Iterating over the cursor yields RerankedResult objects, which represent the returned documents.

The fields included in the returned documents depend on the subset of fields that were requested in the projection.

Each RerankedResult object also includes a map of the scores from the retrieval process. If scores were not requested, the map is empty.

If requested, the result will also include the sort vector used for the underlying vector search. Calling .getSortVector() on the cursor reads the sort vector.

For more information about the methods available on cursors, see AbstractCursor.

curl

The response includes a data.documents property, which is an array of objects representing the documents returned by the reranker. The fields included in the returned documents depend on the subset of fields that were requested in the projection.

If requested, the response also includes a status.documentResponses property, which is a list of the scores from the retrieval process for each document.

If requested, the response also includes a status.sortVector property, which is the sort vector used for the underlying vector search.

This command always returns a single page of results, so data.nextPageState in the response is always null.

Example response without requesting sort vector or scores:

{
    "data": {
        "documents": [
            {
                "$lexical": "the house on the hill",
                "_id": "doc_a",
                "content": "the house on the hill",
                "tag": "x"
            },
            {
                "$lexical": "the tree in the woods",
                "_id": "doc_b",
                "content": "the tree in the woods",
                "tag": "x"
            }
        ],
        "nextPageState": null
    }
}

Example response with sort vector and scores requested:

{
    "data": {
        "documents": [
            {
                "$lexical": "the house on the hill",
                "_id": "doc_a",
                "content": "the house on the hill",
                "tag": "x"
            },
            {
                "$lexical": "the tree in the woods",
                "_id": "doc_b",
                "content": "the tree in the woods",
                "tag": "x"
            }
        ],
        "nextPageState": null
    },
    "status": {
        "documentResponses": [
            {
                "scores": {
                    "$rerank": -9.1015625,
                    "$vector": 0.96291006
                }
            },
            {
                "scores": {
                    "$rerank": -12.515625,
                    "$vector": 0.19139329
                }
            }
        ],
        "sortVector": [1.0, 1.0, 1.0]
    }
}
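
When working with the raw response, each document usually needs to be matched with its scores. The sketch below assumes that `data.documents` and `status.documentResponses` are in the same order, as the example above suggests; `pair_with_scores` is a hypothetical helper, and the documents are abbreviated to their `_id`.

```python
import json

raw = """
{
  "data": {"documents": [{"_id": "doc_a"}, {"_id": "doc_b"}], "nextPageState": null},
  "status": {
    "documentResponses": [
      {"scores": {"$rerank": -9.1015625, "$vector": 0.96291006}},
      {"scores": {"$rerank": -12.515625, "$vector": 0.19139329}}
    ],
    "sortVector": [1.0, 1.0, 1.0]
  }
}
"""


def pair_with_scores(response: dict) -> list[tuple[dict, dict]]:
    """Pair each document with its scores, assuming both lists share one order."""
    documents = response["data"]["documents"]
    # `status` can be absent, as in the example response without scores.
    score_entries = response.get("status", {}).get("documentResponses", [])
    pairs = []
    for index, document in enumerate(documents):
        scores = score_entries[index]["scores"] if index < len(score_entries) else {}
        pairs.append((document, scores))
    return pairs


pairs = pair_with_scores(json.loads(raw))
for document, scores in pairs:
    print(document["_id"], scores.get("$rerank"))
```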

Example response if no documents were found, with sort vector and scores requested:

{
    "data": {
        "documents": [],
        "nextPageState": null
    },
    "status": {
        "documentResponses": [],
        "sortVector": [1.0, 1.0, 1.0]
    }
}

Parameters

Python
TypeScript
Java
curl

Python

Use the find_and_rerank method, which belongs to the astrapy.Collection class.

Method signature

find_and_rerank(
  filter: Dict[str, Any],
  *,
  sort: Dict[str, Any],
  projection: Dict[str, bool],
  document_type: type,
  limit: int,
  hybrid_limits: int | dict[str, int],
  include_scores: bool,
  include_sort_vector: bool,
  rerank_on: str,
  rerank_query: str,
  request_timeout_ms: int,
  timeout_ms: int,
) -> CollectionFindAndRerankCursor

Name	Type	Summary
`filter`	`Dict[str, Any]`	Optional. An object that defines filter criteria using the Data API filter syntax. The method only finds documents that match the filter criteria. You must use `&` to escape any `.` or `&` in field names in the filter clause. You cannot use `&` to escape any other characters. For a list of available filter operators and more examples, see Filter operators for collections. Filters can use only indexed fields. If you apply selective indexing when you create a collection, you cannot reference non-indexed fields in a filter. Default: No filter, meaning any document is a possible match.
`sort`	`dict[str, Any]`	Specifies queries for the underlying vector and lexical searches. The `$lexical` query is a string of space-separated keywords or terms. The `$vector` query is an array of floats or a `DataAPIVector` object that serves as a search vector. If you use this query, you must specify the `rerank_query` and `rerank_on` parameters. The `$vectorize` query is a string that the configured embedding provider will convert into a search vector. Only collections that have vectorize enabled can use `$vectorize`. `$vector` and `$vectorize` can’t be used together. You can also use shorthand to specify a single search string for both the `$vectorize` and `$lexical` queries. See Find documents with a hybrid search and Use shorthand to specify a single search string for examples.
`projection`	`Dict[str, bool]`	Optional. Controls which fields are included or excluded in the returned documents. You must use `&` to escape any `.` or `&` in field names in the projection clause. You cannot use `&` to escape any other characters. For more information, see Projections for collections. Default: The default projection for the collection. All fields prefixed with `$` are excluded by default and are returned only if you include them in the projection. `_id` is included by default and is always returned unless you exclude it from the projection.
`document_type`	`type`	Optional. A specifier for the type checker. You may want to use this parameter if your code is strictly typed, especially if you specify a projection. For more information, see Typing support. Default: `dict[str, Any]`. The returned cursor is implicitly a `CollectionFindAndRerankCursor[DOC, RerankedResult[DOC]]`, meaning that it maintains the same type for the returned documents as that of the documents in the collection.
`limit`	`int`	Optional. Limit the total number of documents returned. Once `limit` is reached, or the cursor is exhausted due to lack of matching documents, nothing more is returned. Default: The limit set by the Data API.
`hybrid_limits`	`int \| dict[str, int]`	Optional. Limit the number of documents returned by the underlying vector and lexical searches. If a single number is specified, it applies to both the vector and lexical searches. To set different limits for the vector and lexical searches, specify a dictionary in the form `{"$vector": INTEGER, "$lexical": INTEGER}`. Default: The value of `limit`.
`include_scores`	`bool`	Optional. Whether to include the scores from the reranking process in the response. These scores can be inspected in the `scores` attribute of each `RerankedResult` object yielded by the `CollectionFindAndRerankCursor`. This attribute is a free-form dictionary such as `{"$vector": 0.81, "$rerank": 0.12}`. See the examples for usage. If false, the `scores` attribute of each `RerankedResult` object is an empty dictionary. Default: False
`include_sort_vector`	`bool`	Optional. Whether to include the sort vector that was used for the underlying vector search in the response. This can be useful if you query through the `$vectorize` field instead of the `$vector` field, since you don’t know the sort vector in advance. The sort vector can be read by calling the `.get_sort_vector()` method on the returned cursor. Default: False
`rerank_on`	`str`	Required if you use `$vector` in `sort`; otherwise optional. The document field to use for the reranking step. Once the underlying vector and lexical searches complete, the reranker compares the `rerank_query` text with each document’s `rerank_on` field. The reserved `$lexical` field is often used for this parameter, but you can specify any field that stores a string. Any document lacking the field is excluded. Default unless you use `$vector` in `sort`: `"$lexical"`.
`rerank_query`	`str`	Required if you use `$vector` in `sort`; otherwise optional. Query text for the reranker step. Once the underlying vector and lexical searches complete, the reranker compares the `rerank_query` text with each document’s `rerank_on` field. Default unless you use `$vector` in `sort`: the query used for the underlying vector search, which is specified by the `sort` parameter.
`request_timeout_ms`	`int`	Optional. The maximum time, in milliseconds, that the client should wait for each underlying HTTP request. Default: The default value for the collection. This default is 10 seconds unless you specified a different default when you initialized the `Collection` or `DataAPIClient` object. For more information, see Timeout options.
`timeout_ms`	`int`	Optional. An alias for `request_timeout_ms`.
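
The rules in the table for `sort`, `rerank_on`, `rerank_query`, and `hybrid_limits` can be expressed as small client-side checks. This is a sketch under stated assumptions, not part of astrapy: both helper names are hypothetical, `queries` is the set of `$lexical`, `$vector`, and `$vectorize` queries before they are placed in `sort` (see the linked examples for the exact `sort` shape), and the expansion of a single number only illustrates how the limit applies, since the method accepts the number directly.

```python
from typing import Any, Optional, Union


def check_hybrid_arguments(
    queries: dict[str, Any],
    rerank_on: Optional[str] = None,
    rerank_query: Optional[str] = None,
) -> None:
    """Raise ValueError if the query combination breaks the documented rules."""
    if "$vector" in queries and "$vectorize" in queries:
        raise ValueError("$vector and $vectorize can't be used together")
    if "$vector" in queries and (rerank_on is None or rerank_query is None):
        raise ValueError("$vector requires both rerank_on and rerank_query")


def expand_hybrid_limits(hybrid_limits: Union[int, dict[str, int]]) -> dict[str, int]:
    """Show how a single number applies to both searches; copy a dictionary as is."""
    if isinstance(hybrid_limits, int):
        return {"$vector": hybrid_limits, "$lexical": hybrid_limits}
    return dict(hybrid_limits)


# Passes: $vectorize does not require rerank_on or rerank_query.
check_hybrid_arguments({"$vectorize": "a tree on a hill", "$lexical": "tree hill"})

# Passes: $vector is accompanied by both rerank parameters.
check_hybrid_arguments(
    {"$vector": [0.1, 0.2, 0.3], "$lexical": "tree hill"},
    rerank_on="$lexical",
    rerank_query="a tree on a hill",
)

print(expand_hybrid_limits(20))
```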

TypeScript

Use the findAndRerank method, which belongs to the Collection class.

Method signature

findAndRerank(
  filter: CollectionFilter<Schema>,
  options?: {
    sort?: HybridSort,
    projection?: Projection,
    limit?: number,
    hybridLimits?: number | Record<string, number>,
    rerankOn?: string,
    rerankQuery?: string,
    includeScores?: boolean,
    includeSortVector?: boolean,
    timeout?: number | TimeoutDescriptor,
  },
): CollectionFindAndRerankCursor<Schema, Schema>

Name	Type	Summary
`filter`	`CollectionFilter<Schema>`	An object that defines filter criteria using the Data API filter syntax. The method only finds documents that match the filter criteria. You must use `&` to escape any `.` or `&` in field names in the filter clause. You cannot use `&` to escape any other characters. For a list of available filter operators and more examples, see Filter operators for collections. Filters can use only indexed fields. If you apply selective indexing when you create a collection, you cannot reference non-indexed fields in a filter.
`options`	`CollectionFindAndRerankOptions`	Optional. The options for this operation. See Properties of `options` for more details. Most of the options may be provided as either values in the options object, or as builder methods on the cursor itself (`cursor.sort(…)`).

Properties of `options`
Name	Type	Summary
`sort`	`HybridSort`	Specifies queries for the underlying vector and lexical searches. The `$lexical` query is a string of space-separated keywords or terms. The `$vector` query is an array of floats or a `DataAPIVector` object that serves as a search vector. If you use this query, you must specify the `rerankQuery` and `rerankOn` parameters. The `$vectorize` query is a string that the configured embedding provider will convert into a search vector. Only collections that have vectorize enabled can use `$vectorize`. `$vector` and `$vectorize` can’t be used together. You can also use shorthand to specify a single search string for both the `$vectorize` and `$lexical` queries. See Find documents with a hybrid search and Use shorthand to specify a single search string for examples.
`projection`	`Projection`	Optional. Controls which fields are included or excluded in the returned documents. You must use `&` to escape any `.` or `&` in field names in the projection clause. You cannot use `&` to escape any other characters. For more information, see Projections for collections. Default: The default projection for the collection. All fields prefixed with `$` are excluded by default and are returned only if you include them in the projection. `_id` is included by default and is always returned unless you exclude it from the projection.
`limit`	`number`	Optional. Limit the total number of documents returned. Once `limit` is reached, or the cursor is exhausted due to lack of matching documents, nothing more is returned. Default: The limit set by the Data API.
`hybridLimits`	`number \| Record<string, number>`	Optional. Limit the number of documents returned by the underlying vector and lexical searches. If a single number is specified, it applies to both the vector and lexical searches. To set different limits for the vector and lexical searches, specify an object in the form `{$vector: INTEGER, $lexical: INTEGER}`. Default: The value of `limit`.
`includeScores`	`boolean`	Optional. Whether to include the scores from the reranking process in the response. These scores can be inspected in the `scores` attribute of each `RerankedResult<TRaw>` object yielded by the `CollectionFindAndRerankCursor`. This attribute is a free-form object such as `{ $vector: 0.81, $rerank: 0.12 }`. See the examples for usage. If false, the `scores` attribute of each `RerankedResult` object is an empty object. Default: `false`
`includeSortVector`	`boolean`	Optional. Whether to include the sort vector that was used for the underlying vector search in the response. This can be useful if you query through the `$vectorize` field instead of the `$vector` field, since you don’t know the sort vector in advance. The sort vector can be read by calling the `.getSortVector()` method on the returned cursor. Default: `false`
`rerankOn`	`string`	Required if you use `$vector` in `sort`; otherwise optional. The document field to use for the reranking step. Once the underlying vector and lexical searches complete, the reranker compares the `rerankQuery` text with each document’s `rerankOn` field. The reserved `$lexical` field is often used for this parameter, but you can specify any field that stores a string. Any document lacking the field is excluded. Default unless you use `$vector` in `sort`: `"$lexical"`.
`rerankQuery`	`string`	Required if you use `$vector` in `sort`; otherwise optional. Query text for the reranker step. Once the underlying vector and lexical searches complete, the reranker compares the `rerankQuery` text with each document’s `rerankOn` field. Default unless you use `$vector` in `sort`: the query used for the underlying vector search, which is specified by the `sort` parameter.
`timeout`	`number \| TimeoutDescriptor`	Optional. The timeouts to apply to this method. You can specify `requestTimeoutMs` and `generalMethodTimeoutMs`. Since this method issues a single HTTP request, these timeouts are equivalent. The `TimeoutDescriptor` object can contain these properties: `requestTimeoutMs` (`number`): The maximum time, in milliseconds, that the client should wait for each underlying HTTP request. Default: The default value for the collection. This default is 10 seconds unless you specified a different default when you initialized the `Collection` or `DataAPIClient` object. `generalMethodTimeoutMs` (`number`): The maximum time, in milliseconds, that the whole operation can take. Default: The default value for the collection. This default is 30 seconds unless you specified a different default when you initialized the `Collection` or `DataAPIClient` object. If you specify both properties, the minimum of the two is used. If you specify a number instead of a `TimeoutDescriptor` object, that number is applied to both `requestTimeoutMs` and `generalMethodTimeoutMs`.

Java

Use the findAndRerank method, which belongs to the com.datastax.astra.client.Collection class.

Method signature

CollectionFindAndRerankCursor<T, R> findAndRerank(
    Filter filter,
    CollectionFindAndRerankOptions options,
    Class<R> newRowType
);

CollectionFindAndRerankCursor<T, T> findAndRerank(
    Filter filter,
    CollectionFindAndRerankOptions options
);

CollectionFindAndRerankCursor<T, T> findAndRerank(
    CollectionFindAndRerankOptions options
);

Name	Type	Summary
`filter`	`Filter`	An object that defines filter criteria using the Data API filter syntax. The method only finds documents that match the filter criteria. You must use `&` to escape any `.` or `&` in field names in the filter clause. You cannot use `&` to escape any other characters. For a list of available filter operators and more examples, see Filter operators for collections. Filters can use only indexed fields. If you apply selective indexing when you create a collection, you cannot reference non-indexed fields in a filter.
`options`	`CollectionFindAndRerankOptions`	Optional. The options for this operation. See Properties of `options` for more details. Most of the options may be provided as either values in the options object, or as builder methods on the cursor itself (`cursor.sort(…)`).

Properties of `options`
Name	Type	Summary
`sort`	`Sort`	Specifies queries for the underlying vector and lexical searches. The `$lexical` query is a string of space-separated keywords or terms. The `$vector` query is an array of floats or a `DataAPIVector` object that serves as a search vector. If you use this query, you must specify the `rerankQuery` and `rerankOn` parameters. The `$vectorize` query is a string that the configured embedding provider will convert into a search vector. Only collections that have vectorize enabled can use `$vectorize`. `$vector` and `$vectorize` can’t be used together. You can also use shorthand to specify a single search string for both the `$vectorize` and `$lexical` queries. See Find documents with a hybrid search and Use shorthand to specify a single search string for examples.
`projection`	`Projection`	Optional. Controls which fields are included or excluded in the returned documents. You must use `&` to escape any `.` or `&` in field names in the projection clause. You cannot use `&` to escape any other characters. For more information, see Projections for collections. Default: The default projection for the collection. All fields prefixed with `$` are excluded by default and are returned only if you include them in the projection. `_id` is included by default and is always returned unless you exclude it from the projection.
`limit`	`Integer`	Optional. Limit the total number of documents returned. Once `limit` is reached, or the cursor is exhausted due to lack of matching documents, nothing more is returned. Default: The limit set by the Data API.
`hybridLimits`	`Integer \| Map<String, Integer>`	Optional. Limit the number of documents returned by the underlying vector and lexical searches. If a single number is specified, it applies to both the vector and lexical searches. To set different limits for the vector and lexical searches, specify an object in the form `{$vector: INTEGER, $lexical: INTEGER}`. Default: The value of `limit`.
`includeScores`	`boolean`	Optional. Whether to include the scores from the reranking process in the response. These scores can be inspected in the `scores` attribute of each `RerankedResult<R>` object yielded by the `CollectionFindAndRerankCursor`. This attribute is a free-form map such as `{ $vector: 0.81, $rerank: 0.12 }`. If false, the `scores` attribute of each `RerankedResult` object is an empty map. Default: `false`
`includeSortVector`	`boolean`	Optional. Whether to include the sort vector that was used for the underlying vector search in the response. This can be useful if you query through the `$vectorize` field instead of the `$vector` field, since you don’t know the sort vector in advance. The sort vector can be read by calling the `.getSortVector()` method on the returned cursor. Default: `false`
`rerankOn`	`String`	Required if you use `$vector` in `sort`; otherwise optional. The document field to use for the reranking step. Once the underlying vector and lexical searches complete, the reranker compares the `rerankQuery` text with each document’s `rerankOn` field. The reserved `$lexical` field is often used for this parameter, but you can specify any field that stores a string. Any document lacking the field is excluded. Default unless you use `$vector` in `sort`: `"$lexical"`.
`rerankQuery`	`String`	Required if you use `$vector` in `sort`; otherwise optional. Query text for the reranker step. Once the underlying vector and lexical searches complete, the reranker compares the `rerankQuery` text with each document’s `rerankOn` field. Default unless you use `$vector` in `sort`: the query used for the underlying vector search, which is specified by the `sort` parameter.

curl

Use the findAndRerank command.

Command signature

curl -sS -L -X POST "API_ENDPOINT/api/json/v1/KEYSPACE_NAME/COLLECTION_NAME" \
--header "Token: APPLICATION_TOKEN" \
--header "Content-Type: application/json" \
--data '{
    "findAndRerank": {
        "filter": FILTER,
        "options": {
            "hybridLimits": HYBRID_LIMITS,
            "includeScores": BOOLEAN,
            "includeSortVector": BOOLEAN,
            "limit": INTEGER,
            "rerankOn": STRING,
            "rerankQuery": STRING
        },
        "projection": PROJECTION,
        "sort": SORT
    }
}'
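
The request body above can also be assembled programmatically. The following is a minimal sketch that uses only the Python standard library: `build_find_and_rerank_body` is a hypothetical helper, the key placement mirrors the command signature, and the `$hybrid` wrapper in the sample `sort` value is an assumption taken from the Data API hybrid search examples rather than from this section.

```python
import json
from typing import Any, Optional


def build_find_and_rerank_body(
    sort: Any,
    filter: Optional[dict[str, Any]] = None,  # name mirrors the command's key
    projection: Optional[dict[str, Any]] = None,
    **options: Any,
) -> str:
    """Serialize a findAndRerank command, omitting anything left unset."""
    command: dict[str, Any] = {"sort": sort}
    if filter is not None:
        command["filter"] = filter
    if projection is not None:
        command["projection"] = projection
    # Drop options passed as None so that the server defaults apply.
    set_options = {name: value for name, value in options.items() if value is not None}
    if set_options:
        command["options"] = set_options
    return json.dumps({"findAndRerank": command})


body = build_find_and_rerank_body(
    # Assumed shape of the sort clause; check it against the linked examples.
    sort={"$hybrid": {"$vectorize": "a tree on a hill", "$lexical": "tree hill"}},
    filter={"tag": "x"},
    limit=5,
    includeScores=True,
    rerankOn=None,  # unset, so it is left out of the body
)
print(body)
```

The resulting string can be passed as the `--data` argument of the curl command.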

Name	Type	Summary
`filter`	`object`	Optional. An object that defines filter criteria using the Data API filter syntax. The method only finds documents that match the filter criteria. You must use `&` to escape any `.` or `&` in field names in the filter clause. You cannot use `&` to escape any other characters. For a list of available filter operators and more examples, see Filter operators for collections. Filters can use only indexed fields. If you apply selective indexing when you create a collection, you cannot reference non-indexed fields in a filter. Default: No filter, meaning any document is a possible match.
`sort`	`object`	Specifies queries for the underlying vector and lexical searches. The `$lexical` query is a string of space-separated keywords or terms. The `$vector` query is an array of floats that serves as a search vector. If you use this query, you must specify the `rerankQuery` and `rerankOn` parameters. The `$vectorize` query is a string that the configured embedding provider will convert into a search vector. Only collections that have vectorize enabled can use `$vectorize`. `$vector` and `$vectorize` can’t be used together. You can also use shorthand to specify a single search string for both the `$vectorize` and `$lexical` queries. See Find documents with a hybrid search and Use shorthand to specify a single search string for examples.
`projection`	`object`	Optional. Controls which fields are included or excluded in the returned document. You must use `&` to escape any `.` or `&` in field names in the projection clause. You cannot use `&` to escape any other characters. For more information, see Projections for collections. Default: The default projection for the collection. All fields prefixed with `$` are excluded by default and are returned only if you include them in the projection. `_id` is included by default and is always returned unless you exclude it from the projection.
`options`	`object`	Optional. The options for this operation. See Properties of `options` for more details.
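The rules in the `sort` row can be checked before a request is sent. The following is a minimal sketch, not part of the Data API or its clients: `check_hybrid_queries` is a hypothetical helper, and the query strings and vector are made-up values. It inspects only the query keys (`$lexical`, `$vector`, `$vectorize`) and the `options` object, and leaves the exact wrapping of those queries inside `sort` to the examples for this command.

```python
def check_hybrid_queries(queries, options=None):
    """Return a list of rule violations (empty when the request looks valid).

    `queries` maps query names ($lexical, $vector, $vectorize) to values.
    `options` is the `options` object of the findAndRerank command.
    Hypothetical helper for illustration only.
    """
    options = options or {}
    problems = []

    # $vector and $vectorize can't be used together.
    if "$vector" in queries and "$vectorize" in queries:
        problems.append("$vector and $vectorize can't be used together")

    # A raw $vector query gives the reranker no text to work with,
    # so rerankOn and rerankQuery become required.
    if "$vector" in queries:
        for name in ("rerankOn", "rerankQuery"):
            if name not in options:
                problems.append(f"{name} is required when $vector is used")

    return problems


# Made-up query values.
ok = check_hybrid_queries(
    {"$vectorize": "a tree on a hill", "$lexical": "tree hill"}
)
missing = check_hybrid_queries(
    {"$vector": [0.1, 0.2, 0.3], "$lexical": "tree hill"}
)
fixed = check_hybrid_queries(
    {"$vector": [0.1, 0.2, 0.3], "$lexical": "tree hill"},
    {"rerankOn": "$lexical", "rerankQuery": "a tree on a hill"},
)

print(ok)
print(missing)
print(fixed)
```

The helper reports problems as a list instead of raising on the first one, so a caller can show every violation at once.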

Properties of `options`
Name	Type	Summary
`limit`	`integer`	Optional. Limit the total number of documents returned. Default: The limit set by the Data API.
`hybridLimits`	`integer | object`	Optional. Limit the number of documents returned by the underlying vector and lexical searches. If a single number is specified, it applies to both the vector and lexical searches. To set different limits for the vector and lexical searches, specify an object in the form `{"$vector": INTEGER, "$lexical": INTEGER}`. Default: The value of `limit`.
`includeScores`	`boolean`	Optional. Whether to include the scores from the reranking process in the response. These scores are returned in the `status.documentResponses` property of the response as a list of objects. The list is index matched to the list of returned documents. Each score is an object such as `{"$vector": 0.81, "$rerank": 0.12}`. See the examples for usage. Default: `false`
`includeSortVector`	`boolean`	Optional. Whether to include the sort vector that was used for the underlying vector search in the response. This can be useful if you query through the `$vectorize` field instead of the `$vector` field, since you don’t know the sort vector in advance. The sort vector, if requested, is returned in the `status.sortVector` property of the response. Default: `false`
`rerankOn`	`string`	Required if you use `$vector` in `sort`; otherwise optional. The document field to use for the reranking step. Once the underlying vector and lexical searches complete, the reranker compares the `rerankQuery` text with each document’s `rerankOn` field. The reserved `$lexical` field is often used for this parameter, but you can specify any field that stores a string. Any document lacking the field is excluded. Default unless you use `$vector` in `sort`: `"$lexical"`.
`rerankQuery`	`string`	Required if you use `$vector` in `sort`; otherwise optional. Query text for the reranker step. Once the underlying vector and lexical searches complete, the reranker compares the `rerankQuery` text with each document’s `rerankOn` field. Default unless you use `$vector` in `sort`: the query used for the underlying vector search, which is specified by the `sort` parameter.
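Because the score list is index matched to the returned documents, the two lists can be walked together. The sketch below is an illustration under assumptions, not client code: `pair_with_scores` is a hypothetical helper, the documents and scores are made-up values, and each score entry uses the shape quoted in the `includeScores` row. The exact nesting in a real response may differ, so extracting the two lists from the response body is left to the caller.

```python
def pair_with_scores(documents, document_responses):
    """Pair each returned document with its score entry by position.

    `documents` is the list of returned documents.
    `document_responses` is the list from `status.documentResponses`.
    Hypothetical helper for illustration only.
    """
    # The lists are index matched, so a length mismatch means the
    # response was not parsed as expected.
    if len(documents) != len(document_responses):
        raise ValueError("expected one score entry per returned document")
    return list(zip(documents, document_responses))


# Made-up documents and scores.
documents = [{"_id": "doc-1"}, {"_id": "doc-2"}]
document_responses = [
    {"$vector": 0.81, "$rerank": 0.12},
    {"$vector": 0.77, "$rerank": 0.05},
]

pairs = pair_with_scores(documents, document_responses)
for doc, scores in pairs:
    # Use .get() because an entry might not contain every score type.
    print(doc["_id"], scores.get("$rerank"))
```

If `includeScores` is not set, there are no scores to pair, so call a helper like this only when scores were requested.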

Examples

The following examples demonstrate how to find documents with hybrid search.

Find documents with a hybrid search

Python
TypeScript
Java
curl

With $vectorize
Without $vectorize

Use the sort parameter to specify the queries for the underlying vector search and lexical search.