Perform a vector search
After you load data into a collection, you can use the Data Explorer in the Astra Portal to view your data, search for similar vectors, and filter by metadata.
A vector search determines the similarity between a query vector and the vectors of the documents in the collection. Each document’s resulting similarity score represents the closeness of the query vector and the document’s vector.
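To make the idea of a similarity score concrete, the following sketch ranks a few documents against a query vector by brute-force cosine similarity. It is an illustration only. The sample documents and the `cosine_similarity` and `rank_by_similarity` helpers are hypothetical, and the raw cosine value computed here is not necessarily the score that the database reports, because that depends on the similarity metric configured for the collection.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vector, documents, limit=10):
    """Score every document against the query and return the closest ones first."""
    scored = [
        (cosine_similarity(query_vector, doc["$vector"]), doc["_id"])
        for doc in documents
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:limit]


# Hypothetical sample data, not taken from a real collection.
documents = [
    {"_id": "a", "$vector": [0.15, 0.1, 0.1, 0.35, 0.55]},
    {"_id": "b", "$vector": [0.55, 0.35, 0.1, 0.1, 0.15]},
    {"_id": "c", "$vector": [0.1, 0.1, 0.2, 0.3, 0.6]},
]
query_vector = [0.15, 0.1, 0.1, 0.35, 0.55]

for score, doc_id in rank_by_similarity(query_vector, documents):
    print(doc_id, round(score, 4))
```

The document whose vector points in nearly the same direction as the query is listed first, which mirrors the "most similar to least similar" ordering that a vector search returns.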
To perform a vector search, you need a role that can view the database and collection that you want to search. To perform a vector search with the Data API, you need an application token with this role. You can use a built-in role or a custom role with the following permissions: View DB, Describe All Keyspaces, Describe Keyspace, Select Table, and Describe Table.
Search your data
You can use the Astra Portal or the Data API to perform a vector search.
- Astra Portal
- Python
- TypeScript
- Java
- curl
- In the Astra Portal, go to Databases, and then select your Serverless (Vector) database.
- Click Data Explorer.
- Select the Keyspace and Collection that contain the data you want to search.
  In the Collection Data section, you can view data as a table or as a list of JSON documents.
- In the Vector Search field, enter a vector array to query.
  - Query a vector in the collection: In the Collection Data section, in the Vector Search column, click Search to run a similarity search based on the selected document’s vector.
  - Query an external vector: To use a vector generated outside the collection, enter the vector array in the Vector Search field, and then click Apply.
  The Collection Data section sorts the data based on the calculated similarity score for each document, from most similar to least similar. Similarity scores are based on the similarity metric that you chose when you created the collection.
- Optional: Use the Similarity Scores field to limit the total number of search results.
- Optional: Use metadata filters to refine the search results based on other fields in the collection:
  - Click Add Filter, and then configure the filter:
    - Key: Select the field to filter on.
    - Condition: Select the filter operator to use. The is condition performs an exact match of a scalar or a value within an array, and the contains condition performs an exact match of a value within an array. Some data types have a default condition.
      The Data API supports more operators than the Astra Portal. If you need more filtering options, consider using the Data API clients.
    - Value: Enter a filter value.
    All conditions are case-sensitive and the filter value must be an exact match.
    Filter example: is
    For this example, assume that you have the following filter:
    - Key: character
    - Condition: is
    - Value: Lassie
    This filter returns all documents with a character field set to a scalar value of "Lassie" or set to an array containing a value of "Lassie".
    This matches values like "Lassie" and ["Lassie", "Timmy"], but this does not match values like "lassie", "Lassie Come Home", or ["lassie", "Timmy"].
    Filter example: contains
    For this example, assume that you have the following filter:
    - Key: color
    - Condition: contains
    - Value: red
    This filter returns all documents with a color field set to an array containing a value of "red".
    This matches values like ["red", "blue", "green"] and ["red"], but this does not match values like "red", ["reddish", "Red", "Green", "Blue"], or ["green", "blue"].
  - To add more filters, click Add Filter again.
  - Click Apply to refresh the Collection Data section based on your filters.
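The difference between the is and contains conditions can be modeled in a few lines of Python. This is a sketch of the behavior described in the filter examples, not the Astra Portal's implementation. The `matches_is` and `matches_contains` helpers are hypothetical names introduced for illustration.

```python
def matches_is(field_value, filter_value):
    """Model of `is`: exact match of a scalar, or of a value within an array."""
    if isinstance(field_value, list):
        return filter_value in field_value
    return field_value == filter_value


def matches_contains(field_value, filter_value):
    """Model of `contains`: exact match of a value within an array only."""
    return isinstance(field_value, list) and filter_value in field_value


# Field values from the `is` example, filtered on "Lassie".
is_cases = ["Lassie", ["Lassie", "Timmy"], "lassie", "Lassie Come Home", ["lassie", "Timmy"]]
for value in is_cases:
    print("is      ", repr(value), matches_is(value, "Lassie"))

# Field values from the `contains` example, filtered on "red".
contains_cases = [["red", "blue", "green"], ["red"], "red", ["reddish", "Red", "Green", "Blue"], ["green", "blue"]]
for value in contains_cases:
    print("contains", repr(value), matches_contains(value, "red"))
```

Both helpers compare with `==`, so matching is case-sensitive and never matches on a substring, which is why "lassie" and "Lassie Come Home" are rejected.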
For more information about this command and related commands, see Find a document, Find documents, and Find distinct values. For a complete list of filter conditions, see Data API query operators.
Perform a vector search:
# Perform a similarity search.
query_vector = [0.15, 0.1, 0.1, 0.35, 0.55]
results = collection.find(
    sort={"$vector": query_vector},
    limit=10,
    include_similarity=True,
)
print("Vector search results:")
for document in results:
    print("    ", document)
Perform a vector search with metadata filters:
# Perform a similarity search with metadata filters
query_vector = [0.15, 0.1, 0.1, 0.35, 0.55]
results = collection.find(
    {"$and": [
        {"price": {"$gte": 100}},
        {"name": "John"}
    ]},
    sort={"$vector": query_vector},
    limit=10,
    projection={"*": True},
)
print("Vector search results:")
for document in results:
    print("    ", document)
For vector ANN search, the response is a single page of up to 1000 documents, unless you set a lower limit.
You can use a projection to include specific document properties in the response.
A projection is required if you want to return certain reserved fields, like $vector and $vectorize, that are excluded by default.
For more information about this command and related commands, see Find a document, Find documents, and Find distinct values. For a complete list of filter conditions, see Data API query operators.
Perform a vector search:
// Perform a similarity search
const cursor = await collection.find({}, {
  sort: { $vector: [0.15, 0.1, 0.1, 0.35, 0.55] },
  limit: 10,
  includeSimilarity: true,
});
console.log('* Search results:');
for await (const doc of cursor) {
  console.log('  ', doc.idea, doc.$similarity);
}
Perform a vector search with metadata filters:
// Perform a similarity search with metadata filters
const cursor = await collection.find({
  $and: [
    { price: { $gte: 100 } },
    { name: 'John' }
  ]
}, {
  sort: { $vector: [0.15, 0.1, 0.1, 0.35, 0.55] },
  limit: 10,
  includeSimilarity: true,
});
console.log('* Search results:');
for await (const doc of cursor) {
  console.log('  ', doc);
}
For vector ANN search, the response is a single page of up to 1000 documents, unless you set a lower limit.
You can use a projection to include specific document properties in the response.
A projection is required if you want to return certain reserved fields, like $vector and $vectorize, that are excluded by default.
For more information about this command and related commands, see Find a document, Find documents, and Find distinct values. For a complete list of filter conditions, see Data API query operators.
Perform a vector search:
// Perform a similarity search
FindIterable<Document> resultsSet = collection.find(
    new float[]{0.15f, 0.1f, 0.1f, 0.35f, 0.55f},
    10
);
resultsSet.forEach(System.out::println);
Perform a vector search with metadata filters:
// Perform a similarity search with metadata filters
FindIterable<Document> resultsSet = collection.find(
    Filters.and(
        Filters.gte("price", 100),
        Filters.eq("name", "John")
    ),
    new float[]{0.15f, 0.1f, 0.1f, 0.35f, 0.55f},
    10
);
resultsSet.forEach(System.out::println);
For vector ANN search, the response is a single page of up to 1000 documents, unless you set a lower limit.
You can use a projection to include specific document properties in the response.
A projection is required if you want to return certain reserved fields, like $vector and $vectorize, that are excluded by default.
For more information about this command and related commands, see Find a document, Find documents, and Find distinct values. For a complete list of filter conditions, see Data API query operators.
Perform a vector search:
# Perform a similarity search
curl -sS -L -X POST "$ASTRA_DB_API_ENDPOINT/api/json/v1/default_keyspace/vector_test" \
  --header "Token: $ASTRA_DB_APPLICATION_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "find": {
    "sort": { "$vector": [0.15, 0.1, 0.1, 0.35, 0.55] },
    "options": {
      "limit": 10
    }
  }
}' | jq
Perform a vector search with metadata filters:
# Perform a similarity search with metadata filters.
# Assumes the ASTRA_DB_* values are set as environment variables.
curl -sS -L -X POST "$ASTRA_DB_API_ENDPOINT/api/json/v1/$ASTRA_DB_KEYSPACE/$ASTRA_DB_COLLECTION" \
  --header "Token: $ASTRA_DB_APPLICATION_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "find": {
    "filter": {
      "$and": [
        { "customer.credit_score": { "$gte": 700 } },
        { "customer.credit_score": { "$lt": 800 } }
      ]
    },
    "sort": { "$vector": [0.15, 0.1, 0.1, 0.35, 0.55] },
    "options": {
      "limit": 100
    }
  }
}' | jq
For vector ANN search, the response is a single page of up to 1000 documents, unless you set a lower limit.
You can use a projection to include specific document properties in the response.
A projection is required if you want to return certain reserved fields, like $vector and $vectorize, that are excluded by default.
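If you prefer to assemble the HTTP request in code rather than by hand, the following sketch builds the same kind of find request with Python's standard library. It only constructs the request and does not send it. The `build_find_request` helper, the `example.invalid` endpoint, and the token placeholder are hypothetical. The URL path and the Token header follow the curl examples in this section.

```python
import json
import urllib.request


def build_find_request(api_endpoint, keyspace, collection, token,
                       query_vector, metadata_filter=None, limit=10):
    """Build (but do not send) a Data API find request for a vector search."""
    find = {
        "sort": {"$vector": query_vector},
        "options": {"limit": limit},
    }
    if metadata_filter is not None:
        find["filter"] = metadata_filter
    # json.dumps always emits well-formed JSON, which avoids hand-editing mistakes.
    body = json.dumps({"find": find}).encode("utf-8")
    url = f"{api_endpoint}/api/json/v1/{keyspace}/{collection}"
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Token": token, "Content-Type": "application/json"},
    )


# Placeholder values for illustration only.
request = build_find_request(
    "https://example.invalid",
    "default_keyspace",
    "vector_test",
    "token-placeholder",
    [0.15, 0.1, 0.1, 0.35, 0.55],
    metadata_filter={"$and": [
        {"customer.credit_score": {"$gte": 700}},
        {"customer.credit_score": {"$lt": 800}},
    ]},
    limit=100,
)
print(request.full_url)
print(json.dumps(json.loads(request.data), indent=2))
```

With a real endpoint and application token, passing the request to `urllib.request.urlopen` would send it.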
Search your data with vectorize
For collections that auto-generate embeddings with vectorize, you can perform a similarity search using text, rather than a vector. Vectorize generates an embedding for your text query, and then performs a similarity search based on that embedding.
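The two-step flow, embed the text and then search by similarity, can be sketched without any external service. In this illustration, `toy_embed` is a hypothetical stand-in that hashes words into a small vector. It is not how an embedding provider works, and real embeddings come from the provider configured for the collection. The sample documents and the `search_by_text` helper are also assumptions.

```python
import math


def toy_embed(text, dimension=16):
    """Hypothetical stand-in for an embedding provider: bucket words by character sum."""
    vector = [0.0] * dimension
    for word in text.lower().split():
        word = word.strip(".,!?'\"")
        if word:
            vector[sum(ord(ch) for ch in word) % dimension] += 1.0
    return vector


def cosine_similarity(a, b):
    """Cosine similarity, returning 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search_by_text(query_text, documents, limit=10):
    """Step 1: embed the query text. Step 2: rank documents by similarity."""
    query_vector = toy_embed(query_text)
    scored = [
        (cosine_similarity(query_vector, toy_embed(doc["$vectorize"])), doc["_id"])
        for doc in documents
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:limit]


# Hypothetical sample documents.
documents = [
    {"_id": 1, "$vectorize": "talking shoes for kids"},
    {"_id": 2, "$vectorize": "a green car"},
    {"_id": 3, "$vectorize": "quiet running shoes"},
]

for score, doc_id in search_by_text("Talking shoes", documents):
    print(doc_id, round(score, 3))
```

The caller supplies only text. The embedding step happens inside the search, which is the convenience that vectorize provides.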
You can use the Astra Portal or the Data API to perform a search with vectorize.
- Astra Portal
- Python
- TypeScript
- Java
- curl
- In the Astra Portal, go to Databases, and then select your Serverless (Vector) database.
- Click Data Explorer.
- Select the Keyspace and Collection that contain the data you want to view.
  In the Collection Data section, the ($vectorize) label indicates the field that you designated to auto-generate embeddings for this collection’s documents. The $vector field contains the generated embeddings.
- In the Vector Search field, enter a text query, and then click Apply.
  Using the collection’s embedding provider integration, Astra DB generates a vector for your text query, and then performs a similarity search.
  The Collection Data section sorts the data based on the calculated similarity score for each document, from most similar to least similar. Similarity scores are based on the similarity metric that you chose when you created the collection.
- Optional: Use metadata filters to refine the search results based on other fields in the collection:
  - Click Add Filter, and then configure the filter:
    - Key: Select the field to filter on.
    - Condition: Select the filter operator to use. The is condition performs an exact match of a scalar or a value within an array, and the contains condition performs an exact match of a value within an array. Some data types have a default condition.
      The Data API supports more operators than the Astra Portal. If you need more filtering options, consider using the Data API clients.
    - Value: Enter a filter value.
    All conditions are case-sensitive and the filter value must be an exact match.
    Filter example: is
    For this example, assume that you have the following filter:
    - Key: character
    - Condition: is
    - Value: Lassie
    This filter returns all documents with a character field set to a scalar value of "Lassie" or set to an array containing a value of "Lassie".
    This matches values like "Lassie" and ["Lassie", "Timmy"], but this does not match values like "lassie", "Lassie Come Home", or ["lassie", "Timmy"].
    Filter example: contains
    For this example, assume that you have the following filter:
    - Key: color
    - Condition: contains
    - Value: red
    This filter returns all documents with a color field set to an array containing a value of "red".
    This matches values like ["red", "blue", "green"] and ["red"], but this does not match values like "red", ["reddish", "Red", "Green", "Blue"], or ["green", "blue"].
  - To add more filters, click Add Filter again.
  - Click Apply to refresh the Collection Data section based on your filters.
For more information about this command and related commands, see Find a document, Find documents, and Find distinct values. For a complete list of filter conditions, see Data API query operators.
Perform a vector search with vectorize:
# Perform a similarity search
query = "I'd like some talking shoes"
results = collection.find(
    sort={"$vectorize": query},
    limit=2,
    projection={"$vectorize": True},
    include_similarity=True,
)
print(f"Vector search results for '{query}':")
for document in results:
    print("    ", document)
Perform a vector search with vectorize and metadata filters:
# Perform a similarity search with metadata filters
query = "I'd like some talking shoes"
results = collection.find(
    {"$and": [
        {"price": {"$gte": 100}},
        {"name": "John"}
    ]},
    sort={"$vectorize": query},
    limit=10,
    projection={"$vectorize": True},
)
print("Vector search results:")
for document in results:
    print("    ", document)
For vector ANN search, the response is a single page of up to 1000 documents, unless you set a lower limit.
You can use a projection to include specific document properties in the response.
A projection is required if you want to return certain reserved fields, like $vector and $vectorize, that are excluded by default.
For more information about this command and related commands, see Find a document, Find documents, and Find distinct values. For a complete list of filter conditions, see Data API query operators.
Perform a vector search with vectorize:
// Perform a similarity search
const cursor = await collection.find({}, {
  sort: { $vectorize: 'shoes' },
  limit: 2,
  includeSimilarity: true,
});
console.log('* Search results:');
for await (const doc of cursor) {
  console.log('  ', doc.text, doc.$similarity);
}
Perform a vector search with vectorize and metadata filters:
// Perform a similarity search with metadata filters
const cursor = await collection.find({
  $and: [
    { price: { $gte: 100 } },
    { name: 'John' }
  ]
}, {
  sort: { $vectorize: 'shoes' },
  limit: 10,
  includeSimilarity: true,
});
console.log('* Search results:');
for await (const doc of cursor) {
  console.log('  ', doc);
}
For vector ANN search, the response is a single page of up to 1000 documents, unless you set a lower limit.
You can use a projection to include specific document properties in the response.
A projection is required if you want to return certain reserved fields, like $vector and $vectorize, that are excluded by default.
For more information about this command and related commands, see Find a document, Find documents, and Find distinct values. For a complete list of filter conditions, see Data API query operators.
Perform a vector search with vectorize:
// Perform a similarity search
FindOptions findOptions = new FindOptions()
    .limit(2)
    .includeSimilarity()
    .sort("I'd like some talking shoes");
FindIterable<Document> results = collection.find(findOptions);
for (Document document : results) {
    System.out.println("Document: " + document);
}
You can use metadata filters with a vectorize vector search in the same way that you would with a regular vector search.
For vector ANN search, the response is a single page of up to 1000 documents, unless you set a lower limit.
You can use a projection to include specific document properties in the response.
A projection is required if you want to return certain reserved fields, like $vector and $vectorize, that are excluded by default.
For more information about this command and related commands, see Find a document, Find documents, and Find distinct values. For a complete list of filter conditions, see Data API query operators.
Perform a vector search with vectorize:
# Perform a similarity search
# Replace COLLECTION_NAME with the name of your collection.
curl -sS -L -X POST "$ASTRA_DB_API_ENDPOINT/api/json/v1/default_keyspace/COLLECTION_NAME" \
  --header "Token: $ASTRA_DB_APPLICATION_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "find": {
    "sort": {"$vectorize": "Talking shoes"},
    "projection": {"$vectorize": 1},
    "options": {
      "includeSimilarity": true,
      "limit": 10
    }
  }
}' | jq
Perform a vector search with vectorize and metadata filters:
# Perform a similarity search with vectorize and metadata filters.
# Assumes the ASTRA_DB_* values are set as environment variables.
curl -sS -L -X POST "$ASTRA_DB_API_ENDPOINT/api/json/v1/$ASTRA_DB_KEYSPACE/$ASTRA_DB_COLLECTION" \
  --header "Token: $ASTRA_DB_APPLICATION_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
  "find": {
    "filter": {
      "$and": [
        { "customer.credit_score": { "$gte": 700 } },
        { "customer.credit_score": { "$lt": 800 } }
      ]
    },
    "sort": { "$vectorize": "green car" },
    "options": {
      "limit": 100
    }
  }
}' | jq
For vector ANN search, the response is a single page of up to 1000 documents, unless you set a lower limit.
You can use a projection to include specific document properties in the response.
A projection is required if you want to return certain reserved fields, like $vector and $vectorize, that are excluded by default.