Query vector data with CQL
You can use the vector data type in Cassandra Query Language (CQL) to enable vector searches of your data.
With CQL, you can create a schema and an index, load vector data into your database, and perform a vector search.
Your Serverless (Vector) database is ready to query vector data with CQL.
Prerequisites
- You have an active Astra account.
- You have an active Serverless (Vector) database.
- You connected to your database with the embedded CQL shell (CQLSH) or the standalone CQLSH.
Create the vector schema
- In the CQLSH, select the keyspace to use for your vector search table. This example uses default_keyspace as the keyspace name.
USE default_keyspace;
- Create a new table in your keyspace with a five-dimensional vector column.
CREATE TABLE IF NOT EXISTS default_keyspace.products (
  id int PRIMARY KEY,
  name TEXT,
  description TEXT,
  item_vector VECTOR<FLOAT, 5> // create a five-dimensional embedding
);
- Create the index:
CREATE INDEX IF NOT EXISTS ann_index ON default_keyspace.products(item_vector) WITH OPTIONS = {'source_model': 'other'};
The source_model option configures the index with the fastest settings for a given source of embedding vectors. The source_model options are openai-v3-large, openai-v3-small, ada002, gecko, nv-qa-4, cohere-v3, bert, and other. The default is other.
To change index settings, you must drop and rebuild the index. For more information about SAI, see Storage Attached Indexing overview.
You can also choose a specific similarity function for your index. If you selected a source_model, you don't need to include a similarity_function.
CREATE INDEX IF NOT EXISTS ann_index ON default_keyspace.products(item_vector) WITH OPTIONS = {'similarity_function': 'DOT_PRODUCT'};
Valid values for the similarity_function are COSINE (default), DOT_PRODUCT, or EUCLIDEAN.
Load the data into the database
Insert sample data into the table, populating the new item_vector column:
INSERT INTO default_keyspace.products (id, name, description, item_vector) VALUES
(
1, // id
'Coded Cleats', // name
'Chat bot integrated sneakers that talk to you', // description
[0.1, 0.15, 0.3, 0.12, 0.05] // item_vector
);
INSERT INTO default_keyspace.products (id, name, description, item_vector) VALUES
(
2,
'Logic Layers',
'An AI quilt to help you sleep forever',
[0.45, 0.09, 0.01, 0.2, 0.11]
);
INSERT INTO default_keyspace.products (id, name, description, item_vector) VALUES
(
5,
'Vision Vector Frame',
'A deep learning display that controls your mood',
[0.1, 0.05, 0.08, 0.3, 0.6]
);
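Because item_vector is declared as VECTOR<FLOAT, 5>, each inserted vector needs exactly five numeric components. The Python sketch below shows one way to check this on the client before building an INSERT statement. The helper name to_cql_vector_literal and its dim parameter are hypothetical and are not part of any driver API.

```python
import math

def to_cql_vector_literal(values, dim=5):
    """Return a CQL list literal such as [0.1, 0.15] for a vector of length dim.

    Hypothetical helper: raises ValueError if the length is wrong or if a
    component is not a finite number.
    """
    components = [float(v) for v in values]
    if len(components) != dim:
        raise ValueError(f"expected {dim} components, got {len(components)}")
    if not all(math.isfinite(c) for c in components):
        raise ValueError("vector components must be finite numbers")
    return "[" + ", ".join(repr(c) for c in components) + "]"

# Vector for the first sample row
literal = to_cql_vector_literal([0.1, 0.15, 0.3, 0.12, 0.05])
print(literal)  # prints [0.1, 0.15, 0.3, 0.12, 0.05]
```

In application code, passing values as bound parameters through your driver is generally preferable to building statement text by hand. The string form here is only meant to mirror the literals typed into CQLSH.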
Query vector data with CQL
To query data using vector search, use a SELECT query:
SELECT * FROM default_keyspace.products
ORDER BY item_vector ANN OF [0.15, 0.1, 0.1, 0.35, 0.55]
LIMIT 1;
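To see which row this query is expected to return, you can score the three sample rows by hand. The Python sketch below is an exact, brute-force comparison over the sample data, assuming the index uses cosine similarity. It is not how the database executes an ANN search, which is approximate and index-based. The rows dictionary and the cosine helper are illustrative names only.

```python
import math

# Sample rows copied from the INSERT statements: id -> item_vector
rows = {
    1: [0.1, 0.15, 0.3, 0.12, 0.05],
    2: [0.45, 0.09, 0.01, 0.2, 0.11],
    5: [0.1, 0.05, 0.08, 0.3, 0.6],
}
query = [0.15, 0.1, 0.1, 0.35, 0.55]

def cosine(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Score every row against the query and keep the highest-scoring id
scores = {row_id: cosine(vec, query) for row_id, vec in rows.items()}
best_id = max(scores, key=scores.get)
print(best_id)  # prints 5
```

With this data, the row with id 5 ('Vision Vector Frame') is the closest match, so it is the row you would expect LIMIT 1 to return.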
Calculate the similarity
You can calculate the similarity score of the best-scoring row in a table as part of a vector search query. The score shows how closely the row matches the search vector, which helps applications that depend on relevance decide whether a result is close enough to use.
The supported functions for this type of query are similarity_dot_product, similarity_cosine, and similarity_euclidean.
Each function takes two parameters, VECTOR_COLUMN and EMBEDDING_VALUE, both of which represent vectors.
Use a SELECT query to find the row that is most similar to the vector in the search query.
SELECT description, similarity_cosine(item_vector, [0.1, 0.15, 0.3, 0.12, 0.05])
FROM default_keyspace.products
ORDER BY item_vector ANN OF [0.1, 0.15, 0.3, 0.12, 0.05]
LIMIT 1;
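For intuition about what the three functions measure, the Python sketch below computes the plain mathematical dot product, cosine, and Euclidean distance for the vector used in this query against the item_vector of the row with id 1. The function names are illustrative. The values the CQL functions return may be scaled differently from these raw definitions, so treat the numbers as a guide to relative closeness rather than as the database's output.

```python
import math

def dot_product(a, b):
    """Sum of the component-wise products."""
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Dot product divided by the product of the vector lengths."""
    return dot_product(a, b) / (
        math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b))
    )

def euclidean_distance(a, b):
    """Straight-line distance between the two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

row_vector = [0.1, 0.15, 0.3, 0.12, 0.05]    # item_vector of the row with id 1
query_vector = [0.1, 0.15, 0.3, 0.12, 0.05]  # vector used in the query

cos = cosine(row_vector, query_vector)
dot = dot_product(row_vector, query_vector)
dist = euclidean_distance(row_vector, query_vector)
print(cos, dot, dist)
```

Because the query vector is identical to the stored vector, the cosine is 1 (up to floating-point rounding) and the Euclidean distance is 0, which is the closest possible match under both measures.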
What’s next?
Learn how to filter your vector search by specific terms. For more information, see Use analyzers with CQL.
See also
For more about CQL, see these topics: