Query vector data with CQL
You can use the CQL vector data type to enable vector search on a table.
You can use CQL to create a schema and an index, load vector data into your database, and then perform a vector search.
Prerequisites
To get started with CQL for Astra DB, see Connect to the CQL shell and the Cassandra Query Language (CQL) for Astra DB quickstart.
Create the vector schema
- In the CQL shell, select the keyspace where you want to store vector data. The following example uses default_keyspace:
USE default_keyspace;
- Create a new table in your keyspace that has at least one vector column. The following example creates a table with four columns. The vector column has a dimensionality of 5, which means it stores five-dimensional vector embeddings:
CREATE TABLE IF NOT EXISTS default_keyspace.products (
  id INT PRIMARY KEY,
  name TEXT,
  description TEXT,
  item_vector VECTOR<FLOAT, 5>
);
- Create a vector index:
CREATE INDEX IF NOT EXISTS ann_index ON default_keyspace.products(item_vector) WITH OPTIONS = {'source_model': 'other'};
The source_model option configures the index with the fastest settings for a given embeddings model. The available options are openai-v3-large, openai-v3-small, ada002, gecko, nv-qa-4, cohere-v3, bert, and other. The default is other.
Alternatively, you can base the index on a similarity metric:
CREATE INDEX IF NOT EXISTS ann_index ON default_keyspace.products(item_vector) WITH OPTIONS = {'similarity_function': 'DOT_PRODUCT'};
Valid values for similarity_function are COSINE (default), DOT_PRODUCT, and EUCLIDEAN. If you specify a source_model, you don’t need to include a similarity_function.
To change index settings, you must drop and rebuild the index. For more information about indexes, see Storage Attached Indexing overview.
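Because index settings cannot be changed in place, switching to a different option means dropping the index and creating it again. The following is a minimal sketch that reuses the ann_index name and products table from the examples above. The choice of EUCLIDEAN is only an illustration, not a recommendation.

```cql
-- Remove the existing index so that it can be recreated with new options.
DROP INDEX IF EXISTS default_keyspace.ann_index;

-- Recreate the index with a different similarity metric (illustrative choice).
CREATE INDEX IF NOT EXISTS ann_index
  ON default_keyspace.products(item_vector)
  WITH OPTIONS = {'similarity_function': 'EUCLIDEAN'};
```

Vector searches on item_vector depend on this index, so expect them to be unavailable between the two statements.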
Load the data into the database
Insert data with embeddings:
INSERT INTO default_keyspace.products (id, name, description, item_vector) VALUES
(
1, // id
'Coded Cleats', // name
'Chat bot integrated sneakers that talk to you', // description
[0.1, 0.15, 0.3, 0.12, 0.05] // item_vector
);
INSERT INTO default_keyspace.products (id, name, description, item_vector) VALUES
(
2,
'Logic Layers',
'An AI quilt to help you sleep forever',
[0.45, 0.09, 0.01, 0.2, 0.11]
);
INSERT INTO default_keyspace.products (id, name, description, item_vector) VALUES
(
5,
'Vision Vector Frame',
'A deep learning display that controls your mood',
[0.1, 0.05, 0.08, 0.3, 0.6]
);
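To confirm that the rows and their embeddings were stored, you can read the table back before running a vector search. This is a simple sanity check, assuming the three inserts above succeeded:

```cql
-- Read back the inserted rows to check the stored vectors.
SELECT id, name, item_vector
FROM default_keyspace.products;
```

Each item_vector value should contain five elements, matching the VECTOR<FLOAT, 5> column definition.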
Query vector data with CQL
Use a SELECT statement to perform a vector search on your table:
SELECT * FROM default_keyspace.products
ORDER BY item_vector ANN OF [0.15, 0.1, 0.1, 0.35, 0.55]
LIMIT 1;
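The ORDER BY ... ANN OF clause orders rows by their approximate nearness to the query vector, and LIMIT caps how many neighbors are returned. As a sketch, the following variation on the query above returns the two nearest rows and selects only the columns needed to inspect the matches:

```cql
-- Same query vector as above, but return the two nearest rows.
SELECT id, name, description
FROM default_keyspace.products
ORDER BY item_vector ANN OF [0.15, 0.1, 0.1, 0.35, 0.55]
LIMIT 2;
```

The query vector must have the same number of dimensions as the item_vector column, which is five in this example.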
Calculate the similarity
You can use a SELECT statement to calculate the similarity score of the matching row returned by the vector search.
Applications can use this score to rank or filter matches, which helps them return more tailored and accurate results.
The supported functions for this type of query are similarity_dot_product, similarity_cosine, and similarity_euclidean.
The similarity function you use depends on your embeddings model.
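For reference, these are the standard textbook measures that the three function names refer to, for vectors a and b with n dimensions. This page does not state how the CQL functions scale their results, so treat these as the underlying quantities rather than the exact values that a query returns:

```latex
\text{dot}(a, b) = \sum_{i=1}^{n} a_i b_i
\qquad
\cos(a, b) = \frac{\sum_{i=1}^{n} a_i b_i}{\lVert a \rVert \, \lVert b \rVert}
\qquad
d(a, b) = \sqrt{\sum_{i=1}^{n} (a_i - b_i)^2}
```

Dot product and cosine grow as vectors become more alike, whereas Euclidean distance shrinks.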
Use a SELECT query to find the row that is most similar to the query vector, and then calculate the similarity score between the matching row’s vector and the query vector:
SELECT description, similarity_cosine(item_vector, [0.1, 0.15, 0.3, 0.12, 0.05])
FROM default_keyspace.products
ORDER BY item_vector ANN OF [0.1, 0.15, 0.3, 0.12, 0.05]
LIMIT 1;
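To see how the three supported functions score the same rows, you can select them side by side. This is a sketch only. The column aliases (cosine_score, dot_product_score, euclidean_score) are illustrative names that do not come from the product documentation.

```cql
-- Score each returned row with all three similarity functions.
SELECT name,
       similarity_cosine(item_vector, [0.1, 0.15, 0.3, 0.12, 0.05]) AS cosine_score,
       similarity_dot_product(item_vector, [0.1, 0.15, 0.3, 0.12, 0.05]) AS dot_product_score,
       similarity_euclidean(item_vector, [0.1, 0.15, 0.3, 0.12, 0.05]) AS euclidean_score
FROM default_keyspace.products
ORDER BY item_vector ANN OF [0.1, 0.15, 0.3, 0.12, 0.05]
LIMIT 3;
```

The selected functions only compute the scores shown in the result. The order of the returned rows is still determined by the ANN search on the index.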
Next steps
To use a standard query filter and vector search together, see Use analyzers with CQL.