Vector search quickstart

Vector search is a foundational use case for vector databases.

This guide introduces vector search, explains how vector search works, and demonstrates how to perform a vector search with CQL.

A vector database is optimized for performing similarity search, also called vector search. In a vector search, the database compares high-dimensional vector embeddings to retrieve items most similar to a query vector.

Here’s how vector search works:

  1. Generate vector embeddings for your data, and then load the data, along with the embeddings, into a table.

    Vector embeddings are stored in a column of type vector alongside related non-vector data, which is also known as metadata. For more information, see [chunking].

  2. Generate an embedding for a new piece of content outside the original collection. For example, you could generate an embedding from a text query submitted by a user to a chatbot.

    Make sure you use the same embedding model for your original embeddings and your new embedding.

  3. Use the new embedding to run a vector search on the collection and find data that is most similar to the new content.

    Mechanically, a vector search determines the similarity between a query vector and the vectors of the documents in a collection. Each document’s resulting similarity score represents the closeness of the query vector and the document’s vector.

  4. Use the returned content to produce a response or trigger an action in an LLM or GenAI application. For example, a support chatbot could use the result of a vector search to generate an answer to a user’s question.
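The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not how the database implements search: it compares the query with every row by brute force, and the embedding values and the `vector_search` helper are made up for the example.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Step 1: rows stored with precomputed embeddings (toy values, not real model output).
rows = [
    {"comment": "Raining too hard should have postponed", "vector": [0.45, 0.09, 0.01, 0.2, 0.11]},
    {"comment": "Second rest stop was out of water", "vector": [0.99, 0.5, 0.99, 0.1, 0.34]},
    {"comment": "Last climb was a killer", "vector": [0.3, 0.75, 0.2, 0.2, 0.5]},
]


def vector_search(query_vector, rows, limit):
    """Step 3: score every row against the query and keep the closest ones."""
    scored = [(cosine_similarity(query_vector, row["vector"]), row["comment"]) for row in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:limit]


# Step 2: embedding of the new content (hypothetical values from the same "model").
query_vector = [0.4, 0.1, 0.0, 0.2, 0.1]

# Step 4: pass the most similar rows on to the application.
top_matches = vector_search(query_vector, rows, limit=2)
for score, comment in top_matches:
    print(f"{score:.3f}  {comment}")
```

In a real application, an embedding model produces both the stored vectors and the query vector, and the database's index replaces the brute-force loop.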

While vectors can replace or augment some functions of metadata filters, vectors are not a replacement for other data types. Vector search can be powerful, particularly when combined with keyword filters, but it is important to be aware of its limitations:

  • Vectors aren’t human-readable. They must be interpreted by an LLM, and then transformed into a human-readable response.

  • Vector search isn’t meant to directly retrieve specific data. By design, vector search finds similar data.

    For example, assume you have a database for your customer accounts. If you want to retrieve data for a single, specific customer, it is more appropriate to use a filter to exactly match the customer’s ID instead of a vector search.

  • Vector search is a mathematical approximation. By design, vector search uses mathematical calculations to find data that is mathematically similar to your query vector, but this data may not be the most contextually relevant from a human perspective.

    For example, assume you have a database for a department store inventory. A vector search for green hiking boots could return a mix of hiking boots, other types of boots, and other hiking gear.

    Use metadata filters to narrow the set of candidate rows and potentially improve the relevance of vector search results. For example, you can improve the hiking boots vector search by including a metadata filter like productType: "shoes".

  • The embedding model matters. It’s important to choose an embedding model that is ideal for your data, your queries, and your performance requirements. Embedding models exist for different data types (such as text, images, or audio), languages, use cases, and more.

    Using an inappropriate embedding model can lead to inaccurate or unexpected results from vector searches. For example, if your dataset is in Spanish, and you choose an English language embedding model, then your vector search results could be inaccurate because the embedding model attempts to parse the Spanish text in the context of the English words that it was trained on.

    Additionally, you must use the same model for your stored vectors and query vectors because mismatched embeddings can cause a vector search to fail or produce inaccurate results.
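To make the hiking boots example concrete, here is a small sketch of how a metadata filter changes what a vector search can return. The product names, the three-dimensional vectors, and the `search` helper are all hypothetical. The filter runs first, and only the remaining rows are ranked by similarity.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


# Toy inventory: vectors are invented for illustration only.
products = [
    {"name": "green hiking boots", "productType": "shoes", "vector": [0.9, 0.8, 0.1]},
    {"name": "brown hiking boots", "productType": "shoes", "vector": [0.7, 0.9, 0.1]},
    {"name": "green hiking backpack", "productType": "bags", "vector": [0.9, 0.6, 0.5]},
    {"name": "green rain jacket", "productType": "outerwear", "vector": [0.8, 0.3, 0.7]},
]


def search(query_vector, items, limit, product_type=None):
    """Optionally filter on exact metadata, then rank the survivors by similarity."""
    candidates = [
        item for item in items
        if product_type is None or item["productType"] == product_type
    ]
    ranked = sorted(
        candidates,
        key=lambda item: cosine_similarity(query_vector, item["vector"]),
        reverse=True,
    )
    return [item["name"] for item in ranked[:limit]]


query = [0.9, 0.7, 0.3]  # stand-in for an embedding of "green hiking boots"

unfiltered = search(query, products, limit=3)
filtered = search(query, products, limit=3, product_type="shoes")
print("unfiltered:", unfiltered)
print("filtered:  ", filtered)
```

Without the filter, the backpack ranks among the top results because its vector is close to the query. With the filter, only shoes can be returned.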

Approximate nearest neighbor

Cassandra-based databases support only approximate nearest neighbor (ANN) vector searches, not exact k-nearest neighbor (KNN) searches.

ANN search finds the most similar content within reason for efficiency, but it might not find the exact most similar match. Although precise, KNN is resource intensive, and it is not practical for applications that need quick responses from large datasets. ANN balances accuracy and performance, making it a better choice for applications that query large datasets or need to respond quickly.
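The trade-off can be illustrated with a toy comparison. The sketch below contrasts an exact search, which scans every vector, with a deliberately crude approximate search that scans only one partition of the data. The partitioning scheme is an assumption chosen for brevity. It is not the graph-based approach that Cassandra uses, and the data is random.

```python
import math
import random


def euclidean(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def exact_knn(query, vectors, k):
    """Exact search: compare the query with every vector."""
    ranked = sorted(range(len(vectors)), key=lambda i: euclidean(query, vectors[i]))
    return ranked[:k]


def build_partitions(vectors, centroids):
    """Assign each vector index to the partition of its nearest centroid."""
    partitions = {c: [] for c in range(len(centroids))}
    for i, vector in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: euclidean(vector, centroids[c]))
        partitions[nearest].append(i)
    return partitions


def approximate_knn(query, vectors, centroids, partitions, k):
    """Approximate search: scan only the partition nearest to the query."""
    nearest = min(range(len(centroids)), key=lambda c: euclidean(query, centroids[c]))
    candidates = partitions[nearest]
    ranked = sorted(candidates, key=lambda i: euclidean(query, vectors[i]))
    return ranked[:k], len(candidates)


rng = random.Random(42)  # fixed seed so the run is repeatable
vectors = [[rng.random() for _ in range(5)] for _ in range(1000)]
centroids = vectors[:8]  # crude choice: reuse the first 8 vectors as centroids
partitions = build_partitions(vectors, centroids)
query = [rng.random() for _ in range(5)]

exact = exact_knn(query, vectors, k=10)
approx, scanned = approximate_knn(query, vectors, centroids, partitions, k=10)
recall = len(set(exact) & set(approx)) / len(exact)
print(f"scanned {scanned} of {len(vectors)} vectors, recall@10 = {recall:.2f}")
```

The approximate search does less work per query, but it can miss true nearest neighbors that fall in a partition it never scans. Recall measures how many of the exact results it still finds.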

To learn more about ANN and KNN, see What is the k-nearest neighbors (KNN) algorithm.

Indexing

Databases based on Apache Cassandra® provide numeric-, text-, and vector-based indexes to support different kinds of searches. You can customize indexes based on your requirements, such as specific similarity functions or text transformations.

Storage-Attached Indexing (SAI) is a highly scalable, globally distributed index for Cassandra databases. With SAI, searches can efficiently find rows that satisfy query predicates. It is ideal for large datasets, particularly those that must support vector search.

When you run a search, SAI loads a superset of all possible results from storage based on the predicates you provide. SAI evaluates the search criteria, sorts the results by vector similarity, and then returns the top results, up to the LIMIT you specify.

SAI uses the JVector approximate nearest neighbor (ANN) search algorithm for similarity search. By design, ANN prioritizes speed and efficiency over exact accuracy. Taking inspiration from DiskANN, JVector balances speed and accuracy by creating a hierarchy of navigable graph indexes. All data points, or nodes, on the graph can find a path to any other node.
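To build intuition for how a search moves through a navigable graph, here is a much-simplified sketch. It is not JVector's algorithm. It assumes a single-layer graph in which each node links to its `m` nearest neighbors, and it uses a plain greedy walk that stops at the first node with no closer neighbor. All names and data are illustrative.

```python
import math
import random


def euclidean(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_graph(vectors, m):
    """Link every node to its m nearest neighbors (brute force, fine for a demo)."""
    graph = {}
    for i, vector in enumerate(vectors):
        others = sorted(
            (j for j in range(len(vectors)) if j != i),
            key=lambda j: euclidean(vector, vectors[j]),
        )
        graph[i] = others[:m]
    return graph


def greedy_search(query, vectors, graph, entry):
    """Walk from the entry node to ever-closer neighbors until none is closer."""
    current = entry
    current_dist = euclidean(query, vectors[current])
    hops = 0
    while True:
        best = min(graph[current], key=lambda j: euclidean(query, vectors[j]))
        best_dist = euclidean(query, vectors[best])
        if best_dist >= current_dist:
            return current, hops  # local minimum: no neighbor improves on current
        current, current_dist = best, best_dist
        hops += 1


rng = random.Random(7)  # fixed seed so the run is repeatable
vectors = [[rng.random() for _ in range(5)] for _ in range(200)]
graph = build_graph(vectors, m=8)
query = [rng.random() for _ in range(5)]

found, hops = greedy_search(query, vectors, graph, entry=0)
print(f"reached node {found} after {hops} hops")
```

The walk visits only a handful of nodes instead of all 200, which is where the speed comes from. Because it can stop at a local minimum, the result is approximate, and production indexes add layers and wider candidate lists to reduce that risk.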

When you insert data, JVector adds those new documents to the graph immediately, so you can efficiently search right away.

To save space and improve performance, JVector can compress vectors with quantization.
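As a rough illustration of what quantization trades away, the following sketch applies simple scalar quantization, storing each value as one byte instead of a 4-byte float. This guide doesn't describe JVector's actual compression scheme, so treat the technique, the value range, and the function names here as assumptions for illustration.

```python
def quantize(vector, lo=0.0, hi=1.0):
    """Map each value in [lo, hi] to an integer code from 0 to 255."""
    scale = (hi - lo) / 255
    codes = []
    for value in vector:
        clamped = min(max(value, lo), hi)  # keep out-of-range values encodable
        codes.append(round((clamped - lo) / scale))
    return bytes(codes)


def dequantize(codes, lo=0.0, hi=1.0):
    """Recover approximate values from the one-byte codes."""
    scale = (hi - lo) / 255
    return [lo + code * scale for code in codes]


original = [0.45, 0.09, 0.01, 0.2, 0.11]
codes = quantize(original)
restored = dequantize(codes)
max_error = max(abs(a - b) for a, b in zip(original, restored))

print(f"{len(original) * 4} bytes as float32 -> {len(codes)} bytes quantized")
print(f"largest per-value error: {max_error:.5f}")
```

The compressed vector is a quarter of the size, at the cost of a small, bounded error in each value. That error is one reason compressed indexes return approximate rather than exact results.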

For more information, see SAI performance and use cases.

Run a vector search with CQL

To run a vector search with CQL, you need to prepare your database for vector search and then run a vector search query.

Vector search requires the vector data type, which is available for the following databases:

  • Astra DB

  • Hyper-Converged Database (HCD)

  • DSE 6.9 or later

  • Cassandra 5.0 or later.

    1. Use an existing keyspace or create a keyspace for this quickstart. If you want to follow along with the examples, name your keyspace cycling.

      • HCD, DSE, or Apache Cassandra

        Use the CREATE KEYSPACE command to create a keyspace named cycling:

        CREATE KEYSPACE IF NOT EXISTS cycling
        WITH REPLICATION = {
          'class' : 'SimpleStrategy',
          'replication_factor' : 1
        };

      • Astra DB

        CQL for Astra DB doesn’t support the CREATE KEYSPACE command. Use the Astra Portal or the Astra DB DevOps API to create a keyspace named cycling.

        After creating your keyspace, launch cqlsh and connect to your database. For instructions, see Connect to the CQL shell for Astra DB.

    2. Select the keyspace that you want to use for this quickstart:

        USE cycling;
    3. Create a table called comments_vs to store the demo data for this quickstart.

      Vector data is stored alongside related non-vector data, which is also known as metadata. The vector embeddings are stored in a column of type vector. For more information, see [embeddings].

      CREATE TABLE IF NOT EXISTS cycling.comments_vs (
        record_id timeuuid,
        id uuid,
        commenter text,
        comment text,
        comment_vector VECTOR <FLOAT, 5>,
        created_at timestamp,
        PRIMARY KEY (id, created_at)
      )
      WITH CLUSTERING ORDER BY (created_at DESC);

      If you have an existing table that doesn't have a vector column yet, you can add one with ALTER TABLE instead:

      ALTER TABLE cycling.comments_vs
        ADD comment_vector VECTOR <FLOAT, 5>;

      In this example, the vector uses the float data type and specifies a dimension of 5 to store the embeddings. In Apache Cassandra 5.0 and later, the vector data type is a built-in type that supports vectors of type float as well as vectors of arbitrary subtype.
    4. Index the vector column by creating a custom index with Storage-Attached Indexing (SAI). For this example, the index is named comment_ann_idx.

      CREATE CUSTOM INDEX comment_ann_idx ON cycling.comments_vs(comment_vector) 
        USING 'StorageAttachedIndex';

      For more information, see Indexing.

    5. Insert vector and non-vector data into the table:

      INSERT INTO cycling.comments_vs (record_id, id, commenter, comment, created_at, comment_vector) VALUES (now(),e7ae5cf3-d358-4d99-b900-85902fda9bb0, 'Alex','Raining too hard should have postponed','2017-02-14 12:43:20-0800',[0.45, 0.09, 0.01, 0.2, 0.11]);
      INSERT INTO cycling.comments_vs (record_id, id, commenter, comment, created_at, comment_vector) VALUES (now(),e7ae5cf3-d358-4d99-b900-85902fda9bb0,'Alex','Second rest stop was out of water','2017-03-21 13:11:09.999-0800',[0.99, 0.5, 0.99, 0.1, 0.34]);
      INSERT INTO cycling.comments_vs (record_id, id, commenter, comment, created_at, comment_vector) VALUES (now(),e7ae5cf3-d358-4d99-b900-85902fda9bb0,'Alex','LATE RIDERS SHOULD NOT DELAY THE START','2017-04-01 06:33:02.16-0800',[0.9, 0.54, 0.12, 0.1, 0.95]);
      INSERT INTO cycling.comments_vs (record_id, id, commenter, comment, created_at, comment_vector) VALUES (now(),c7fceba0-c141-4207-9494-a29f9809de6f,'Amy','The gift certificate for winning was the best',totimestamp(now()),[0.13, 0.8, 0.35, 0.17, 0.03]);
      INSERT INTO cycling.comments_vs (record_id, id, commenter, comment, created_at, comment_vector) VALUES (now(),c7fceba0-c141-7207-9494-a29f9809de6f,'Amy','The <B>gift certificate</B> for winning was the best',totimestamp(now()),[0.13, 0.8, 0.35, 0.17, 0.03]);
      INSERT INTO cycling.comments_vs (record_id, id, commenter, comment, created_at, comment_vector) VALUES (now(),c7fceba0-c141-4207-9494-a29f9809de6f,'Amy','Glad you ran the race in the rain','2017-02-17 12:43:20.234+0400',[0.3, 0.34, 0.2, 0.78, 0.25]);
      INSERT INTO cycling.comments_vs (record_id, id, commenter, comment, created_at, comment_vector) VALUES (now(),c7fceba0-c141-4207-9594-a29f9809de6f,'Jane','Boy, was it a drizzle out there!','2017-02-17 12:43:20.234+0400',[0.3, 0.34, 0.2, 0.78, 0.25]);
      INSERT INTO cycling.comments_vs (record_id, id, commenter, comment, created_at, comment_vector) VALUES (now(), c7fceba0-c141-3207-9494-a29f9809de6f,'Amy','THE RACE WAS FABULOUS!','2017-02-17 12:43:20.234+0400',[0.3, 0.34, 0.2, 0.78, 0.25]);
      INSERT INTO cycling.comments_vs (record_id, id, commenter, comment, created_at, comment_vector) VALUES (now(),c7fceba0-c141-4207-9494-a29f9809de6f, 'Amy','Great snacks at all reststops','2017-03-22 5:16:59.001+0400',[0.1, 0.4, 0.1, 0.52, 0.09]);
      INSERT INTO cycling.comments_vs (record_id, id, commenter, comment, created_at, comment_vector) VALUES (now(),c7fceba0-c141-4207-9494-a29f9809de6f,'Amy','Last climb was a killer','2017-04-01 17:43:08.030+0400',[0.3, 0.75, 0.2, 0.2, 0.5]);
      INSERT INTO cycling.comments_vs (record_id, id, commenter, comment, created_at, comment_vector) VALUES (now(),e8ae5cf3-d358-4d99-b900-85902fda9bb0,'John','rain, rain,rain, go away!','2017-04-01 06:33:02.16-0800',[0.9, 0.54, 0.12, 0.1, 0.95]);
      INSERT INTO cycling.comments_vs (record_id, id, commenter, comment, created_at, comment_vector) VALUES (now(),e8ae5df3-d358-4d99-b900-85902fda9bb0,'Jane','Rain like a monsoon','2017-04-01 06:33:02.16-0800',[0.9, 0.54, 0.12, 0.1, 0.95]);

      Note the format of the vector data type. Vector data must be stored in a valid format so that it can be indexed and searched correctly. Additionally, your embeddings must all originate from the same embedding model and match the dimensionality of your vector index. If embeddings originate from different models, the vector search won’t represent an accurate comparison.

      This example uses randomly generated embeddings to demonstrate the vector search functionality. In a production scenario, you would produce embeddings specifically for your data and your search query.

      For more information, see [embeddings].
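Before inserting embeddings, it can help to check them on the client side. The following sketch validates an embedding against the 5-dimension vector column used in this quickstart and formats it as a bracketed vector literal like the ones in the INSERT statements. The helper name `to_cql_vector_literal` is hypothetical and not part of any driver.

```python
import math


def to_cql_vector_literal(embedding, dimension):
    """Validate an embedding and format it as a bracketed vector literal."""
    if len(embedding) != dimension:
        raise ValueError(f"expected {dimension} values, got {len(embedding)}")
    values = []
    for value in embedding:
        number = float(value)
        if not math.isfinite(number):
            raise ValueError("embedding values must be finite numbers")
        values.append(repr(number))
    return "[" + ", ".join(values) + "]"


literal = to_cql_vector_literal([0.45, 0.09, 0.01, 0.2, 0.11], dimension=5)
print(literal)

try:
    to_cql_vector_literal([0.45, 0.09], dimension=5)
except ValueError as error:
    print("rejected:", error)
```

If you use a driver, prefer bound parameters over string formatting for values that come from users. The check on dimension and finiteness applies either way.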

Run vector search queries

Vector search works optimally on tables with no overwrites or deletions of the vector column. For a vector column with changes, expect slower search results.

Vector search uses approximate nearest neighbor (ANN) search, which in most cases returns results almost as good as an exact k-nearest neighbor (KNN) search while scaling far better. Least-similar searches are not supported. For more information, see How vector search works.

  1. Use a SELECT query to run a standard vector search:

    SELECT * FROM cycling.comments_vs 
      ORDER BY comment_vector ANN OF [0.15, 0.1, 0.1, 0.35, 0.55] 
      LIMIT 3;

    The results include the rows that are most similar to the given query vector, up to the LIMIT value, which is 3 in this example. A vector search can return at most 1,000 rows.

    Results
     id                                   | created_at                      | comment                                | comment_vector                                    | commenter | record_id
    --------------------------------------+---------------------------------+----------------------------------------+---------------------------------------------------+-----------+--------------------------------------
     e8ae5cf3-d358-4d99-b900-85902fda9bb0 | 2017-04-01 14:33:02.160000+0000 |              rain, rain,rain, go away! | b'?fff?\\n=q=\\xf5\\xc2\\x8f=\\xcc\\xcc\\xcd?s33' |      John | 3d42f050-37da-11ef-81ed-f92c3c7170c3
     e7ae5cf3-d358-4d99-b900-85902fda9bb0 | 2017-04-01 14:33:02.160000+0000 | LATE RIDERS SHOULD NOT DELAY THE START | b'?fff?\\n=q=\\xf5\\xc2\\x8f=\\xcc\\xcc\\xcd?s33' |      Alex | 3d3d9921-37da-11ef-81ed-f92c3c7170c3
     e8ae5df3-d358-4d99-b900-85902fda9bb0 | 2017-04-01 14:33:02.160000+0000 |                    Rain like a monsoon | b'?fff?\\n=q=\\xf5\\xc2\\x8f=\\xcc\\xcc\\xcd?s33' |      Jane | 3d433e71-37da-11ef-81ed-f92c3c7170c3
    
    (3 rows)
  2. To include the similarity score in the results, use a modified SELECT query.

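    The supported functions for this type of query are similarity_dot_product, similarity_cosine, and similarity_euclidean. Each function takes two vector parameters: (<vector_column>, <embedding_value>). The score is computed against the vector that you pass to the function, while the row order is determined by the vector in the ANN OF clause. The following example passes a different vector to each. If you want the score to reflect the ranking, pass the same vector to both.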

    SELECT  comment, similarity_cosine(comment_vector, [0.2, 0.15, 0.3, 0.2, 0.05]) 
        FROM cycling.comments_vs
        ORDER BY comment_vector ANN OF [0.1, 0.15, 0.3, 0.12, 0.05] 
        LIMIT 3;
    Results
     comment                                | system.similarity_cosine(comment_vector, [0.2, 0.15, 0.3, 0.2, 0.05])
    ----------------------------------------+-----------------------------------------------------------------------
          Second rest stop was out of water |                                                              0.949701
                  rain, rain,rain, go away! |                                                              0.789776
     LATE RIDERS SHOULD NOT DELAY THE START |                                                              0.789776
    
    (3 rows)
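The scores in the sample output are consistent with a cosine value rescaled from the range [-1, 1] to [0, 1] as (1 + cos) / 2. The following sketch assumes that normalization, which is inferred from the output rather than stated in this guide, and recomputes two of the scores from the inserted vectors and the vector passed to similarity_cosine.

```python
import math


def cosine(a, b):
    """Raw cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def normalized_cosine_score(a, b):
    """Assumed normalization: rescale cosine from [-1, 1] to [0, 1]."""
    return (1 + cosine(a, b)) / 2


# Vector passed to similarity_cosine in the query above.
scoring_vector = [0.2, 0.15, 0.3, 0.2, 0.05]

# Stored vectors taken from the INSERT statements in this quickstart.
stored = {
    "Second rest stop was out of water": [0.99, 0.5, 0.99, 0.1, 0.34],
    "rain, rain,rain, go away!": [0.9, 0.54, 0.12, 0.1, 0.95],
}

scores = {
    comment: normalized_cosine_score(vector, scoring_vector)
    for comment, vector in stored.items()
}
for comment, score in scores.items():
    print(f"{score:.4f}  {comment}")
```

Small differences in the last digits are expected because this sketch uses double-precision arithmetic, while the stored vectors use single-precision floats.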

See also

CassIO for AI workloads

CassIO abstracts away the details of accessing the Cassandra database for the typical needs of generative artificial intelligence (AI) or other machine learning workloads. CassIO offers a low-boilerplate, ready-to-use set of tools for seamless integration of Cassandra in most AI-oriented applications.


© Copyright IBM Corporation 2026
