Create vector indexes

Vector search uses Storage-Attached Indexing (SAI) to index and search vector data. This page describes how to create tables with vector columns and the indexes required for vector search.

Vector data type

The CQL vector data type stores vector data. It supports vectors of 32-bit floating-point numbers with between 1 and 65,535 dimensions. This example creates a table with a vector column of 4 dimensions; production use cases typically use vectors with many more dimensions.

CREATE TABLE products (
    product_id UUID PRIMARY KEY,
    categories SET<TEXT>,
    name TEXT,
    price DECIMAL,
    description VECTOR<FLOAT, 4>
);
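Because the column stores 32-bit floats, values that an application computes as 64-bit floats are rounded when they are stored. The following Python sketch illustrates that rounding using only the standard library. The helper name to_float32 is hypothetical and is not part of any driver or database API.

```python
import struct

def to_float32(value: float) -> float:
    """Round a Python float (64-bit) to the nearest 32-bit float value."""
    # Pack as a 4-byte IEEE 754 single, then unpack back to a Python float.
    return struct.unpack("<f", struct.pack("<f", value))[0]

embedding = [0.12, 0.34, 0.56, 0.78]
stored = [to_float32(x) for x in embedding]

# Each stored value is close to, but generally not identical to, the input.
max_error = max(abs(a - b) for a, b in zip(embedding, stored))
print(stored)
print(max_error)
```

The difference is small, but it is worth keeping in mind when comparing values read back from the table with the values originally computed.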

Index a vector column

To enable vector search, you must create an SAI index on a vector column. The syntax for creating an SAI index on a vector column is the same as for other data types.

CREATE CUSTOM INDEX products_idx
  ON products (description) USING 'StorageAttachedIndex';

Choose a similarity function

You can choose a similarity function when you create an SAI index on a vector column. If you do not specify a similarity function, the default is cosine. Once selected, the similarity function cannot be changed without dropping and recreating the index. Similarity functions are also known as similarity metrics.

The supported similarity functions are:

cosine: Default metric. Calculates the cosine of the angle between two vectors.

dot_product: Compares vectors by calculating their dot product. More efficient than cosine for normalized vectors (vectors with magnitude 1). The dot_product metric may give incorrect results for non-normalized vectors.

euclidean: Calculates the Euclidean distance between two vectors.

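Use the WITH OPTIONS clause to specify a similarity function when you create an SAI index on a vector column. If an index named products_idx already exists on the table, drop it before running this statement, because the similarity function of an existing index cannot be changed.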

CREATE CUSTOM INDEX products_idx
  ON products (description) USING 'StorageAttachedIndex'
  WITH OPTIONS = {'similarity_function': 'dot_product'};
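To see why dot_product is only a safe substitute for cosine when vectors are normalized, the following Python sketch computes the three metrics by hand. It illustrates the underlying math only; it does not reproduce how the database computes or scales similarity scores, and the function names and sample vectors are made up for this illustration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def cosine(a, b):
    return dot(a, b) / (norm(a) * norm(b))

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def normalize(a):
    """Scale a vector to magnitude 1 without changing its direction."""
    n = norm(a)
    return [x / n for x in a]

query = [0.12, 0.34, 0.56, 0.78]
short = [0.1, 0.3, 0.5, 0.7]   # almost the same direction as query, small magnitude
long_ = [3.0, 0.0, 0.0, 3.0]   # different direction, large magnitude

# Raw dot product favors the long vector purely because of its magnitude.
print(dot(query, short), dot(query, long_))

# Cosine ignores magnitude and favors the vector pointing the same way.
print(cosine(query, short), cosine(query, long_))

# Euclidean distance is smaller for the closer vector.
print(euclidean(query, short), euclidean(query, long_))

# After normalization, the dot product matches cosine similarity.
print(dot(normalize(query), normalize(long_)))
```

With the raw vectors, the dot product and cosine disagree about which candidate is the better match. After both vectors are normalized, the dot product agrees with cosine while skipping the division by the magnitudes.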

Insert vector data

You can insert vector data using the CQL INSERT statement. Vector literals are comma-delimited lists of floating-point values enclosed in square brackets ([]). This example inserts four rows with vector data into the products table.

INSERT INTO products (product_id, categories, name, price, description)
  VALUES (uuid(), {'electronics', 'audio'}, 'Wireless Headphones', 79.99,
  [0.12, 0.34, 0.56, 0.78]);

INSERT INTO products (product_id, categories, name, price, description)
  VALUES (uuid(), {'electronics', 'gaming'}, 'Gaming Mouse', 49.99,
  [0.22, 0.18, 0.91, 0.44]);

INSERT INTO products (product_id, categories, name, price, description)
  VALUES (uuid(), {'home', 'kitchen'}, 'Chopping Board', 89.50,
  [0.05, 0.67, 0.33, 0.21]);

INSERT INTO products (product_id, categories, name, price, description)
  VALUES (uuid(), {'electronics', 'fitness', 'health'}, 'Heart Monitor', 25.00,
  [0.88, 0.12, 0.45, 0.66]);
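When the vector values come from application code, for example when generating a script of INSERT statements, the literal text can be produced programmatically. The following Python sketch formats a list of numbers as a vector literal and checks that the number of values matches the column's dimension count. The function to_vector_literal is a hypothetical helper written for this illustration, not part of any driver.

```python
def to_vector_literal(values, dimensions):
    """Format a sequence of numbers as a CQL vector literal such as [0.1, 0.2]."""
    if len(values) != dimensions:
        raise ValueError(f"expected {dimensions} values, got {len(values)}")
    floats = [float(v) for v in values]
    # repr() keeps enough digits to round-trip a Python float.
    return "[" + ", ".join(repr(v) for v in floats) + "]"

# The description column in the products table has 4 dimensions.
literal = to_vector_literal([0.12, 0.34, 0.56, 0.78], dimensions=4)
print(literal)  # [0.12, 0.34, 0.56, 0.78]
```

Checking the length before building the statement surfaces a dimension mismatch in the application rather than at insert time.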

