Use vector search with Cassandra Query Language (CQL)

This example demonstrates how to use vector search with Cassandra Query Language (CQL) in Hyper-Converged Database (HCD) 1.2.

Vector search supports machine learning workloads by comparing data in a database by similarity, even when no explicit relationship between the items is defined. A vector is an array of floating-point numbers that represents a specific object or entity.

The foundation of vector search lies in embeddings, which are compact representations of text as vectors of floating-point numbers. These embeddings are generated by feeding the text through an API, which uses a neural network to transform the input into a fixed-length vector. Embeddings capture the semantic meaning of the text, providing a more nuanced understanding than traditional term-based approaches. In this vector representation, inputs that are substantially similar produce output vectors that are geometrically close, and inputs that are not similar produce vectors that are farther apart.
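To make the idea of geometric closeness concrete, here is a minimal Python sketch that ranks a few toy vectors by cosine similarity to a query vector. Cosine similarity is only one common measure of closeness. The three-dimensional vectors, their labels, and the helper names `cosine_similarity` and `rank_by_similarity` are invented for illustration. Real embeddings come from an embedding model and usually have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings (invented values, not produced by a real model).
embeddings = {
    "road bike": [0.9, 0.1, 0.0],
    "racing bicycle": [0.8, 0.2, 0.1],
    "chocolate cake": [0.0, 0.1, 0.9],
}


def rank_by_similarity(query_vector, items):
    """Return item names sorted from most to least similar to the query."""
    return sorted(
        items,
        key=lambda name: cosine_similarity(query_vector, items[name]),
        reverse=True,
    )


# A query vector that points in roughly the same direction as the two bike entries.
query = [0.85, 0.15, 0.05]
ranking = rank_by_similarity(query, embeddings)
print(ranking)
```

In this sketch the two bike-related entries rank ahead of the unrelated one because their vectors point in nearly the same direction as the query. A vector search in the database performs a comparable nearest-neighbor ranking over stored vectors, typically using an index and not an exhaustive comparison.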

To support vector search, HCD provides a vector data type. For more information, see vector search examples in the CQL documentation.

To use vector search in the cloud, check out the Astra DB Serverless vector search quickstart.

Prerequisites

  • Install HCD

  • Install a terminal to run your client

  • Identify your credentials

Install HCD

Install HCD if you haven’t already. For exploration, use the Docker installation option.

Install a terminal to run your client

The clients can be tested by running them in a terminal. You’ll want Xterm, Terminal, or another terminal emulator.

Identify your credentials

In practice, you would run your contents through an embeddings generator, as well as the query you want to match. This example only shows the mechanics of using CQL to create vector search data objects.

Create vector search keyspace

Create a new vector search keyspace called cycling:

CREATE KEYSPACE IF NOT EXISTS cycling
  WITH REPLICATION = {'class': 'SimpleStrategy', 'replication_factor': 1};

CassIO (cassio.org) abstracts away the details of accessing the Cassandra database for the typical needs of generative artificial intelligence (AI) or other machine learning workloads. CassIO offers a low-boilerplate, ready-to-use set of tools for seamless integration of Cassandra in most AI-oriented applications.

For more information, see CassIO.


© 2025 DataStax, an IBM Company
