Java driver quickstart

DataStax recommends the Java client for Serverless (Vector) databases. Use the Java driver only if you are working with an existing application that previously used a CQL-based driver or if you plan to explicitly use CQL.

Review the Connection methods comparison page to determine the option that best suits your use case.

This quickstart walks through an end-to-end workflow with the Java driver: connect to your database, load a set of vector embeddings, and run a similarity search to find the vectors closest to the one in your query.

Prerequisites

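To complete this quickstart, you need an active Serverless (Vector) database and a Maven project. The code also reads two values from environment variables, so set both before you run it: ASTRA_DB_SECURE_BUNDLE_PATH, the path to your database's Secure Connect Bundle, and ASTRA_DB_APPLICATION_TOKEN, an application token for the database.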

Add dependencies

Add the Java driver dependency to your Maven pom.xml file. Version 4.17.0 or later is required to support Serverless (Vector) databases.

<dependency>
  <groupId>com.datastax.oss</groupId>
  <artifactId>java-driver-core</artifactId>
  <version>4.17.0</version>
</dependency>

Import libraries and connect to the database

Import the necessary Java libraries and establish a connection to the database using the CqlSession builder.

import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.cql.ResultSet;
import com.datastax.oss.driver.api.core.cql.Row;
import com.datastax.oss.driver.api.core.CqlSessionBuilder;
import com.datastax.oss.driver.api.core.cql.PreparedStatement;
import com.datastax.oss.driver.api.core.data.CqlVector;

import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

public class VectorTest {
    public static void main(String[] args) {
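        // Initialize the Java driver. The Secure Connect Bundle path and the
        // application token are read from environment variables.
        String keyspace = "default_keyspace";
        CqlSessionBuilder builder = CqlSession.builder();
        builder.withCloudSecureConnectBundle(Paths.get(System.getenv("ASTRA_DB_SECURE_BUNDLE_PATH")));
        // With token authentication, the username is the literal string "token".
        builder.withAuthCredentials("token", System.getenv("ASTRA_DB_APPLICATION_TOKEN"));
        builder.withKeyspace(keyspace);

        // try-with-resources closes the session when the block exits.
        try (CqlSession session = builder.build()) {
            // Number of dimensions in each embedding vector
            int v_dimension = 5;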
            // ...
        }
    }
}
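The connection code above passes the result of System.getenv straight to the builder. If ASTRA_DB_SECURE_BUNDLE_PATH is unset, Paths.get receives null and throws a NullPointerException that does not name the missing variable. The following is a minimal sketch of a fail-fast check you could run first. EnvCheck and require are hypothetical names used for illustration and are not part of the Java driver.

```java
import java.util.Map;

class EnvCheck {
    // Hypothetical helper, not part of the Java driver: returns the value of a
    // required setting, or fails with a message that names the missing variable.
    static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Environment variable " + name + " is not set");
        }
        return value;
    }

    public static void main(String[] args) {
        Map<String, String> env = System.getenv();
        try {
            String bundlePath = require(env, "ASTRA_DB_SECURE_BUNDLE_PATH");
            // The token is checked but never printed.
            require(env, "ASTRA_DB_APPLICATION_TOKEN");
            System.out.println("Connection settings found. Bundle path: " + bundlePath);
        } catch (IllegalStateException e) {
            System.err.println(e.getMessage());
        }
    }
}
```

The helper takes the environment as a Map so that it can be exercised without changing real environment variables.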

Create a table and vector-compatible Storage Attached Index (SAI)

Create a table with a vector column, then create an SAI on that column so that you can run similarity searches against it.

// ...
            session.execute(String.format(
                "CREATE TABLE IF NOT EXISTS vector_test (id INT PRIMARY KEY, " +
                "text TEXT, vector VECTOR<FLOAT,%d>);",
                v_dimension)
            );

            // No placeholders here, so the statement is passed as a plain string.
            session.execute(
                "CREATE CUSTOM INDEX IF NOT EXISTS idx_vector_test ON vector_test " +
                "(vector) USING 'StorageAttachedIndex' WITH OPTIONS = {'similarity_function' : 'cosine'};"
            );
// ...
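To see the exact CQL that the CREATE TABLE call sends, it can help to build the statement separately and print it. This sketch reuses the statement text from the snippet above and adds a guard against a non-positive dimension. VectorSchema and createTable are hypothetical names chosen for illustration.

```java
class VectorSchema {
    // Hypothetical helper: expands the %d placeholder with the vector dimension
    // and rejects values that cannot be a valid dimension.
    static String createTable(int dimension) {
        if (dimension <= 0) {
            throw new IllegalArgumentException("Vector dimension must be positive: " + dimension);
        }
        return String.format(
            "CREATE TABLE IF NOT EXISTS vector_test (id INT PRIMARY KEY, " +
            "text TEXT, vector VECTOR<FLOAT,%d>);",
            dimension);
    }

    public static void main(String[] args) {
        // Prints the statement with VECTOR<FLOAT,5> as the column type.
        System.out.println(createTable(5));
    }
}
```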

Load data

Insert sample rows with text and vector embeddings into the table.

// ...
            List<Object[]> textBlocks = Arrays.asList(
                new Object[]{1, "ChatGPT integrated sneakers that talk to you", CqlVector.newInstance(Arrays.asList(0.1f, 0.15f, 0.3f, 0.12f, 0.05f))},
                new Object[]{2, "An AI quilt to help you sleep forever", CqlVector.newInstance(Arrays.asList(0.45f, 0.09f, 0.01f, 0.2f, 0.11f))},
                new Object[]{3, "A deep learning display that controls your mood", CqlVector.newInstance(Arrays.asList(0.1f, 0.05f, 0.08f, 0.3f, 0.6f))}
            );

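            // Prepare the INSERT once, then bind each row's values to it.
            PreparedStatement ps = session.prepare(
                "INSERT INTO vector_test (id, text, vector) VALUES (?, ?, ?)"
            );
            for (Object[] block : textBlocks) {
                // Each array holds the id, text, and vector, in column order.
                session.execute(ps.bind(block));
            }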
// ...
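The vector column is declared with a fixed dimension (5 in this quickstart), so an embedding with a different number of components does not fit the column and the insert is expected to fail. A client-side check before binding gives a clearer error. This is a minimal sketch. EmbeddingCheck and requireDimension are hypothetical names, and the check is independent of the driver.

```java
import java.util.Arrays;
import java.util.List;

class EmbeddingCheck {
    // Hypothetical helper: returns the embedding unchanged if it has the
    // expected number of components, and throws otherwise.
    static List<Float> requireDimension(List<Float> embedding, int expected) {
        if (embedding.size() != expected) {
            throw new IllegalArgumentException(
                "Expected " + expected + " components but got " + embedding.size());
        }
        return embedding;
    }

    public static void main(String[] args) {
        int dimension = 5; // matches v_dimension in the quickstart
        List<Float> valid = Arrays.asList(0.1f, 0.15f, 0.3f, 0.12f, 0.05f);
        List<Float> tooShort = Arrays.asList(0.45f, 0.09f, 0.01f, 0.2f);

        requireDimension(valid, dimension);
        System.out.println("First embedding has the expected dimension.");

        try {
            requireDimension(tooShort, dimension);
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```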

Perform a similarity search

Find documents that are close to a specific vector embedding.

// ...
            // Return the two rows nearest to the query vector, with their similarity scores.
            String annQuery =
                "SELECT id, text, similarity_cosine(vector, [0.15, 0.1, 0.1, 0.35, 0.55]) as sim " +
                "FROM vector_test " +
                "ORDER BY vector ANN OF [0.15, 0.1, 0.1, 0.35, 0.55] LIMIT 2";

            ResultSet rs = session.execute(annQuery);
            for (Row row : rs) {
                System.out.printf("[%d] \"%s\" (sim: %.4f)%n", row.getInt("id"), row.getString("text"), row.getFloat("sim"));
            }
        }
    }
}
// ...
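To reason about which rows the search should return, you can rank the three sample vectors against the query vector by hand. The sketch below computes plain cosine similarity in memory and sorts by it. It is an exact brute-force comparison for illustration only, not how the database executes an ANN query, and the score reported by similarity_cosine may be scaled differently from the raw cosine value. BruteForceSearch, cosine, and nearestIds are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class BruteForceSearch {
    // Raw cosine similarity: dot(a, b) / (|a| * |b|).
    // Returns NaN if either vector has zero length.
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same dimension");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns up to `limit` row ids, most similar to the query first.
    static List<Integer> nearestIds(Map<Integer, float[]> rows, float[] query, int limit) {
        List<Integer> ids = new ArrayList<>(rows.keySet());
        ids.sort(Comparator.comparingDouble((Integer id) -> cosine(rows.get(id), query)).reversed());
        return new ArrayList<>(ids.subList(0, Math.min(limit, ids.size())));
    }

    public static void main(String[] args) {
        // The same ids and embeddings that the quickstart inserts.
        Map<Integer, float[]> rows = new LinkedHashMap<>();
        rows.put(1, new float[]{0.1f, 0.15f, 0.3f, 0.12f, 0.05f});
        rows.put(2, new float[]{0.45f, 0.09f, 0.01f, 0.2f, 0.11f});
        rows.put(3, new float[]{0.1f, 0.05f, 0.08f, 0.3f, 0.6f});
        float[] query = {0.15f, 0.1f, 0.1f, 0.35f, 0.55f};

        System.out.println(nearestIds(rows, query, 2)); // prints [3, 2]
    }
}
```

Row 3 points in nearly the same direction as the query vector, so it ranks first, followed by row 2.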

You have now added the necessary dependencies, connected the Java driver to your database, created a table with a vector-compatible SAI, loaded sample data, and performed a similarity search to find vectors that are close to the one in your query.

Resources

See the Apache Software Foundation Java Driver documentation for details about connectivity, query execution, API conventions, and other topics.


© 2024 DataStax
