Java driver quickstart
DataStax recommends using the Java client with HCD databases. Use the Java driver only if you are working with an existing application that previously used a CQL-based driver or if you plan to explicitly use CQL.
Review the Connection methods comparison page to determine the option that best suits your use case.
This quickstart walks through an end-to-end workflow with the Java driver: connect to your database, load a set of vector embeddings, and perform a similarity search to find vectors that are close to the one in your query.
Prerequisites
You need the following items to complete this quickstart:
- A running HCD database
- A current Java driver version (4.17.0 or later)
Add dependencies
Add the required Java driver dependency to your Maven pom.xml file.
Version 4.17.0 or later is required to support HCD databases.
<dependency>
  <groupId>com.datastax.oss</groupId>
  <artifactId>java-driver-core</artifactId>
  <version>4.17.0</version>
</dependency>
Import libraries and connect to the database
Import the necessary Java libraries and establish a connection to the database using the CqlSession builder.
import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.CqlSessionBuilder;
import com.datastax.oss.driver.api.core.cql.PreparedStatement;
import com.datastax.oss.driver.api.core.cql.ResultSet;
import com.datastax.oss.driver.api.core.cql.Row;
import com.datastax.oss.driver.api.core.data.CqlVector;
import java.util.Arrays;
import java.util.List;

public class DriverExample {
    public static void main(String[] args) {
        // Initialize the Java driver
        String keyspace = "default_keyspace";
        CqlSessionBuilder builder = CqlSession.builder();
        // Replace these placeholders with your database credentials
        builder.withAuthCredentials("user_name", "password");
        builder.withKeyspace(keyspace);
        // try-with-resources closes the session when the block exits
        try (CqlSession session = builder.build()) {
            // Number of dimensions in each vector embedding
            int v_dimension = 5;
            // The snippets in the following sections go here
        }
    }
}
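The connection example hard-codes its credentials, which is fine for a quickstart but awkward once the code leaves your machine. The following is a minimal sketch of reading them from environment variables instead. The Credentials helper and the variable names HCD_USERNAME and HCD_PASSWORD are hypothetical and are not part of the driver; substitute whatever names your deployment uses.

```java
import java.util.Map;

// Hypothetical helper (not part of the Java driver): resolves credentials
// from a map of environment variables.
final class Credentials {
    final String username;
    final String password;

    private Credentials(String username, String password) {
        this.username = username;
        this.password = password;
    }

    // Reads the two assumed variable names and fails fast if either is absent.
    static Credentials fromEnv(Map<String, String> env) {
        return new Credentials(require(env, "HCD_USERNAME"), require(env, "HCD_PASSWORD"));
    }

    private static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing environment variable: " + name);
        }
        return value;
    }

    public static void main(String[] args) {
        // Throws IllegalStateException if the variables are not set.
        Credentials c = Credentials.fromEnv(System.getenv());
        // The values would then replace the literals in the connection code:
        // builder.withAuthCredentials(c.username, c.password);
        System.out.println("Loaded credentials for user " + c.username);
    }
}
```

Taking the environment as a Map rather than calling System.getenv() inside the helper keeps the lookup easy to exercise with test values.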
Create a table and vector-compatible Storage Attached Index (SAI)
Define a new table that is compatible with vector data and create an SAI for efficient queries.
// Create a table whose vector column holds v_dimension floats
session.execute(String.format(
    "CREATE TABLE IF NOT EXISTS vector_test (id INT PRIMARY KEY, " +
    "text TEXT, vector VECTOR<FLOAT,%d>);",
    v_dimension)
);
// Index the vector column with an SAI that compares vectors by cosine similarity
session.execute(
    "CREATE CUSTOM INDEX IF NOT EXISTS idx_vector_test ON vector_test " +
    "(vector) USING 'StorageAttachedIndex' WITH OPTIONS = {'similarity_function' : 'cosine'};"
);
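If you want to reuse these statements for other tables or dimensions, the string building can be pulled into a small helper. The following is a sketch only: VectorSchema and its methods are hypothetical names, and the list of accepted similarity functions (cosine, dot_product, euclidean) reflects my understanding of SAI options and should be checked against your database version.

```java
import java.util.Set;

// Hypothetical helper that builds the CREATE TABLE and CREATE INDEX statements.
final class VectorSchema {
    // Assumed set of similarity functions accepted by a vector SAI.
    private static final Set<String> SIMILARITY_FUNCTIONS =
        Set.of("cosine", "dot_product", "euclidean");

    // Identifiers are interpolated into CQL, so only simple names are allowed.
    private static void requireIdentifier(String name) {
        if (name == null || !name.matches("[a-zA-Z][a-zA-Z0-9_]*")) {
            throw new IllegalArgumentException("Invalid identifier: " + name);
        }
    }

    static String createTable(String table, int dimension) {
        requireIdentifier(table);
        if (dimension <= 0) {
            throw new IllegalArgumentException("Dimension must be positive: " + dimension);
        }
        return String.format(
            "CREATE TABLE IF NOT EXISTS %s (id INT PRIMARY KEY, text TEXT, vector VECTOR<FLOAT,%d>);",
            table, dimension);
    }

    static String createIndex(String table, String similarityFunction) {
        requireIdentifier(table);
        if (!SIMILARITY_FUNCTIONS.contains(similarityFunction)) {
            throw new IllegalArgumentException("Unknown similarity function: " + similarityFunction);
        }
        return String.format(
            "CREATE CUSTOM INDEX IF NOT EXISTS idx_%s ON %s (vector) "
                + "USING 'StorageAttachedIndex' WITH OPTIONS = {'similarity_function' : '%s'};",
            table, table, similarityFunction);
    }

    public static void main(String[] args) {
        // Prints the same two statements used in this quickstart.
        System.out.println(createTable("vector_test", 5));
        System.out.println(createIndex("vector_test", "cosine"));
    }
}
```

Each returned string can be passed to session.execute() in place of the inline literals.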
Load data
Insert a few rows with text and embeddings into the table.
// Each entry holds the values for one row: id, text, and a 5-dimensional embedding
List<Object[]> textBlocks = Arrays.asList(
    new Object[]{1, "ChatGPT integrated sneakers that talk to you", CqlVector.newInstance(Arrays.asList(0.1f, 0.15f, 0.3f, 0.12f, 0.05f))},
    new Object[]{2, "An AI quilt to help you sleep forever", CqlVector.newInstance(Arrays.asList(0.45f, 0.09f, 0.01f, 0.2f, 0.11f))},
    new Object[]{3, "A deep learning display that controls your mood", CqlVector.newInstance(Arrays.asList(0.1f, 0.05f, 0.08f, 0.3f, 0.6f))}
);
// Prepare the statement once, then bind each row's values in column order
PreparedStatement ps = session.prepare(
    "INSERT INTO vector_test (id, text, vector) VALUES (?, ?, ?)"
);
for (Object[] block : textBlocks) {
    session.execute(ps.bind(block));
}
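Because the column is declared as VECTOR<FLOAT,5>, every embedding needs exactly five values. The following sketch shows one way to check embeddings on the client before binding them, so a malformed vector is reported with a clear message rather than surfacing as a failed write. EmbeddingCheck and requireDimension are hypothetical names introduced for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical client-side check run before binding an embedding to the INSERT.
final class EmbeddingCheck {
    // Throws if the embedding has the wrong length or contains unusable values.
    static void requireDimension(List<Float> embedding, int expected) {
        if (embedding == null || embedding.size() != expected) {
            int actual = embedding == null ? 0 : embedding.size();
            throw new IllegalArgumentException(
                "Expected " + expected + " dimensions but got " + actual);
        }
        for (Float value : embedding) {
            if (value == null || value.isNaN() || value.isInfinite()) {
                throw new IllegalArgumentException("Embedding contains an invalid value: " + value);
            }
        }
    }

    public static void main(String[] args) {
        int dimension = 5; // matches v_dimension in the quickstart
        List<List<Float>> embeddings = Arrays.asList(
            Arrays.asList(0.1f, 0.15f, 0.3f, 0.12f, 0.05f),
            Arrays.asList(0.45f, 0.09f, 0.01f, 0.2f, 0.11f),
            Arrays.asList(0.1f, 0.05f, 0.08f, 0.3f, 0.6f));
        for (List<Float> embedding : embeddings) {
            requireDimension(embedding, dimension);
        }
        System.out.println("All embeddings have " + dimension + " dimensions");
    }
}
```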
Perform a similarity search
Find the rows whose embeddings are closest to a specific query vector. The query orders rows by approximate nearest neighbor (ANN) distance and returns the cosine similarity of each match.
// Score each row against the query vector and return the two nearest neighbors
String annQuery =
    "SELECT id, text, similarity_cosine(vector, [0.15, 0.1, 0.1, 0.35, 0.55]) as sim " +
    "FROM vector_test " +
    "ORDER BY vector ANN OF [0.15, 0.1, 0.1, 0.35, 0.55] LIMIT 2";
ResultSet rs = session.execute(annQuery);
for (Row row : rs) {
    System.out.printf("[%d] \"%s\" (sim: %.4f)\n", row.getInt("id"), row.getString("text"), row.getFloat("sim"));
}
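To sanity-check which rows the search should return, you can compute plain cosine similarity for the three sample vectors on the client. The following is a brute-force sketch, not how the database evaluates the query: the SAI performs an approximate search, and the score returned by similarity_cosine may be scaled differently from the raw cosine computed here. CosineSketch and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Brute-force cosine ranking over the quickstart's sample data.
final class CosineSketch {
    // Plain cosine similarity: dot(a, b) / (|a| * |b|).
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns the ids of the k rows most similar to the query, best match first.
    static List<Integer> topK(float[] query, Map<Integer, float[]> rows, int k) {
        List<Integer> ids = new ArrayList<>(rows.keySet());
        ids.sort(Comparator.comparingDouble((Integer id) -> cosine(query, rows.get(id))).reversed());
        return new ArrayList<>(ids.subList(0, Math.min(k, ids.size())));
    }

    public static void main(String[] args) {
        Map<Integer, float[]> rows = new LinkedHashMap<>();
        rows.put(1, new float[]{0.1f, 0.15f, 0.3f, 0.12f, 0.05f});
        rows.put(2, new float[]{0.45f, 0.09f, 0.01f, 0.2f, 0.11f});
        rows.put(3, new float[]{0.1f, 0.05f, 0.08f, 0.3f, 0.6f});
        float[] query = {0.15f, 0.1f, 0.1f, 0.35f, 0.55f};

        for (int id : topK(query, rows, 2)) {
            System.out.printf("[%d] cosine: %.4f%n", id, cosine(query, rows.get(id)));
        }
    }
}
```

With the sample data, row 3 is the closest match and row 2 the second closest, so those are the two ids to expect from the LIMIT 2 query.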
You have now added the necessary dependencies, connected the Java driver to your database, created a table with a vector-compatible SAI, loaded sample data, and performed a similarity search to find vectors that are close to the one in your query.
Resources
See the Apache Software Foundation Java Driver documentation for details about connectivity, query execution, API conventions, and other topics.