Manual

Quick start

Here’s a short program that connects to DSE and executes a CQL query:

import com.datastax.driver.core.Row;
import com.datastax.driver.dse.DseCluster;
import com.datastax.driver.dse.DseSession;

DseCluster cluster = null;
try {
    cluster = DseCluster.builder()                                               // (1)
            .addContactPoint("127.0.0.1")
            .build();
    DseSession session = cluster.connect();                                      // (2)

    Row row = session.execute("select release_version from system.local").one(); // (3)
    System.out.println(row.getString("release_version"));                        // (4)
} finally {
    if (cluster != null) cluster.close();                                        // (5)
}
  1. the DseCluster object is the main entry point of the driver. It holds the known state of the actual DSE cluster (notably the Metadata). This class is thread-safe: you should create a single instance (per target DSE cluster) and share it throughout your application;
  2. the DseSession is what you use to execute queries. Likewise, it is thread-safe and should be reused;
  3. we use execute to send a query to DSE. This returns a ResultSet, which is essentially a collection of Row objects. On the same line, we call one() to extract the first row (which is the only one in this case);
  4. we extract the value of the first (and only) column from the row;
  5. finally, we close the cluster after we’re done with it. This will also close any session that was created from this cluster. This step is important because it frees underlying resources (TCP connections, thread pools…). In a real application, you would typically do this at shutdown (for example, when undeploying your webapp).

Note: this example uses the synchronous API. Most methods have asynchronous equivalents.

For users familiar with the DataStax driver for Cassandra, DseCluster and DseSession wrap their pure-CQL equivalents, Cluster and Session. You can use DseSession as a drop-in replacement for Session. The DSE driver also includes the pure-CQL classes, but you’ll typically use a DseCluster to interact with DSE.

Troubleshooting connection issues

If the example above fails to connect (throwing NoHostAvailableException), check that the contact points are accessible from the client machine, for example:

telnet 1.2.3.4 9042

Here are some common mistakes:

  • using the wrong address. See Address resolution for an explanation of the various addresses that can be configured server-side, and which one should be used by the client.
  • using the wrong port. The value passed to DseCluster.builder().withPort() (default 9042) should match native_transport_port in cassandra.yaml.
  • on older server versions, the native transport must be enabled explicitly. See start_native_transport in cassandra.yaml.
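
The telnet check above can also be run from Java, which is handy when the client machine has no telnet installed. The following is a minimal sketch that uses only the JDK; the class name ContactPointCheck, the method isReachable and the 2-second timeout are illustrative choices, not part of the driver. It only verifies that a TCP connection can be opened, not that the native protocol is actually served on that port.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class ContactPointCheck {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    public static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host name could not be resolved
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults are assumptions for illustration: local node, default native transport port
        String host = args.length > 0 ? args[0] : "127.0.0.1";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 9042;
        System.out.println(host + ":" + port + " reachable: " + isReachable(host, port, 2000));
    }
}
```

If this reports false for an address that you expect to work, the cause is most likely one of the mistakes listed above or a firewall between the client and the node.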

Setting up the driver

DseCluster

Creating an instance

DseCluster.Builder provides a fluent API:

DseCluster cluster = DseCluster.builder()
        .withClusterName("myCluster")
        .addContactPoint("127.0.0.1")
        .build();
Creation options

The only required option is the list of contact points, i.e. the hosts that the driver will initially contact to discover the cluster topology. You can provide a single contact point, but it is usually a good idea to provide more, so that the driver can fall back if the first one is down.

The other aspects that you can configure on the DseCluster are:

In addition, you can register various types of listeners to be notified of cluster events; see Host.StateListener, LatencyTracker, and SchemaChangeListener.

Cluster initialization

A freshly-built DseCluster instance does not initialize automatically; that will be triggered by one of the following actions:

  • an explicit call to cluster.init();
  • a call to cluster.getMetadata();
  • creating a session with cluster.connect() or one of its variants;
  • calling session.init() on a session that was created with cluster.newSession().

The initialization sequence is the following:

  • initialize internal state (thread pools, utility components, etc.);
  • try to connect to each of the contact points in sequence. The order is not deterministic (in fact, the driver shuffles the list to avoid hotspots if a large number of clients share the same contact points). If no contact point replies, a NoHostAvailableException is thrown and the process stops here;
  • otherwise, the successful contact point is elected as the control host. The driver negotiates the native protocol version with it, and queries its system tables to discover the addresses of the other hosts.

Note that, at this stage, only the control connection has been established. Connections to other hosts will only be opened when a session gets created.
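
To make the contact point step more concrete, here is a conceptual sketch of the "shuffle, then try each in turn" logic in plain Java. It is not the driver's actual code: the class ControlHostElection, the method electControlHost and the tryConnect predicate are hypothetical names, and an IllegalStateException stands in for the driver's NoHostAvailableException.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Predicate;

public class ControlHostElection {

    /**
     * Shuffles the contact points, tries each one in turn, and returns the
     * first one that accepts a connection. Throws if none of them replies.
     */
    public static String electControlHost(List<String> contactPoints, Predicate<String> tryConnect) {
        List<String> shuffled = new ArrayList<>(contactPoints);
        // Randomize the order so that many clients sharing the same list
        // do not all hit the same host first
        Collections.shuffle(shuffled);
        for (String host : shuffled) {
            if (tryConnect.test(host)) {
                return host; // this host becomes the control host
            }
        }
        // Stand-in for the driver's NoHostAvailableException
        throw new IllegalStateException("No contact point replied: " + contactPoints);
    }

    public static void main(String[] args) {
        List<String> contactPoints = Arrays.asList("10.0.0.1", "10.0.0.2", "10.0.0.3");
        // Pretend that only 10.0.0.2 is up
        String controlHost = electControlHost(contactPoints, "10.0.0.2"::equals);
        System.out.println("Control host: " + controlHost); // prints "Control host: 10.0.0.2"
    }
}
```

Whatever order the shuffle produces, the first host that replies wins, which is why providing several contact points makes startup more resilient.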

DseSession

By default, a session isn’t tied to any specific keyspace. You’ll need to prefix table names in your queries:

DseSession session = cluster.connect();
session.execute("select * from myKeyspace.myTable where id = 1");

You can also specify a keyspace name at construction time; it will be used as the default when table names are not qualified:

DseSession session = cluster.connect("myKeyspace");
session.execute("select * from myTable where id = 1");
session.execute("select * from otherKeyspace.otherTable where id = 1");

You might be tempted to open a separate session for each keyspace used in your application; however, note that connection pools are created at the session level, so each new session will consume additional system resources:

// Warning: creating two sessions doubles the number of TCP connections opened by the driver
DseSession session1 = cluster.connect("ks1");
DseSession session2 = cluster.connect("ks2");

Finally, if you issue a USE statement, it will change the default keyspace on that session:

DseSession session = cluster.connect();
// No default keyspace set, need to prefix:
session.execute("select * from myKeyspace.myTable where id = 1");

session.execute("USE myKeyspace");
// Now the keyspace is set, unqualified query works:
session.execute("select * from myTable where id = 1");

Be very careful though: if the session is shared by multiple threads, switching the keyspace at runtime could easily cause unexpected query failures.

Running queries

You run queries with the session’s execute method:

ResultSet rs = session.execute("select release_version from system.local");

As shown here, the simplest form is to pass a query string directly. You can also pass an instance of Statement.

Processing rows

Executing a query produces a ResultSet, which is an iterable of Row. The basic way to process all rows is to use Java’s for-each loop:

for (Row row : rs) {
    // process the row
}

Note that this will return all results without limit (even though the driver might use multiple queries in the background). To handle large result sets, you might want to use a LIMIT clause in your CQL query, or use one of the techniques described in the paging documentation.
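
The following plain-Java sketch illustrates the idea of an iterable that looks like a single sequence while fetching pages in the background. It is a conceptual model only, not the driver's ResultSet implementation: PagedRows and its fetchPage function (page index in, list of rows out, empty list once exhausted) are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.IntFunction;

public class PagedRows<T> implements Iterable<T> {

    // Hypothetical page source: returns the rows of page i, or an empty list when exhausted
    private final IntFunction<List<T>> fetchPage;

    public PagedRows(IntFunction<List<T>> fetchPage) {
        this.fetchPage = fetchPage;
    }

    @Override
    public Iterator<T> iterator() {
        return new Iterator<T>() {
            private int nextPage = 0;
            private Iterator<T> current = Collections.emptyIterator();
            private boolean exhausted = false;

            @Override
            public boolean hasNext() {
                // Fetch the next page only once the current one is drained
                while (!current.hasNext() && !exhausted) {
                    List<T> page = fetchPage.apply(nextPage++);
                    if (page.isEmpty()) {
                        exhausted = true;
                    } else {
                        current = page.iterator();
                    }
                }
                return current.hasNext();
            }

            @Override
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return current.next();
            }
        };
    }

    public static void main(String[] args) {
        List<List<String>> pages = Arrays.asList(
                Arrays.asList("row1", "row2"),
                Arrays.asList("row3"));
        PagedRows<String> rows = new PagedRows<>(
                i -> i < pages.size() ? pages.get(i) : Collections.<String>emptyList());
        for (String row : rows) {
            System.out.println(row); // prints row1, row2, row3 on separate lines
        }
    }
}
```

The caller's loop never sees page boundaries, which is why a for-each over a large result can trigger many background fetches without any visible limit.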

When you know that there is only one row (or are only interested in the first one), the driver provides a convenience method:

Row row = rs.one();

Reading columns

Row provides getters to extract column values; they can be either positional or named:

Row row = session.execute("select first_name, last_name from users where id = 1").one();

// The two are equivalent:
String firstNameByIndex = row.getString(0);
String firstNameByName = row.getString("first_name");
CQL to Java type mapping
CQL3 data type       Getter name     Java type
ascii                getString       java.lang.String
bigint               getLong         long
blob                 getBytes        java.nio.ByteBuffer
boolean              getBool         boolean
counter              getLong         long
date                 getDate         LocalDate
decimal              getDecimal      java.math.BigDecimal
double               getDouble       double
float                getFloat        float
inet                 getInet         java.net.InetAddress
int                  getInt          int
list                 getList         java.util.List
map                  getMap          java.util.Map
set                  getSet          java.util.Set
smallint             getShort        short
text                 getString       java.lang.String
time                 getTime         long
timestamp            getTimestamp    java.util.Date
timeuuid             getUUID         java.util.UUID
tinyint              getByte         byte
tuple                getTupleValue   TupleValue
user-defined types   getUDTValue     UDTValue
uuid                 getUUID         java.util.UUID
varchar              getString       java.lang.String
varint               getVarint       java.math.BigInteger

In addition to these default mappings, you can register your own types with custom codecs.

Primitive types

For performance reasons, the driver uses primitive Java types wherever possible (boolean, int…); the CQL value NULL is encoded as the type’s default value (false, 0…), which can be ambiguous. To distinguish NULL from actual values, use isNull:

Integer age = row.isNull("age") ? null : row.getInt("age");
Collection types

To ensure type safety, collection getters are generic. You need to provide type parameters matching your CQL type when calling the methods:

// Assuming given_names is a list<text>:
List<String> givenNames = row.getList("given_names", String.class);

For nested collections, element types are generic and cannot be expressed as Java Class instances. We use Guava’s TypeToken instead:

// Assuming teams is a set<list<text>>:
TypeToken<List<String>> listOfStrings = new TypeToken<List<String>>() {};
Set<List<String>> teams = row.getSet("teams", listOfStrings);

Since type tokens are anonymous inner classes, it’s recommended to store them as constants in a utility class instead of re-creating them each time.

Row metadata

Row exposes an API to explore the column metadata at runtime:

for (ColumnDefinitions.Definition definition : row.getColumnDefinitions()) {
    System.out.printf("Column %s has type %s%n",
            definition.getName(),
            definition.getType());
}

Object mapping

Besides working explicitly with queries and rows, you can also use the Object Mapper to simplify retrieving and storing your data.