Load balancing with DataStax drivers

The DataStax drivers control how incoming load is distributed across the cluster. The load balancing policy determines which node acts as the coordinator for a given query. Choose the load balancing policy during application development, because the policy also determines which nodes the driver creates and maintains connection pools for. For most deployments, use the default load balancing policy.

Load balancing policy configuration

C/C++

C#

Java

Node.js

PHP

Python

Ruby

Coordinator selection

Each time a query is executed, the load balancing policy returns a query plan: an ordered list of the hosts that are eligible to receive the query. The driver sends the request to the first host in the plan and keeps the remaining hosts for retries and speculative executions.
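The way a driver walks a query plan can be sketched as follows. This is a minimal illustration, not driver code: the names execute_with_plan, NoHostAvailable, and flaky_send are hypothetical, and the sketch simply moves to the next host on a connection error, whereas a real driver also consults its retry and speculative execution policies.

```python
from typing import Callable, Iterable, List


class NoHostAvailable(Exception):
    """Raised when every host in the query plan has failed (hypothetical name)."""


def execute_with_plan(query_plan: Iterable[str],
                      send: Callable[[str], str]) -> str:
    """Try hosts in plan order; the first host is the intended coordinator."""
    errors: List[str] = []
    for host in query_plan:
        try:
            return send(host)
        except ConnectionError as exc:
            # Remember the failure and fall through to the next host in the plan.
            errors.append(f"{host}: {exc}")
    raise NoHostAvailable("; ".join(errors))


def flaky_send(host: str) -> str:
    """Hypothetical transport for the example: node1 is unreachable."""
    if host == "node1":
        raise ConnectionError("connection refused")
    return f"handled by {host}"


# node1 fails, so the request is served by the second host in the plan.
result = execute_with_plan(["node1", "node2", "node3"], flaky_send)
```

Because the plan is consumed lazily, hosts later in the list are contacted only if the earlier ones fail.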

Token awareness

Token awareness is common across all drivers. Token awareness uses the partition key (the part of the primary key that determines where data is placed) for a given query and its parameters to look up the replica nodes. By selecting a replica, this policy ensures that the coordinator for the query owns the data that will be written or retrieved, which avoids an extra network hop on the server side.

For prepared statements, the driver calculates the routing key automatically, so queries are routed accurately without extra work in the application.
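The replica lookup behind token awareness can be sketched with a simplified token ring. The sketch makes several assumptions that are not part of the drivers: one token per node (no virtual nodes), replicas placed on consecutive nodes around the ring, and an MD5-based stand-in for the hash function (Cassandra's default partitioner uses Murmur3). TokenRing and token_for are hypothetical names.

```python
import bisect
import hashlib
from typing import Dict, List, Tuple


def token_for(partition_key: bytes) -> int:
    """Map a partition key to a token (stand-in hash, not Murmur3)."""
    return int.from_bytes(hashlib.md5(partition_key).digest()[:8], "big")


class TokenRing:
    """Simplified ring: one token per node, replicas on consecutive nodes."""

    def __init__(self, node_tokens: Dict[str, int], replication_factor: int):
        self._ring: List[Tuple[int, str]] = sorted(
            (token, node) for node, token in node_tokens.items()
        )
        self._tokens = [token for token, _ in self._ring]
        self._rf = min(replication_factor, len(self._ring))

    def replicas_for_token(self, token: int) -> List[str]:
        # The owner is the first node whose token is >= the key's token,
        # wrapping around to the start of the ring when none is larger.
        start = bisect.bisect_left(self._tokens, token) % len(self._ring)
        return [
            self._ring[(start + i) % len(self._ring)][1]
            for i in range(self._rf)
        ]

    def replicas(self, partition_key: bytes) -> List[str]:
        return self.replicas_for_token(token_for(partition_key))


ring = TokenRing(
    {"node1": 0, "node2": 2**62, "node3": 2**63, "node4": 3 * 2**62},
    replication_factor=3,
)
replicas = ring.replicas(b"user:42")  # candidate coordinators for this key
```

A token-aware policy would place these replicas at the front of the query plan, so that the coordinator already holds the requested data.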

Datacenter awareness

In some use cases, application requests should be limited to a given datacenter to ensure the data is returned to the user as efficiently as possible.

In a global application, for example, users in North America should have their requests directed to a datacenter in North America, and users in Europe should have their requests routed to a datacenter in Europe. To accomplish this, specify a local datacenter in the load balancing policy so that the driver routes queries to the nodes in that datacenter.

If requests to the local datacenter do not succeed, many of the drivers support using hosts in remote datacenters for queries. Though this may appear to be a way to implement datacenter failover, this feature often leads to unexpected latencies and behaviors in the application. For a detailed explanation, see the "Designing Fault Tolerant Applications with DataStax and Apache Cassandra" white paper.
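Datacenter-aware host selection can be sketched as a filter over the known hosts. The function dc_aware_plan and its remote_hosts_per_dc parameter are hypothetical names chosen for this illustration. The parameter defaults to 0, which keeps every request in the local datacenter, in line with the caution above about remote hosts.

```python
from typing import Dict, List


def dc_aware_plan(hosts_by_dc: Dict[str, List[str]],
                  local_dc: str,
                  remote_hosts_per_dc: int = 0) -> List[str]:
    """Return local-datacenter hosts first, then optional remote hosts."""
    plan = list(hosts_by_dc.get(local_dc, []))
    for dc, hosts in hosts_by_dc.items():
        if dc != local_dc:
            # With the default of 0 this slice is empty, so no remote
            # host is ever added to the plan.
            plan.extend(hosts[:remote_hosts_per_dc])
    return plan


cluster_hosts = {
    "us-east": ["10.0.0.1", "10.0.0.2"],
    "eu-west": ["10.1.0.1", "10.1.0.2"],
}

local_only = dc_aware_plan(cluster_hosts, local_dc="us-east")
with_remote = dc_aware_plan(cluster_hosts, local_dc="us-east",
                            remote_hosts_per_dc=1)
```

In this sketch, remote hosts always come after every local host, so they are reached only when all local hosts have failed.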

Default load balancing policy

The DataStax drivers integrate the best practices of token awareness and datacenter awareness into the default load balancing policy. Specifically, the default policy retrieves the replicas for a given token and returns a list of hosts that starts with the replicas in the local datacenter, followed by the remaining nodes in that local datacenter. The default policy uses a load distributing algorithm to spread the load evenly across the replica nodes.
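The ordering described above can be sketched as follows. This is an illustration under stated assumptions, not the drivers' implementation: default_style_plan is a hypothetical name, and a random shuffle stands in for the load distributing algorithm, which differs between drivers.

```python
import random
from typing import List, Optional, Set


def default_style_plan(local_nodes: List[str],
                       replicas: Set[str],
                       rng: Optional[random.Random] = None) -> List[str]:
    """Local replicas first (shuffled), then the other local nodes."""
    if rng is None:
        rng = random.Random()
    local_replicas = [node for node in local_nodes if node in replicas]
    others = [node for node in local_nodes if node not in replicas]
    # Shuffling is a stand-in for the driver's load distributing algorithm:
    # it prevents the same replica from always being chosen first.
    rng.shuffle(local_replicas)
    rng.shuffle(others)
    return local_replicas + others


# "r9" is a replica in a remote datacenter, so it never enters the plan.
plan = default_style_plan(
    local_nodes=["n1", "n2", "n3", "n4"],
    replicas={"n2", "n4", "r9"},
)
```

The first two entries of the plan are always n2 and n4 in some order, and n1 and n3 follow as non-replica fallbacks.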

Customizing load balancing

If an application requires custom routing and load balancing, extend the existing load balancing interface. The drivers also provide whitelist and blacklist load balancing policies for restricting the hosts that the driver uses; refer to the individual driver documentation for information on these policies. Customizing the load balancing policy is an advanced topic. Study the existing policies before implementing a custom one.
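The idea behind whitelist and blacklist policies can be sketched as a filter wrapped around another policy's query plan. The function filtered_plan and its parameters are hypothetical names for this illustration; the actual class names and configuration options vary by driver.

```python
from typing import Iterable, Iterator, Optional, Set


def filtered_plan(child_plan: Iterable[str],
                  allowed: Optional[Set[str]] = None,
                  denied: Optional[Set[str]] = None) -> Iterator[str]:
    """Yield hosts from a child plan that pass the allow and deny lists."""
    for host in child_plan:
        if allowed is not None and host not in allowed:
            continue  # whitelist given and host is not on it
        if denied is not None and host in denied:
            continue  # host is explicitly blacklisted
        yield host


child = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]

only_listed = list(filtered_plan(child, allowed={"10.0.0.1", "10.0.0.3"}))
without_denied = list(filtered_plan(child, denied={"10.0.0.2"}))
```

The filter preserves the order chosen by the child policy, so token awareness and datacenter awareness still apply to the hosts that remain.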


© 2024 DataStax | Privacy policy | Terms of use
