Load balancing with DSE drivers

The DataStax drivers control the distribution of the incoming load across the DSE cluster.

The load balancing policy determines which node in the cluster acts as the coordinator for a given query. Choose the load balancing policy during application development, because the policy also determines the nodes for which the driver creates and maintains connection pools. For most deployments, use the default load balancing policy.

Table 1. Load balancing policy configuration
C/C++, C#, Java, Node.js, PHP, Python, Ruby

Coordinator selection

Each time a query is executed, the load balancing policy returns a query plan: an ordered list of the hosts that are eligible to receive the query. The driver sends the request to the first host in the plan and keeps the remaining hosts for retries and speculative executions.
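The idea of a query plan can be illustrated with a small, self-contained Python model. This is only a sketch and not the API of any DataStax driver: the class RoundRobinPlanner, the function execute, and the host addresses are hypothetical names chosen for illustration, and the round-robin ordering is an assumed stand-in for whatever ordering a real policy applies.

```python
from typing import List, Set


class RoundRobinPlanner:
    """Toy load balancing policy: each new plan starts at the next host."""

    def __init__(self, hosts: List[str]) -> None:
        self._hosts = list(hosts)
        self._next_start = 0

    def new_query_plan(self) -> List[str]:
        """Return every host exactly once, rotated by one position per call."""
        start = self._next_start
        self._next_start = (start + 1) % len(self._hosts)
        return self._hosts[start:] + self._hosts[:start]


def execute(plan: List[str], down: Set[str]) -> str:
    """Try hosts in plan order and return the first one that is reachable.

    The first host is the preferred coordinator; later hosts are only
    used when earlier ones fail (the retry case).
    """
    for host in plan:
        if host not in down:
            return host
    raise RuntimeError("no host in the query plan is available")
```

In this model, rotating the starting host spreads coordinator work across the cluster, while the tail of each plan provides the fallback order when the first host is unavailable.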

Token awareness

Token awareness is common across all drivers. It uses the partition key of a given query and its parameters to look up the replica nodes for that data. By selecting a replica as the coordinator, the policy ensures that the coordinator owns the data being written or read, which avoids an extra network hop on the server side.

For prepared statements, the driver calculates the routing key automatically, so these queries are routed accurately.
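The replica lookup behind token awareness can be sketched as a lookup on a token ring. The following is a simplified model under stated assumptions, not driver code: TokenRing and token_for are hypothetical names, the MD5-based hash is a stand-in for the cluster's real partitioner, and replicas are chosen by walking the ring clockwise, which ignores datacenter-aware replication strategies.

```python
import bisect
import hashlib
from typing import Dict, List


def token_for(partition_key: str) -> int:
    """Map a partition key to a token in [0, 2**32).

    Assumption: MD5 is used here only as a convenient stand-in hash.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


class TokenRing:
    """Toy token ring mapping tokens to the nodes that own them."""

    def __init__(self, token_owners: Dict[int, str], replication_factor: int) -> None:
        self._tokens = sorted(token_owners)
        self._owners = dict(token_owners)
        self._rf = replication_factor

    def replicas_for_token(self, token: int) -> List[str]:
        """Walk clockwise from the token and collect distinct replica nodes."""
        # First ring position whose token is >= the given token, wrapping to 0.
        start = bisect.bisect_left(self._tokens, token) % len(self._tokens)
        replicas: List[str] = []
        for i in range(len(self._tokens)):
            node = self._owners[self._tokens[(start + i) % len(self._tokens)]]
            if node not in replicas:
                replicas.append(node)
            if len(replicas) == self._rf:
                break
        return replicas

    def replicas(self, partition_key: str) -> List[str]:
        """Return the replicas for a partition key; the first is the coordinator candidate."""
        return self.replicas_for_token(token_for(partition_key))
```

A token-aware policy in this model would place the nodes returned by replicas() at the front of the query plan, so the coordinator already holds the requested data.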

Datacenter awareness

In some use cases, application requests should be limited to a given datacenter to ensure the data is returned to the user as efficiently as possible.

For example, in a global application, users in North America should have their requests directed to a datacenter in North America, and users in Europe should have their requests routed to a datacenter in Europe. To accomplish this, specify a local datacenter in the load balancing policy so that the driver routes queries to nodes in that datacenter.

Many of the drivers can fall back to hosts in remote datacenters when requests to the local datacenter do not succeed. Although this may look like a way to implement datacenter failover, the feature often leads to unexpected latencies and behaviors in the application. For a detailed explanation, see the blog post "Cassandra: Local_Quorum Should Stay Local".

Default load balancing policy

The DataStax drivers integrate the best practices of token awareness and datacenter awareness into the default load balancing policy. Specifically, the default policy retrieves the replicas for a given token and returns a list of hosts with the replicas in the local datacenter first, followed by the remaining nodes in the specified local datacenter. A load-distributing algorithm spreads the load fairly across the replica nodes.
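The ordering described above can be sketched in a few lines of Python. This is a hedged illustration, not the drivers' implementation: default_style_plan and its parameters are hypothetical, and a random shuffle is assumed as the load-distributing step, since the actual algorithm differs between drivers.

```python
import random
from typing import Dict, List, Optional


def default_style_plan(
    replicas: List[str],
    host_datacenters: Dict[str, str],
    local_dc: str,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Build a plan: local replicas first, then the other local nodes.

    Hosts outside local_dc are never included, even if they are replicas.
    """
    rng = rng or random.Random()
    local_hosts = [h for h, dc in host_datacenters.items() if dc == local_dc]
    local_replicas = [h for h in local_hosts if h in replicas]
    other_local = [h for h in local_hosts if h not in replicas]
    # Assumption: shuffling stands in for the driver's load distribution.
    rng.shuffle(local_replicas)
    rng.shuffle(other_local)
    return local_replicas + other_local
```

Because remote hosts never enter the plan in this sketch, a request that fails on every local node fails outright instead of silently crossing datacenters.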

Customizing load balancing

If an application requires custom routing and load balancing, the existing load balancing interface can be extended. Custom load balancing is also provided through the whitelist and blacklist load balancing policies; refer to the individual driver documentation for details on these policies. Customizing the load balancing policy is an advanced topic, so study the existing policies before implementing a custom one.
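Conceptually, whitelist and blacklist policies filter the query plan produced by another policy. The sketch below shows that idea in plain Python; filter_plan, whitelist, and blacklist are hypothetical helpers written for this illustration and do not correspond to a specific driver class or method.

```python
from typing import Callable, Iterable, List


def whitelist(hosts: Iterable[str]) -> Callable[[str], bool]:
    """Return a predicate that accepts only the listed hosts."""
    allowed = frozenset(hosts)
    return lambda host: host in allowed


def blacklist(hosts: Iterable[str]) -> Callable[[str], bool]:
    """Return a predicate that rejects the listed hosts."""
    denied = frozenset(hosts)
    return lambda host: host not in denied


def filter_plan(plan: Iterable[str], accept: Callable[[str], bool]) -> List[str]:
    """Keep the plan's original order, dropping hosts the predicate rejects."""
    return [host for host in plan if accept(host)]
```

Filtering preserves the order chosen by the underlying policy, so token and datacenter awareness still decide which of the remaining hosts is tried first.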