Load balancing with DataStax drivers
The DataStax drivers control how incoming load is distributed across the cluster. The load balancing policy determines which node in the cluster acts as the coordinator for a given query. Choose the load balancing policy during application development, because the policy also determines which nodes the driver creates and maintains connection pools for. For most deployments, use the default load balancing policy.
Token awareness is common across all drivers. Token awareness uses the partition key portion of the primary key, taken from the query and its parameters, to look up the replica nodes for that data. By selecting a replica as the coordinator, this policy ensures that the coordinator owns the data that will be written or retrieved, thereby avoiding an extra network hop on the server side.
For prepared statement executions, the driver calculates the routing key automatically, so these queries are routed accurately.
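The replica lookup behind token awareness can be illustrated with a toy token ring. This is a minimal sketch under stated assumptions, not the drivers' implementation: the ring contents, the node names, and the helpers token_for and replicas_for are hypothetical, the first byte of an MD5 digest stands in for a real partitioner, and replicas are placed by simply walking the ring clockwise.

```python
import bisect
import hashlib

# Hypothetical toy ring: (token, node) pairs sorted by token.
RING = [(0, "node-a"), (64, "node-b"), (128, "node-c"), (192, "node-d")]
TOKENS = [token for token, _ in RING]


def token_for(partition_key: bytes) -> int:
    """Map a partition key to a token in [0, 256).

    Assumption: the first MD5 byte is only a stand-in for a real partitioner.
    """
    return hashlib.md5(partition_key).digest()[0]


def replicas_for(partition_key: bytes, replication_factor: int = 3) -> list:
    """Return the nodes that hold the key, walking the ring clockwise."""
    # First ring position whose token is >= the key's token, wrapping around.
    start = bisect.bisect_left(TOKENS, token_for(partition_key)) % len(RING)
    return [RING[(start + i) % len(RING)][1] for i in range(replication_factor)]
```

A token-aware policy would pick the coordinator from the list returned by replicas_for, so the node that receives the request already owns the data.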
In some use cases, application requests should be limited to a given datacenter to ensure the data is returned to the user as efficiently as possible.
In a global application, users in North America should have their requests directed to a datacenter in North America, and users in Europe should have their requests routed to a datacenter in Europe. To accomplish this, specify a local datacenter in the load balancing policy so that the driver routes queries to the nodes in that datacenter.
Many of the drivers support falling back to hosts in a remote datacenter when requests to the local datacenter do not succeed. Though this may appear to be a way to enact datacenter failover, this feature often leads to unexpected latencies and behaviors in the application. For a detailed explanation, see the "Designing Fault Tolerant Applications with DataStax and Apache Cassandra" white paper.
The DataStax drivers integrate the best practices of token awareness and datacenter awareness into the default load balancing policy. Specifically, the default policy retrieves the replicas for a given token and returns a list of hosts that places the replicas in the local datacenter first, followed by the rest of the nodes in the specified local datacenter. Using a load distributing algorithm, the default load balancing policy fairly distributes the load across the replica nodes.
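The ordering described above (local replicas first, then the remaining local nodes) can be sketched as a small function. This is an illustration under assumptions, not the default policy's code: NODE_DC, the node and datacenter names, and query_plan are hypothetical, and a random shuffle stands in for whatever load distributing algorithm a given driver uses.

```python
import random

# Hypothetical cluster metadata: node name -> datacenter name.
NODE_DC = {
    "na-1": "na-east", "na-2": "na-east", "na-3": "na-east", "na-4": "na-east",
    "eu-1": "eu-west", "eu-2": "eu-west",
}


def query_plan(replicas, local_dc, rng=random):
    """Return coordinators to try, in order, for one query.

    Local replicas come first, followed by the other nodes in the local
    datacenter. Nodes in remote datacenters are never included.
    """
    local_nodes = [node for node, dc in NODE_DC.items() if dc == local_dc]
    local_replicas = [node for node in replicas if node in local_nodes]
    others = [node for node in local_nodes if node not in local_replicas]
    # Assumption: shuffling each group is a stand-in for the driver's
    # load distributing algorithm.
    rng.shuffle(local_replicas)
    rng.shuffle(others)
    return local_replicas + others
```

For example, query_plan(["na-2", "eu-1", "na-4"], "na-east") returns na-2 and na-4 in some order, followed by na-1 and na-3 in some order. The remote replica eu-1 is left out of the plan.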
If custom routing and load balancing are required in an application, the existing load balancing interface can be extended. Custom load balancing is also provided through whitelist and blacklist load balancing policies; refer to the individual driver documentation for information on these policies. Customizing the load balancing policy is an advanced topic, so study the existing policies before implementing a custom one.
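Conceptually, a whitelist or blacklist policy filters the list of candidate coordinators that another policy produces. The following is a minimal sketch of that idea only; filtered_plan and its allow and deny parameters are hypothetical names, and the actual classes and options differ between drivers.

```python
def filtered_plan(plan, allow=None, deny=()):
    """Filter an ordered list of candidate nodes.

    plan:  candidate coordinators in the order another policy produced them.
    allow: if given, only these nodes may be used (whitelist behavior).
    deny:  these nodes are never used (blacklist behavior).
    The relative order of the surviving nodes is preserved.
    """
    denied = set(deny)
    allowed = None if allow is None else set(allow)
    return [
        node for node in plan
        if node not in denied and (allowed is None or node in allowed)
    ]
```

With plan = ["a", "b", "c"], calling filtered_plan(plan, deny=["b"]) keeps "a" and "c", while filtered_plan(plan, allow=["c"]) keeps only "c".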