Datacenter-aware Round Robin Policy
A specialized Round Robin load balancing policy allows for querying remote datacenters only when all local nodes are down. This policy will round robin requests across hosts in the local datacenter, falling back to remote datacenter if necessary. The name of the local datacenter must be supplied by the user.
All known remote hosts will be tried when local nodes are not available. However, you can configure the exact number of remote hosts that will be used by passing that number when constructing a policy instance.
By default, this policy will not attempt to use remote hosts for local
consistencies (:local_one
or :local_quorum
), however, it is possible to
change that behavior via constructor.
Background
- Given
- a running cassandra cluster in 2 datacenters with 2 nodes in each
- And
- the following schema:
CREATE KEYSPACE simplex WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3}; USE simplex; CREATE TABLE songs ( id uuid PRIMARY KEY, title text, album text, artist text, tags set<text>, data blob ); INSERT INTO songs (id, title, album, artist, tags) VALUES ( 756716f7-2e54-4715-9f00-91dcbea6cf50, 'La Petite Tonkinoise', 'Bye Bye Blackbird', 'Joséphine Baker', {'jazz', '2013'}) ; INSERT INTO songs (id, title, album, artist, tags) VALUES ( f6071e72-48ec-4fcb-bf3e-379c8a696488, 'Die Mösch', 'In Gold', 'Willi Ostermann', {'kölsch', '1996', 'birds'} ); INSERT INTO songs (id, title, album, artist, tags) VALUES ( fbdf82ed-0063-4796-9c7c-a3d4f47b4b25, 'Memo From Turner', 'Performance', 'Mick Jager', {'soundtrack', '1991'} );
First seen datacenter is considered local when not explicitly given
- Given
- the following example:
require 'cassandra' policy = Cassandra::LoadBalancing::Policies::DCAwareRoundRobin.new hosts = ['127.0.0.3', '127.0.0.4'] cluster = Cassandra.cluster(hosts: hosts, load_balancing_policy: policy) session = cluster.connect('simplex') hosts_used = 4.times.map do info = session.execute("SELECT * FROM songs").execution_info info.hosts.last.ip end.sort.uniq puts hosts_used
- When
- it is executed
- Then
- its output should contain:
127.0.0.3 127.0.0.4
Requests are automatically routed to local datacenter
- Given
- the following example:
require 'cassandra' datacenter = "dc2" policy = Cassandra::LoadBalancing::Policies::DCAwareRoundRobin.new(datacenter) cluster = Cassandra.cluster(load_balancing_policy: policy) session = cluster.connect('simplex') hosts_used = 4.times.map do info = session.execute("SELECT * FROM songs").execution_info info.hosts.last.ip end.sort.uniq puts hosts_used
- When
- it is executed
- Then
- its output should contain:
127.0.0.3 127.0.0.4
Requests are routed to remote datacenters if local datacenter is down
- Given
- the following example:
require 'cassandra' datacenter = "dc2" policy = Cassandra::LoadBalancing::Policies::DCAwareRoundRobin.new(datacenter) cluster = Cassandra.cluster(consistency: :one, load_balancing_policy: policy) session = cluster.connect('simplex') hosts_used = 4.times.map do info = session.execute("SELECT * FROM songs").execution_info info.hosts.last.ip end.sort.uniq puts hosts_used
- And
- node 3 is stopped
- And
- node 4 is stopped
- When
- it is executed
- Then
- its output should contain:
127.0.0.1 127.0.0.2
Requests are routed up to a maximum number of hosts in remote datacenters
- Given
- the following example:
require 'cassandra' datacenter = "dc2" remotes_to_try = 1 policy = Cassandra::LoadBalancing::Policies::DCAwareRoundRobin.new(datacenter, remotes_to_try) cluster = Cassandra.cluster(consistency: :one, load_balancing_policy: policy) session = cluster.connect('simplex') hosts_used = 4.times.map do info = session.execute("SELECT * FROM songs").execution_info info.hosts.last.ip end.sort.uniq puts "Used #{hosts_used.size} host, with ip #{hosts_used.first}"
- And
- node 3 is stopped
- And
- node 4 is stopped
- When
- it is executed
- Then
- its output should match:
Used 1 host, with ip 127\.0\.0\.(1|2)
Requests with local consistencies are not routed to remote datacenters
- Given
- the following example:
require 'cassandra' datacenter = "dc2" policy = Cassandra::LoadBalancing::Policies::DCAwareRoundRobin.new(datacenter) cluster = Cassandra.cluster(consistency: :one, load_balancing_policy: policy) session = cluster.connect('simplex') begin session.execute("SELECT * FROM songs", :consistency => :local_one) puts "failure" rescue Cassandra::Errors::NoHostsAvailable puts "success" end
- And
- node 3 is stopped
- And
- node 4 is stopped
- When
- it is executed
- Then
- its output should contain:
success
Routing requests with local consistencies to remote datacenters
- Given
- the following example:
require 'cassandra' datacenter = "dc2" use_remote = true policy = Cassandra::LoadBalancing::Policies::DCAwareRoundRobin.new(datacenter, nil, use_remote) cluster = Cassandra.cluster(consistency: :one, load_balancing_policy: policy) session = cluster.connect('simplex') hosts_used = 4.times.map do info = session.execute("SELECT * FROM songs").execution_info info.hosts.last.ip end.sort.uniq puts hosts_used
- And
- node 3 is stopped
- And
- node 4 is stopped
- When
- it is executed
- Then
- its output should contain:
127.0.0.1 127.0.0.2