nodetool nodesyncservice ratesimulator

Simulates the rates necessary to meet the NodeSync deadline, based on configurable assumptions. Rate simulations are useful for planning, but in production they are not a substitute for monitoring NodeSync and adjusting the rate.

Restriction: Do not use this command on a keyspace with RF=1 or on a single-node cluster.

Synopsis

nodetool [connection_options] nodesyncservice ratesimulator
[--deadline-overrides <keyspace_name>.<table_name>:<deadline_target_time>, ...]
[-e <keyspace_name>.<table_name>, ...]
[help] [-i <keyspace_name>.<table_name>, ...]
[--ignore-replication-factor]
[simulate -ds <factor_integer> -rs <factor_integer> -sg <factor_integer> |
recommended | recommended_minimum | theoretical_minimum]
[-v]

Options

If an option has a short and long form, both forms are given, separated by a comma.

-h, --host hostname: The hostname or IP address of a remote node or nodes. When omitted, the default is the local machine.
-p, --port jmx_port: The JMX port number.
-pw, --password jmxpassword: The JMX password for authenticating with secure JMX. If a password is not provided, you are prompted to enter one.
-pwf, --password-file jmx_password_filepath: The filepath to the file that stores JMX authentication credentials.
-u, --username jmx_username: The username for authenticating with secure JMX.
--deadline-overrides keyspace_name.table_name:deadline_target_time, …: Overrides the configured deadline for some or all of the tables in the simulation.
-ds, --deadline-safety-factor factor_integer: Represents a factor of how much to decrease table deadlines to account for imperfect conditions. Applies only to the simulate sub-command.
-e, --excludes keyspace_name.table_name, …: A comma-separated list of tables to exclude from the simulation when NodeSync is enabled on the server-side; this simulates the impact on the rate of disabling NodeSync on those tables.
help: Displays options and usage instructions.
--ignore-replication-factor: Ignores the replication factor for the simulation. Without this option, the default assumes that NodeSync runs on every node of the cluster (which is highly recommended) and assumes that validation work is spread among replicas. When NodeSync runs on every node of the cluster, each node must validate the fraction 1/RF of the data the node owns. This option removes that assumption, and computes a rate that accounts for all the data the node stores.
-i, --includes keyspace_name.table_name, …: A comma-separated list of tables to include in the simulation when NodeSync is not enabled server-side; simulates the impact on the rate of enabling NodeSync on those tables.
-rs, --rate-safety-factor factor_integer: Represents a factor of how much to increase the final rate to account for imperfect conditions. Applies only to the simulate sub-command.
-sg, --size-growth-factor factor_integer: Represents a factor of how much to increase data sizes to account for data growth. Applies only to the simulate sub-command.
-v, --verbose: Displays every step taken by the simulation. Although this option is useful for understanding how a rate was computed, the output can be very large when many tables exist.
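
The 1/RF assumption described for --ignore-replication-factor can be illustrated with a small Python sketch. This is not NodeSync's implementation; the function name and its parameters are hypothetical and exist only to show the arithmetic.

```python
def bytes_to_validate(stored_bytes, replication_factor,
                      ignore_replication_factor=False):
    """Amount of data one node is assumed to validate (hypothetical helper).

    By default the work is assumed to be shared among replicas, so a node
    validates 1/RF of the data it stores. With the flag set, the node is
    assumed to validate everything it stores.
    """
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    if ignore_replication_factor:
        return stored_bytes
    return stored_bytes / replication_factor


# A node storing 14,000,000 bytes of a table in an RF=2 keyspace:
print(bytes_to_validate(14_000_000, 2))        # 7000000.0
print(bytes_to_validate(14_000_000, 2, True))  # 14000000
```

With the flag set, the amount of data to validate is multiplied by RF, so the computed rate for the same deadline would be expected to rise accordingly.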

Examples

Simulate rates for the comments table

nodetool nodesyncservice ratesimulator -i cycling.comments

Computed rate: 420kB/s.

Simulate rates with new target times for the comments table

nodetool nodesyncservice ratesimulator --deadline-overrides cycling.comments:20h

Simulate example

In CQL, create a keyspace with RF > 1, then create tables in it with NodeSync enabled. For example:

CREATE KEYSPACE cycling WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 2};
USE cycling;
CREATE TABLE comments (record_id timeuuid, id uuid, commenter text, comment text, created_at timestamp,
  PRIMARY KEY (id, created_at)) WITH nodesync={'enabled': 'true'};
CREATE TABLE comments2 (record_id timeuuid, id uuid, commenter text, comment text, created_at timestamp,
  PRIMARY KEY (id, created_at)) WITH nodesync={'enabled': 'true'};

Insert data into the tables. For example:

INSERT INTO cycling.comments (record_id, id, created_at, comment, commenter) VALUES (now(), e7ae5cf3-d358-4d99-b900-85902fda9bb0, '2017-02-14 12:43:20-0800', 'Raining too hard should have postponed', 'Alex');
INSERT INTO cycling.comments (record_id, id, created_at, comment, commenter) VALUES (now(), e7ae5cf3-d358-4d99-b900-85902fda9bb0, '2017-02-14 12:43:20.234-0800', 'Raining too hard should have postponed', 'Alex');
INSERT INTO cycling.comments (record_id, id, created_at, comment, commenter) VALUES (now(), e7ae5cf3-d358-4d99-b900-85902fda9bb0, '2017-03-21 13:11:09.999-0800', 'Second rest stop was out of water', 'Alex');
INSERT INTO cycling.comments (record_id, id, created_at, comment, commenter) VALUES (now(), e7ae5cf3-d358-4d99-b900-85902fda9bb0, '2017-04-01 06:33:02.16-0800', 'LATE RIDERS SHOULD NOT DELAY THE START', 'Alex');
INSERT INTO cycling.comments (record_id, id, created_at, comment, commenter) VALUES (now(), c7fceba0-c141-4207-9494-a29f9809de6f, toTimestamp(now()), 'The gift certificate for winning was the best', 'Amy');
INSERT INTO cycling.comments (record_id, id, created_at, comment, commenter) VALUES (now(), c7fceba0-c141-4207-9494-a29f9809de6f, '2017-02-17 12:43:20.234+0400', 'Glad you ran the race in the rain', 'Amy');
...

Run the simulator:
```
nodetool nodesyncservice ratesimulator recommended
```
As expected, the computed rate is rather small because very little data was inserted.

Run the simulator with the verbose flag to see how that rate was calculated:

nodetool nodesyncservice ratesimulator recommended -v

Using parameters:
 - Size growing factor:    1.00
 - Deadline safety factor: 0.25
 - Rate safety factor:     0.10

cycling.comments:
  - Deadline target=7.5d, adjusted from 10d for safety.
  - Size=1.1MB to validate (2.3MB total (adjusted from 1.1MB for future growth) but RF=2).
  - Added to previous tables, 1.1MB to validate in 7.5d => 2B/s
  => New minimum rate: 2B/s
cycling.comments2:
  - Deadline target=7.5d, adjusted from 10d for safety.
  - Size=7.1MB to validate (14MB total (adjusted from 7.1MB for future growth) but RF=2).
  - Added to previous tables, 8.3MB to validate in 7.5d => 14B/s
  => New minimum rate: 14B/s

Computed rate: 16B/s, adjusted from 14B/s for safety.
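
The figures in this verbose output can be approximated with a short Python sketch. Several things here are assumptions inferred from the output rather than documented behavior: that deadlines are multiplied by 1 minus the deadline safety factor, that sizes are multiplied by 1 plus the growth factor and then divided by RF, that the final rate is multiplied by 1 plus the rate safety factor, that MB means 2**20 bytes, and that rates are rounded up to whole bytes per second at each step. The function name simulate_rate is hypothetical.

```python
import math

SECONDS_PER_DAY = 86_400
MIB = 1024 * 1024  # assumption: "MB" in the output is read as 2**20 bytes


def simulate_rate(tables, size_growth, deadline_safety, rate_safety):
    """Return (minimum rate after each table, final rate) in bytes/second.

    tables: list of (raw_size_bytes, deadline_days, replication_factor).
    """
    # Shorten each deadline for safety, grow each size, and split the
    # validation work among replicas. Tightest deadline is handled first.
    adjusted = sorted(
        (days * (1 - deadline_safety) * SECONDS_PER_DAY,
         raw * (1 + size_growth) / rf)
        for raw, days, rf in tables
    )
    cumulative = 0.0
    minimum = 0
    steps = []
    for deadline_seconds, to_validate in adjusted:
        # Everything accumulated so far must be validated by this deadline.
        cumulative += to_validate
        minimum = max(minimum, math.ceil(cumulative / deadline_seconds))
        steps.append(minimum)
    # Pad the final rate to account for imperfect conditions.
    final = math.ceil(minimum * (1 + rate_safety))
    return steps, final


# Rounded sizes and parameters taken from the verbose output above.
steps, final = simulate_rate(
    [(1.1 * MIB, 10, 2), (7.1 * MIB, 10, 2)],
    size_growth=1.00, deadline_safety=0.25, rate_safety=0.10,
)
print(steps, final)  # [2, 14] 16
```

Under these assumptions the sketch arrives at the same 2B/s, 14B/s, and 16B/s figures as the simulator. Because the inputs are the rounded sizes shown in the output, treat the match as illustrative rather than exact.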
