nodetool nodesyncservice ratesimulator
Simulates rates necessary to achieve the NodeSync deadline based
on configurable assumptions. Rate simulations are useful, but in production
they are not a substitute for monitoring NodeSync and adjusting the
rate.
Restriction: Do not use this command on a keyspace with RF=1 or
on a single-node cluster.
Synopsis
nodetool [connection_options] nodesyncservice ratesimulator [--deadline-overrides keyspace_name.table_name:deadline_target_time, ...] [-e keyspace_name.table_name, ...] [help] [-i keyspace_name.table_name, ...] [--ignore-replication-factor] [simulate -ds factor_integer -rs factor_integer -sg factor_integer | recommended | recommended_minimum | theoretical_minimum] [-v]
Syntax conventions | Description |
---|---|
UPPERCASE | Literal keyword. |
Lowercase | Not literal. |
Italics | Variable value. Replace with a valid option or user-defined value. |
[ ] | Optional. Square brackets ( [ ] ) surround optional command arguments. Do not type the square brackets. |
( ) | Group. Parentheses ( ( ) ) identify a group to choose from. Do not type the parentheses. |
\| | Or. A vertical bar ( \| ) separates alternative elements. Type any one of the elements. Do not type the vertical bar. |
... | Repeatable. An ellipsis ( ... ) indicates that you can repeat the syntax element as often as required. |
'Literal string' | Single quotation marks ( ' ) must surround literal strings in CQL statements. Use single quotation marks to preserve upper case. |
{ key:value } | Map collection. Braces ( { } ) enclose map collections or key value pairs. A colon separates the key and the value. |
<datatype1,datatype2> | Set, list, map, or tuple. Angle brackets ( < > ) enclose data types in a set, list, map, or tuple. Separate the data types with a comma. |
cql_statement; | End CQL statement. A semicolon ( ; ) terminates all CQL statements. |
[ -- ] | Separate the command line options from the command arguments with two hyphens ( -- ). This syntax is useful when arguments might be mistaken for command line options. |
' <schema> ... </schema> ' | Search CQL only: Single quotation marks ( ' ) surround an entire XML schema declaration. |
@xml_entity='xml_entity_type' | Search CQL only: Identify the entity and literal value to overwrite the XML element in the schema and solrconfig files. |
Definition
The short form and long form parameters are comma-separated.
Connection options
- -h, --host hostname
- The hostname or IP address of a remote node or nodes. When omitted, the default is the local machine.
- -p, --port jmx_port
- The JMX port number.
- -pw, --password jmxpassword
- The JMX password for authenticating with secure JMX. If a password is not provided, you are prompted to enter one.
- -pwf, --password-file jmx_password_filepath
- The filepath to the file that stores JMX authentication credentials.
- -u, --username jmx_username
- The user name for authenticating with secure JMX.
Command arguments
- --deadline-overrides keyspace_name.table_name:deadline_target_time, ...
- Overrides the configured deadline for some or all of the tables in the simulation.
- -ds, --deadline-safety-factor factor_integer
- Represents a factor of how much to decrease table deadlines to account for imperfect conditions. Applies only to the simulate sub-command.
- -e, --excludes keyspace_name.table_name, ...
- A comma-separated list of tables to exclude from the simulation when NodeSync is enabled on the server-side; this simulates the impact on the rate of disabling NodeSync on those tables.
- help
- Displays options and usage instructions.
- --ignore-replication-factor
- Ignores the replication factor for the simulation. Without this option, the default assumes that NodeSync runs on every node of the cluster (which is highly recommended) and assumes that validation work is spread among replicas. When NodeSync runs on every node of the cluster, each node must validate the fraction 1/RF of the data the node owns. This option removes that assumption, and computes a rate that accounts for all the data the node stores.
- -i, --includes keyspace_name.table_name, ...
- A comma-separated list of tables to include in the simulation when NodeSync is not enabled server-side; simulates the impact on the rate of enabling NodeSync on those tables.
- -rs, --rate-safety-factor factor_integer
- Represents a factor of how much to increase the final rate to account for imperfect conditions. Applies only to the simulate sub-command.
- -sg, --size-growth-factor factor_integer
- Represents a factor of how much to increase data sizes to account for data growth. Applies only to the simulate sub-command.
- -v, --verbose
- Provides details on how the simulation is carried out by displaying every step the simulation takes. This option is useful for understanding a simulation, but the output can be very large if many tables exist.
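The effect of --ignore-replication-factor described above comes down to simple arithmetic. The following Python sketch is not the simulator's implementation. The function name and the sample numbers are hypothetical, and the code encodes only the 1/RF assumption stated in the option description.

```python
def bytes_to_validate(stored_bytes: int, replication_factor: int,
                      ignore_replication_factor: bool = False) -> float:
    """Hypothetical helper: amount of data one node validates per deadline.

    By default, validation work is assumed to be spread among replicas, so
    the node validates 1/RF of the data it stores. When the replication
    factor is ignored, the node is assumed to validate everything it stores.
    """
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    if ignore_replication_factor:
        return float(stored_bytes)
    return stored_bytes / replication_factor


# Illustrative numbers only: a node storing 300 GB in a keyspace with RF=3.
stored = 300 * 10**9
shared = bytes_to_validate(stored, 3)
alone = bytes_to_validate(stored, 3, ignore_replication_factor=True)
print(shared, alone)
```

With the same deadline, the rate needed in the second case is RF times the rate needed in the first, which is why the option produces a more conservative result.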
Examples
Simulate rates for comments table
nodetool nodesyncservice ratesimulator -i cycling.comments
Computed rate: 420kB/s.
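When the simulator is called from a script, the rate can be pulled out of the output line shown above. The following Python sketch assumes the output keeps the "Computed rate: <number><unit>/s" shape seen in the examples on this page. The function name is hypothetical, and the unit is returned as text rather than converted because this page does not define the unit multipliers.

```python
import re

# Matches lines such as "Computed rate: 420kB/s." or
# "Computed rate: 16B/s, adjusted from 14B/s for safety."
RATE_LINE = re.compile(r"Computed rate:\s*(\d+(?:\.\d+)?)\s*([A-Za-z]*B)/s")


def parse_computed_rate(output: str):
    """Return (value, unit) from simulator output, or None if no rate is found."""
    match = RATE_LINE.search(output)
    if match is None:
        return None
    return float(match.group(1)), match.group(2)


print(parse_computed_rate("Computed rate: 420kB/s."))
```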
Simulate rates with new target times for the comments table
nodetool nodesyncservice ratesimulator --deadline-overrides cycling.comments:20h
Simulate example
- In CQL, create tables with NodeSync enabled in a keyspace with RF > 1. For example:
CREATE KEYSPACE cycling WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 2};
USE cycling;
CREATE TABLE comments (record_id timeuuid, id uuid, commenter text, comment text, created_at timestamp, PRIMARY KEY (id, created_at)) WITH nodesync={'enabled': 'true'};
CREATE TABLE comments2 (record_id timeuuid, id uuid, commenter text, comment text, created_at timestamp, PRIMARY KEY (id, created_at)) WITH nodesync={'enabled': 'true'};
- Insert data into the tables. For example:
INSERT INTO cycling.comments (record_id, id, created_at, comment, commenter) VALUES (now(), e7ae5cf3-d358-4d99-b900-85902fda9bb0, '2017-02-14 12:43:20-0800', 'Raining too hard should have postponed', 'Alex');
INSERT INTO cycling.comments (record_id, id, created_at, comment, commenter) VALUES (now(), e7ae5cf3-d358-4d99-b900-85902fda9bb0, '2017-02-14 12:43:20.234-0800', 'Raining too hard should have postponed', 'Alex');
INSERT INTO cycling.comments (record_id, id, created_at, comment, commenter) VALUES (now(), e7ae5cf3-d358-4d99-b900-85902fda9bb0, '2017-03-21 13:11:09.999-0800', 'Second rest stop was out of water', 'Alex');
INSERT INTO cycling.comments (record_id, id, created_at, comment, commenter) VALUES (now(), e7ae5cf3-d358-4d99-b900-85902fda9bb0, '2017-04-01 06:33:02.16-0800', 'LATE RIDERS SHOULD NOT DELAY THE START', 'Alex');
INSERT INTO cycling.comments (record_id, id, created_at, comment, commenter) VALUES (now(), c7fceba0-c141-4207-9494-a29f9809de6f, totimestamp(now()), 'The gift certificate for winning was the best', 'Amy');
INSERT INTO cycling.comments (record_id, id, created_at, comment, commenter) VALUES (now(), c7fceba0-c141-4207-9494-a29f9809de6f, '2017-02-17 12:43:20.234+0400', 'Glad you ran the race in the rain', 'Amy');
...
- Run the simulator:
nodetool nodesyncservice ratesimulator recommended
As expected, the computed rate is rather small because very little data was inserted.
Computed rate: 16B/s.
- Run the simulator with the verbose flag to see how that rate was calculated:
nodetool nodesyncservice ratesimulator recommended -v
Using parameters:
 - Size growing factor: 1.00
 - Deadline safety factor: 0.25
 - Rate safety factor: 0.10
cycling.comments:
 - Deadline target=7.5d, adjusted from 10d for safety.
 - Size=1.1MB to validate (2.3MB total (adjusted from 1.1MB for future growth) but RF=2).
 - Added to previous tables, 1.1MB to validate in 7.5d => 2B/s
 => New minimum rate: 2B/s
cycling.comments2:
 - Deadline target=7.5d, adjusted from 10d for safety.
 - Size=7.1MB to validate (14MB total (adjusted from 7.1MB for future growth) but RF=2).
 - Added to previous tables, 8.3MB to validate in 7.5d => 14B/s
 => New minimum rate: 14B/s
Computed rate: 16B/s, adjusted from 14B/s for safety.
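The arithmetic behind the verbose output can be approximated with a short Python sketch. It is inferred from the sample output only and is not the simulator's actual code. The function names are hypothetical, and two assumptions are made because they reproduce the numbers shown: 1MB is treated as 1024 * 1024 bytes, and rates are rounded up to whole bytes per second.

```python
import math

SECONDS_PER_DAY = 86_400
BYTES_PER_MB = 1024 * 1024  # assumption: the output's "MB" means 2**20 bytes


def adjusted_deadline_days(deadline_days: float, deadline_safety: float) -> float:
    # Shrink the deadline by the deadline safety factor (10d with 0.25 gives 7.5d).
    return deadline_days * (1.0 - deadline_safety)


def minimum_rate(megabytes_to_validate: float, deadline_days: float) -> int:
    # Bytes per second needed to validate all the data once within the
    # deadline, rounded up (assumption).
    seconds = deadline_days * SECONDS_PER_DAY
    return math.ceil(megabytes_to_validate * BYTES_PER_MB / seconds)


def with_rate_safety(rate: int, rate_safety: float) -> int:
    # Raise the minimum rate by the rate safety factor, rounded up (assumption).
    return math.ceil(rate * (1.0 + rate_safety))


deadline = adjusted_deadline_days(10, 0.25)   # 7.5 days
first_table = minimum_rate(1.1, deadline)     # 2 B/s for cycling.comments alone
both_tables = minimum_rate(8.3, deadline)     # 14 B/s for 8.3MB across both tables
final_rate = with_rate_safety(both_tables, 0.10)
print(deadline, first_table, both_tables, final_rate)  # 7.5 2 14 16
```

The sketch takes the "to validate" sizes directly from the output, so the size growth and replication factor adjustments are already reflected in its inputs.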