nodetool nodesyncservice ratesimulator

Simulates the rates necessary to meet the NodeSync deadline, based on configurable assumptions. Rate simulations are useful, but in production they are not a substitute for monitoring NodeSync and adjusting the rate.

Restriction: Do not use this command on a keyspace with RF=1 or on a single-node cluster.

Synopsis

nodetool [connection_options] nodesyncservice ratesimulator
[--deadline-overrides <keyspace_name>.<table_name>:<deadline_target_time>, ...]
[-e <keyspace_name>.<table_name>, ...]
[help] [-i <keyspace_name>.<table_name>, ...]
[--ignore-replication-factor]
[simulate -ds <factor_integer> -rs <factor_integer> -sg <factor_integer> |
recommended | recommended_minimum | theoretical_minimum]
[-v]

Definition

The short and long forms of each option are separated by a comma.

Connection options

-h, --host hostname

The hostname or IP address of a remote node or nodes. When omitted, the default is the local machine.

-p, --port jmx_port

The JMX port number.

-pw, --password jmxpassword

The JMX password for authenticating with secure JMX. If a password is not provided, you are prompted to enter one.

-pwf, --password-file jmx_password_filepath

The filepath to the file that stores JMX authentication credentials.

-u, --username jmx_username

The username for authenticating with secure JMX.

Command arguments

--deadline-overrides

Overrides the configured deadline for some or all of the tables in the simulation.

-ds, --deadline-safety-factor factor_integer

Represents a factor of how much to decrease table deadlines to account for imperfect conditions. Applies only to the simulate sub-command.

-e, --excludes keyspace_name.table_name, ...

A comma-separated list of tables to exclude from the simulation when NodeSync is enabled on the server-side; this simulates the impact on the rate of disabling NodeSync on those tables.

help

Displays options and usage instructions.

--ignore-replication-factor

Ignores the replication factor for the simulation. Without this option, the default assumes that NodeSync runs on every node of the cluster (which is highly recommended) and assumes that validation work is spread among replicas. When NodeSync runs on every node of the cluster, each node must validate the fraction 1/RF of the data the node owns. This option removes that assumption, and computes a rate that accounts for all the data the node stores.
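The effect of this option on the amount of data a node must validate can be sketched in a few lines of Python. This is only an illustration of the 1/RF assumption described above, not the simulator's actual code; the function name and its parameters are hypothetical.

```python
def bytes_to_validate(stored_bytes: float, replication_factor: int,
                      ignore_replication_factor: bool = False) -> float:
    """Estimate how much data one node must validate (hypothetical helper)."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    if ignore_replication_factor:
        # --ignore-replication-factor: account for all the data the node stores.
        return stored_bytes
    # Default: validation work is spread among replicas, so each node
    # validates 1/RF of the data it owns.
    return stored_bytes / replication_factor


# Example: a node storing 2.3 MB of a table in a keyspace with RF=2.
shared = bytes_to_validate(2_300_000, 2)
alone = bytes_to_validate(2_300_000, 2, ignore_replication_factor=True)
print(shared, alone)
```

With RF=2, the default assumption halves the data each node is expected to validate, so ignoring the replication factor yields a correspondingly higher rate.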

-i, --includes keyspace_name.table_name, ...

A comma-separated list of tables to include in the simulation when NodeSync is not enabled server-side; simulates the impact on the rate of enabling NodeSync on those tables.

-rs, --rate-safety-factor factor_integer

Represents a factor of how much to increase the final rate to account for imperfect conditions. Applies only to the simulate sub-command.

-sg, --size-growth-factor factor_integer

Represents a factor of how much to increase data sizes to account for data growth. Applies only to the simulate sub-command.

-v, --verbose

Provides details on how the simulation is carried out by displaying every step the simulation takes. Although this option is useful for understanding a simulation, the output can be very long when many tables exist.

Examples

Simulate rates for the comments table

nodetool nodesyncservice ratesimulator -i cycling.comments
Computed rate: 420kB/s.
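When the simulator is called from a script, the "Computed rate" line can be turned into a number. The following Python sketch parses the two output shapes shown on this page. The helper name is hypothetical, and the unit multipliers are an assumption: the page shows the units B, kB, and MB but does not say whether kB means 1000 or 1024 bytes.

```python
import re

# Assumption: decimal multipliers. Only units that appear on this page are listed.
_UNITS = {"B": 1, "kB": 1000, "MB": 1000**2}

# Matches e.g. "Computed rate: 420kB/s." or "Computed rate: 16B/s, adjusted ..."
_RATE_RE = re.compile(r"Computed rate:\s*(\d+(?:\.\d+)?)\s*(B|kB|MB)/s")


def parse_computed_rate(output: str) -> float:
    """Return the computed rate in bytes per second (hypothetical helper)."""
    match = _RATE_RE.search(output)
    if match is None:
        raise ValueError("no 'Computed rate' line found in output")
    value, unit = match.groups()
    return float(value) * _UNITS[unit]


print(parse_computed_rate("Computed rate: 420kB/s."))
```

The function reads only the first rate on the line, so for output such as "16B/s, adjusted from 14B/s for safety" it returns the adjusted value.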

Simulate rates with new target times for the comments table

nodetool nodesyncservice ratesimulator --deadline-overrides cycling.comments:20h

Simulate example

  1. In CQL, create a keyspace with RF > 1, then create tables in it with NodeSync enabled. For example:

    CREATE KEYSPACE cycling WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 2};
    USE cycling;
    CREATE TABLE comments (record_id timeuuid, id uuid, commenter text, comment text, created_at timestamp,
      PRIMARY KEY (id, created_at)) WITH nodesync={'enabled': 'true'};
    CREATE TABLE comments2 (record_id timeuuid, id uuid, commenter text, comment text, created_at timestamp,
      PRIMARY KEY (id, created_at)) WITH nodesync={'enabled': 'true'};
  2. Insert data into the tables. For example:

    INSERT INTO cycling.comments (record_id, id , created_at , comment, commenter ) values (now(), e7ae5cf3-d358-4d99-b900-85902fda9bb0, '2017-02-14 12:43:20-0800', 'Raining too hard should have postponed', 'Alex');
    INSERT INTO cycling.comments (record_id, id , created_at , comment, commenter ) values (now(), e7ae5cf3-d358-4d99-b900-85902fda9bb0, '2017-02-14 12:43:20.234-0800', 'Raining too hard should have postponed', 'Alex');
    INSERT INTO cycling.comments (record_id, id , created_at , comment, commenter ) values (now(), e7ae5cf3-d358-4d99-b900-85902fda9bb0, '2017-03-21 13:11:09.999-0800', 'Second rest stop was out of water', 'Alex');
    INSERT INTO cycling.comments (record_id, id , created_at , comment, commenter ) values (now(), e7ae5cf3-d358-4d99-b900-85902fda9bb0, '2017-04-01 06:33:02.16-0800', 'LATE RIDERS SHOULD NOT DELAY THE START', 'Alex');
    INSERT INTO cycling.comments (record_id, id , created_at , comment, commenter ) values (now(), c7fceba0-c141-4207-9494-a29f9809de6f, totimestamp(now()), 'The gift certificate for winning was the best', 'Amy');
    INSERT INTO cycling.comments (record_id, id , created_at , comment, commenter ) values (now(), c7fceba0-c141-4207-9494-a29f9809de6f, '2017-02-17 12:43:20.234+0400', 'Glad you ran the race in the rain', 'Amy');
    ...
  3. Run the simulator:

    nodetool nodesyncservice ratesimulator recommended

    As expected, the computed rate is rather small because very little data was inserted.

  4. Run the simulator with the verbose flag to see how that rate was calculated:

    nodetool nodesyncservice ratesimulator recommended -v
    Using parameters:
     - Size growing factor:    1.00
     - Deadline safety factor: 0.25
     - Rate safety factor:     0.10
    
    cycling.comments:
      - Deadline target=7.5d, adjusted from 10d for safety.
      - Size=1.1MB to validate (2.3MB total (adjusted from 1.1MB for future growth) but RF=2).
      - Added to previous tables, 1.1MB to validate in 7.5d => 2B/s
      => New minimum rate: 2B/s
    cycling.comments2:
      - Deadline target=7.5d, adjusted from 10d for safety.
      - Size=7.1MB to validate (14MB total (adjusted from 7.1MB for future growth) but RF=2).
      - Added to previous tables, 8.3MB to validate in 7.5d => 14B/s
      => New minimum rate: 14B/s
    
    Computed rate: 16B/s, adjusted from 14B/s for safety.

    The verbose output shows how the deadline, data size, and replication factor of each table contribute to the final rate.
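The arithmetic visible in the verbose output can be approximated with the following Python sketch. The formula is inferred from the sample output above (a deadline safety factor of 0.25 turns 10d into 7.5d, and a size growth factor of 1.00 roughly doubles the table size before it is divided by RF). It is an assumption about how the factors combine, not the simulator's actual implementation, and the function and parameter names are hypothetical.

```python
SECONDS_PER_DAY = 86_400


def estimate_rate(size_bytes: float, deadline_days: float,
                  replication_factor: int = 1, size_growth: float = 0.0,
                  deadline_safety: float = 0.0, rate_safety: float = 0.0) -> float:
    """Approximate a validation rate in bytes per second (hypothetical helper)."""
    # Grow the data size to account for future growth.
    grown = size_bytes * (1 + size_growth)
    # Each node validates 1/RF of the data when NodeSync runs on every node.
    to_validate = grown / replication_factor
    # Shorten the deadline to leave a safety margin.
    target_seconds = deadline_days * (1 - deadline_safety) * SECONDS_PER_DAY
    # Raise the resulting rate to account for imperfect conditions.
    return to_validate / target_seconds * (1 + rate_safety)


# First table of the sample output: about 1.1 MB, 10-day deadline, RF=2.
rate = estimate_rate(1_100_000, 10, replication_factor=2,
                     size_growth=1.0, deadline_safety=0.25)
print(f"{rate:.1f} B/s")
```

For the first table this gives about 1.7 B/s, in line with the rounded 2B/s in the sample output. The sizes in that output are themselves rounded, so the sketch is not expected to reproduce the simulator's figures exactly.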

© 2024 DataStax