Setting the NodeSync rate
Estimate NodeSync rate impacts and set rates.
cassandra.yaml
The location of the cassandra.yaml file depends on the type of installation:
Package installations | /etc/dse/cassandra/cassandra.yaml |
Tarball installations | installation_location/resources/cassandra/conf/cassandra.yaml |
Estimating rate setting impact
The rate_in_kb setting defines the per-node rate of the local NodeSync service. It controls the maximum number of bytes per second used to validate data. There is a fundamental tradeoff between how fast NodeSync validates data and how many resources it consumes. The rate is both a limit on the amount of resources used and a target that NodeSync tries to achieve by auto-tuning its internals. The configured rate might not be achieved in practice, because validation can complete at a slower rate on a new or small cluster, or because the node temporarily or permanently lacks available resources.
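To make the tradeoff concrete, the following sketch estimates how long one full validation pass over a node's data would take at a given rate. It assumes the node actually sustains the configured rate (which, as noted above, it may not) and that 1 KB means 1024 bytes. The function name and the sample figures are illustrative and not part of DSE.

```python
def full_pass_seconds(data_bytes: int, rate_in_kb: int) -> float:
    """Best-case time, in seconds, to validate data_bytes at rate_in_kb KB/s."""
    if rate_in_kb <= 0:
        raise ValueError("rate_in_kb must be positive")
    # Assumption: 1 KB = 1024 bytes.
    return data_bytes / (rate_in_kb * 1024)


# Example: 100 GiB of local data validated at 1024 KB/s.
seconds = full_pass_seconds(100 * 1024**3, 1024)
print(f"{seconds / 3600:.1f} hours")  # prints "28.4 hours"
```

Doubling the rate halves this best-case time, at the cost of letting NodeSync consume up to twice the resources.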
Initial rate setting
There is no strong requirement to keep all nodes validating at the same rate. Some nodes will simply validate more data than others. When setting the rate, start with the simplest method: use the defaults.
- Check the rate_in_kb setting within the nodesync section in the cassandra.yaml file.
- Try increasing or decreasing the value at run time:
nodetool nodesyncservice setrate value_in_kb_sec
- Check the configured rate.
nodetool nodesyncservice getrate
Tip: The configured rate is different from the effective rate, which can be found in the NodeSync Service metrics.
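When the rate needs to be adjusted on several nodes, the two nodetool commands above can be wrapped in a small script. The following is a minimal sketch, assuming nodetool is on the PATH of the node where the script runs. The helper names are hypothetical, the positive-value check is a choice made for this sketch, and the output of getrate is returned unparsed because its exact format is not described here.

```python
import subprocess
from typing import List


def setrate_command(rate_in_kb: int) -> List[str]:
    """Build the nodetool command that sets the NodeSync rate at run time."""
    if rate_in_kb <= 0:
        # Assumption of this sketch: only positive rates are meaningful.
        raise ValueError("rate_in_kb must be positive")
    return ["nodetool", "nodesyncservice", "setrate", str(rate_in_kb)]


def getrate_command() -> List[str]:
    """Build the nodetool command that reports the configured rate."""
    return ["nodetool", "nodesyncservice", "getrate"]


def run_nodetool(command: List[str]) -> str:
    """Run a nodetool command and return its raw standard output."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout.strip()
```

On a node, calling `run_nodetool(setrate_command(2048))` followed by `run_nodetool(getrate_command())` would set the rate and then read back the configured value. Keeping command construction separate from execution makes it possible to review or log the command before anything runs.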
Simulating NodeSync rates
When adjusting rates, use the NodeSync rate simulator to help determine the configuration settings by computing the rate necessary to validate all tables within their allowed deadlines. When interpreting the result, keep the following factors in mind:
- Failures - When a node fails, it does not participate in NodeSync validation while it is offline.
- Temporary overloads - During periods of overload, such as unexpected events, nodes cannot achieve the configured rate.
- Data size variation - The rate required to repair all tables within a fixed amount of time directly depends on the size of the data to validate, which is typically a moving target.
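The factors above can be folded into a back-of-the-envelope estimate. The sketch below is not the NodeSync rate simulator and does not reproduce its algorithm. It only illustrates how data growth and reduced availability raise the rate a node needs to finish within a deadline. The parameters `growth` and `availability` are hypothetical planning inputs, not NodeSync settings, and 1 KB is assumed to be 1024 bytes.

```python
import math


def required_rate_in_kb(data_bytes, deadline_seconds, growth=0.0, availability=1.0):
    """Rough per-node rate (KB/s) needed to validate data_bytes within the deadline.

    growth:       expected fractional data growth over the period (0.2 means 20%).
    availability: fraction of the period in which the node is online and able
                  to reach its configured rate (covers failures and overloads).
    """
    if deadline_seconds <= 0:
        raise ValueError("deadline_seconds must be positive")
    if not 0 < availability <= 1:
        raise ValueError("availability must be in (0, 1]")
    if growth < 0:
        raise ValueError("growth must not be negative")
    projected_bytes = data_bytes * (1 + growth)          # data size variation
    usable_seconds = deadline_seconds * availability     # failures, overloads
    return math.ceil(projected_bytes / usable_seconds / 1024)


# Example: 500 GiB per node, 10-day deadline, 20% growth, 90% availability.
print(required_rate_in_kb(500 * 1024**3, 10 * 24 * 3600, growth=0.2, availability=0.9))
```

Because none of these inputs can be known exactly in advance, a result like this is best treated as a starting point to refine against the effective rate reported in the NodeSync Service metrics.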