Resolving query timeouts on restarted nodes
When restarting nodes with large indexes (hundreds of megabytes), initial queries might time out because of the time it takes to build the token range filter queries.
Procedure
To work around timeouts:
- Run with a replication factor greater than 1 to ensure that replicas are always available.
- In dse.yaml, set enable_health_based_routing to true and set uptime_ramp_up_period_seconds to a value larger than the time the first query takes to answer. One hour (3600 seconds) is usually enough.
- After restarting the node, issue several match-all queries, for example q=*:*, to warm up the filters.
- If you're using the Java Driver, create an ad-hoc session with only the node to warm up in the white list. Issuing many queries increases the chances that all token ranges are used.
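The warm-up step can be scripted. The following is a minimal sketch in Python, not an official tool. It assumes the restarted node accepts search requests over HTTP at http://<host>:8983/solr/<keyspace>.<table>/select. The function names, the default port, the number of attempts, and the timeout values are illustrative assumptions, so adjust them to your deployment.

```python
import time
import urllib.parse
import urllib.request


def build_warmup_url(host, core, port=8983):
    """Build a match-all (q=*:*) select URL aimed at one node.

    Assumption: the search HTTP endpoint has the layout
    http://<host>:<port>/solr/<keyspace>.<table>/select.
    """
    query = urllib.parse.urlencode({"q": "*:*"})
    return f"http://{host}:{port}/solr/{core}/select?{query}"


def warm_up(host, core, attempts=20, timeout_s=600, pause_s=1.0,
            opener=urllib.request.urlopen):
    """Send repeated match-all queries and return the elapsed seconds of each.

    A query that fails or times out is recorded as None and the loop
    continues, because the first queries are expected to be slow while
    the token range filters are built.
    """
    url = build_warmup_url(host, core)
    timings = []
    for _ in range(attempts):
        started = time.monotonic()
        try:
            with opener(url, timeout=timeout_s) as response:
                response.read()  # drain the body so the request completes
            timings.append(time.monotonic() - started)
        except OSError:  # covers URLError, HTTPError, and socket timeouts
            timings.append(None)
        time.sleep(pause_s)
    return timings
```

For example, warm_up("10.0.0.5", "my_keyspace.my_table") sends 20 queries to a hypothetical node address. Once the returned timings settle to small values, the filters are likely warm. A long per-query timeout is used so that the slow first queries can finish rather than be cut off.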
Results
After the uptime ramp-up period, the node starts to receive distributed queries. Because the filters are already warmed up, timeouts should not occur.