OpsCenter can manage clusters containing several hundred nodes. When managing
very large clusters of up to 1000 nodes, adjusting the cluster configuration settings
improves performance.
Note: Lifecycle Manager can provision and manage up to 300 nodes per cluster within its UI. See Supported capabilities for more details.
When working with very large clusters, the performance of OpsCenter decreases with
the default settings. To improve performance, adjust the cluster settings to
increase the time period between polls of a cluster's nodes and token lists.
After adding a very large cluster to OpsCenter, change the following default
settings:
opscenterd.conf
The location of the opscenterd.conf file depends on the type of installation:
- Package installations: /etc/opscenter/opscenterd.conf
- Tarball installations: install_location/conf/opscenterd.conf
cluster_name.conf
The location of the cluster_name.conf file depends on the type of installation:
- Package installations: /etc/opscenter/clusters/cluster_name.conf
- Tarball installations: install_location/conf/clusters/cluster_name.conf
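The location rules above can be sketched as a small helper. This is an illustrative sketch only: `config_paths`, its parameters, and the `"package"`/`"tarball"` labels are hypothetical names, and `cluster_name` and `install_location` stand in for the placeholders used in the paths above.

```python
from pathlib import PurePosixPath


def config_paths(install_type: str, cluster_name: str, install_location: str = "") -> dict:
    """Return the expected locations of opscenterd.conf and cluster_name.conf.

    install_type is "package" or "tarball" (hypothetical labels for the two
    installation types listed in this document).
    """
    if install_type == "package":
        base = PurePosixPath("/etc/opscenter")
    elif install_type == "tarball":
        if not install_location:
            raise ValueError("tarball installations need install_location")
        base = PurePosixPath(install_location) / "conf"
    else:
        raise ValueError(f"unknown install type: {install_type!r}")
    return {
        "opscenterd": str(base / "opscenterd.conf"),
        "cluster": str(base / "clusters" / f"{cluster_name}.conf"),
    }


if __name__ == "__main__":
    # "/opt/opscenter" and "prod" are made-up example values.
    for name, path in config_paths("tarball", "prod", "/opt/opscenter").items():
        print(name, path)
```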
Procedure
- Open the relevant configuration file (see the file locations above) for editing.
- Increase the node list poll period to 30 minutes by setting the nodelist_poll_period option to 1800 under [collection]:
- [collection] nodelist_poll_period
- The interval in seconds OpsCenter waits to poll the nodes in a cluster. The default value is 30.
[collection]
nodelist_poll_period = 1800
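As a quick sanity check of the value above, the following sketch reads the setting back and converts it to minutes. It assumes the configuration file can be parsed by Python's standard `configparser`, which this document does not confirm; `nodelist_poll_minutes` and `SAMPLE` are hypothetical names used only for illustration.

```python
import configparser

# Sample text mirroring the setting shown above.
SAMPLE = """\
[collection]
nodelist_poll_period = 1800
"""


def nodelist_poll_minutes(conf_text: str, default_seconds: int = 30) -> float:
    """Return the node list poll period in minutes.

    Falls back to the documented default of 30 seconds when the option
    (or the whole [collection] section) is absent.
    """
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    seconds = parser.getint(
        "collection", "nodelist_poll_period", fallback=default_seconds
    )
    return seconds / 60


if __name__ == "__main__":
    print(nodelist_poll_minutes(SAMPLE))  # 1800 seconds expressed in minutes
    print(nodelist_poll_minutes(""))      # default of 30 seconds in minutes
```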
- If an agent is overloaded, increase http_timeout from its default value:
- [agents] http_timeout
- The timeout, in seconds, for an HTTP call to the agent. The default value is 10.
[agents]
http_timeout = 20
- Open the relevant configuration file for editing and adjust the following settings:
- [agents] not_seen_threshold
- The maximum time in seconds since the last agent status about a specific connection, such as stomp, was sent before that agent connection is considered down. This threshold also affects how long OpsCenter waits before marking node health as unknown. Default value: 180 seconds.
- [agents] http_poll_period
- The interval in seconds between attempts to poll agent HTTP health. Default value: 60 seconds.
- [ui] default_api_timeout
- The default timeout value in seconds for an API call from the OpsCenter UI to the OpsCenter API. The default value is 10. Some API calls require a timeout longer than 10 seconds. In those cases, the API call timeouts are scaled relative to the default_api_timeout (for example, 6 * default_api_timeout). Changing the default_api_timeout affects those timeouts accordingly.
[agents]
not_seen_threshold = 620
http_poll_period = 500
[ui]
default_api_timeout = 60
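To illustrate how the scaled API timeouts follow default_api_timeout, here is a minimal sketch that applies the 6x factor quoted in the description above. It assumes the file parses with Python's standard `configparser`; `scaled_api_timeout` and `SAMPLE_CONF` are hypothetical names, and the factor of 6 is only the example given in this document, not a complete list of the scaling factors OpsCenter uses.

```python
import configparser

# Sample text mirroring the settings shown above.
SAMPLE_CONF = """\
[agents]
not_seen_threshold = 620
http_poll_period = 500

[ui]
default_api_timeout = 60
"""


def scaled_api_timeout(conf_text: str, factor: int = 6, default: int = 10) -> int:
    """Return factor * default_api_timeout, in seconds.

    Uses the documented default of 10 seconds when the option is not set.
    """
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    base = parser.getint("ui", "default_api_timeout", fallback=default)
    return factor * base


if __name__ == "__main__":
    print(scaled_api_timeout(""))           # scaled timeout with the default of 10
    print(scaled_api_timeout(SAMPLE_CONF))  # scaled timeout with the value 60
```

With the default of 10 seconds a 6x call times out after 60 seconds; raising default_api_timeout to 60 stretches that same call to 360 seconds.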
- Use the OPSC_JVM_OPTS environment variable to override the default parameters for the OpsCenter JVM. The following command doubles the heap size to 4096m (4 GB):
export OPSC_JVM_OPTS=-Xmx4096m
See Configuring the OpsCenter JVM for additional information.
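The heap value in the command above can be checked with a small conversion sketch. `max_heap_mib` is a hypothetical helper, not part of OpsCenter; it only interprets the standard JVM `-Xmx` size suffixes (`k`, `m`, `g`, or none for bytes) in a string such as the one exported above.

```python
import re
from typing import Optional

# Multipliers that convert each -Xmx suffix to MiB ("" means plain bytes).
_TO_MIB = {"": 1 / (1024 * 1024), "k": 1 / 1024, "m": 1.0, "g": 1024.0}


def max_heap_mib(jvm_opts: str) -> Optional[float]:
    """Return the -Xmx value found in jvm_opts, in MiB, or None if absent."""
    match = re.search(r"-Xmx(\d+)([kKmMgG]?)", jvm_opts)
    if match is None:
        return None
    size = int(match.group(1))
    unit = match.group(2).lower()
    return size * _TO_MIB[unit]


if __name__ == "__main__":
    heap = max_heap_mib("-Xmx4096m")
    print(heap, "MiB =", heap / 1024, "GiB")
```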
- Optional: If you continually receive OutOfMemory errors, consider Configuring the DataStax Agent JVM.
- Restart OpsCenter.