DataStax Agent configuration
Configure DataStax agents with options in the address.yaml file.
The address.yaml configuration file
The address.yaml file contains configuration options for the DataStax Agent.
Most of these properties can be set in the [agent_config] section of cluster_name.conf on the opscenterd machine, which automatically propagates the properties to all agents. Some properties, or some environments, might require setting the values directly in address.yaml on the applicable agents. When installing agents manually, stomp_interface is the only property that must be explicitly configured in most environments. When installing agents automatically, stomp_interface is configured for you.
For more information about viewing agent status and troubleshooting agent issues, see Agents View.
address.yaml
The location of the address.yaml file depends on the type of installation:
- Package installations: /var/lib/datastax-agent/conf/address.yaml
- Tarball installations: install_location/conf/address.yaml
opscenterd.conf
The location of the opscenterd.conf file depends on the type of installation:
- Package installations: /etc/opscenter/opscenterd.conf
- Tarball installations: install_location/conf/opscenterd.conf
cluster_name.conf
The location of the cluster_name.conf file depends on the type of installation:
- Package installations: /etc/opscenter/clusters/cluster_name.conf
- Tarball installations: install_location/conf/clusters/cluster_name.conf
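The locations listed above can be collected into a small lookup helper, which may be handy in provisioning scripts. This is a minimal sketch, not part of OpsCenter: the function name config_paths and the parameters agent_location and opscenter_location are hypothetical, and the sketch assumes that the agent and opscenterd tarballs are unpacked into separate install locations. The paths themselves are the ones listed above.

```python
from pathlib import PurePosixPath


def config_paths(install_type, cluster_name, agent_location=None, opscenter_location=None):
    """Return the expected locations of the three configuration files.

    install_type is "package" or "tarball". The two *_location parameters
    are hypothetical names for the tarball install_location of the agent
    and of opscenterd; they are only needed for tarball installs.
    """
    if install_type == "package":
        agent_conf = PurePosixPath("/var/lib/datastax-agent/conf")
        opscenter_conf = PurePosixPath("/etc/opscenter")
    elif install_type == "tarball":
        if agent_location is None or opscenter_location is None:
            raise ValueError("tarball installs require both install locations")
        agent_conf = PurePosixPath(agent_location) / "conf"
        opscenter_conf = PurePosixPath(opscenter_location) / "conf"
    else:
        raise ValueError(f"unknown install type: {install_type!r}")

    return {
        "address.yaml": str(agent_conf / "address.yaml"),
        "opscenterd.conf": str(opscenter_conf / "opscenterd.conf"),
        "cluster_name.conf": str(opscenter_conf / "clusters" / f"{cluster_name}.conf"),
    }


if __name__ == "__main__":
    # Print the package-install locations for a cluster named "Test_Cluster".
    for name, path in config_paths("package", "Test_Cluster").items():
        print(f"{name}: {path}")
```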
Configuration options
- use_ssl
- Whether or not to use SSL communication between the agent and opscenterd. Affects both the STOMP connection and agent HTTP server. Corresponds to [agents].use_ssl in opscenterd.conf. Setting this option to true turns on SSL connections. Example:
use_ssl: true
- stomp_port
- The stomp_port used by opscenterd. Example:
stomp_port: 61620
- stomp_interface
- Reachable IP address of the opscenterd machine. The connection made will be on stomp_port. Example:
stomp_interface: 127.0.0.1
- local_interface
- The IP used to identify the node. If broadcast_address is set in cassandra.yaml, this should be the same as that; otherwise, it is typically the same as listen_address in cassandra.yaml. A good check is to confirm that this address is the same as the address that nodetool ring outputs. Example:
local_interface: 172.10.0.2
- agent_rpc_interface
- The IP that the agent HTTP server listens on. In a multiple region deployment, this is typically a private IP. Default: Matches rpc_interface from cassandra.yaml. Example:
agent_rpc_interface: 172.10.0.2
- agent_rpc_broadcast_address
- The IP that the central OpsCenter process uses to connect to the DataStax agent. Default: First available resolvable address in this order: broadcast_rpc_address, rpc_address, and listen_address from cassandra.yaml. Example:
agent_rpc_broadcast_address: 172.10.0.2
- opscenter_ssl_keystore
- The SSL keystore location that the agents use to connect to opscenterd. Example:
opscenter_ssl_keystore: /etc/opscenter/conf/.keystore
- opscenter_ssl_keystore_password
- The SSL keystore password that the agents use to connect to opscenterd. Example:
opscenter_ssl_keystore_password: keystore-pass
[This field may be encrypted for additional security.]
- opscenter_ssl_truststore
- The path to the truststore file that the agents use to connect to opscenterd. Example:
opscenter_ssl_truststore: /etc/opscenter/conf/.truststore
- poll_period
- The length of time, specified in seconds, between attempts to poll metrics. Example:
poll_period: 60
- disk_usage_update_period
- The length of time, in seconds, to wait between attempts to poll the disk for usage. Example:
disk_usage_update_period: 60
- rollup_rate
- Maximum number of metrics that can be saved to Cassandra over the rollup_rate_unit period of time. This should be at least ([#tables] * 40) + 200 per minute. Default: 200 (200 per second with the default rollup_rate_unit). Example:
rollup_rate: 200
- rollup_rate_unit
- Unit of time for rollup_rate. Choose from microsecond, millisecond, second, minute, hour, day, or month. Default: second Example:
rollup_rate_unit: second
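The sizing guideline for rollup_rate, ([#tables] * 40) + 200 per minute, can be converted to whichever rollup_rate_unit is configured. The following is a minimal sketch of that arithmetic only; the function name min_rollup_rate is hypothetical, and rounding up to a whole number is an assumption rather than documented agent behavior.

```python
# Seconds per rollup_rate_unit for the units covered by this sketch.
SECONDS_PER_UNIT = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}


def min_rollup_rate(table_count, unit="second"):
    """Smallest rollup_rate that meets the (tables * 40) + 200 per-minute guideline."""
    per_minute = table_count * 40 + 200
    scaled = per_minute * SECONDS_PER_UNIT[unit]
    # Integer ceiling division by 60 (assumption: round up, never down).
    return -(-scaled // 60)


if __name__ == "__main__":
    # 100 tables need 4200 metrics per minute, which is 70 per second.
    print(min_rollup_rate(100, "second"))
    print(min_rollup_rate(100, "minute"))
```

By the same formula, the default of 200 per second (12000 per minute) meets the guideline for up to 295 tables.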
- jmx_host
- Host used to connect to local JMX server. The default setting is localhost. This information will be sent by opscenterd for convenience, but can be configured locally as needed. Example:
jmx_host: 127.0.0.1
- jmx_port
- Port used to connect to local JMX server. The default setting is 7199. This information will be sent by opscenterd for convenience, but can be configured locally as needed. Example:
jmx_port: 7199
- jmx_user
- The username used to connect to the local JMX server. Example:
jmx_user: jmx-username
- jmx_pass
- The password used to connect to the local JMX server. Example:
jmx_pass: jmx-password
[This field may be encrypted for additional security.]
- jmx_queue_poll_timeout
- The number of seconds to wait for an available JMX connection before timing out. Default: 10. Example:
jmx_queue_poll_timeout: 10
- status_reporting_interval
- The length of time, in seconds, between sending agent health information. Example:
status_reporting_interval: 20
- ec2_metadata_api_host
- The EC2 metadata API host, used to determine information about this node if it is running on EC2. Example:
ec2_metadata_api_host: 169.254.169.254
- metrics_enabled
- Whether or not to collect and store metrics for the local node. Setting this option to false turns off metrics collection. Default: true. Example:
metrics_enabled: true
- jmx_metrics_threadpool_size
- The size of the thread pool used for collecting metrics over JMX. Example:
jmx_metrics_threadpool_size: 6
- metrics_ignored_keyspaces
- A comma-separated list of keyspaces ignored by metrics collection. Example:
metrics_ignored_keyspaces: ks1, ks2, ks3
- metrics_ignored_column_families
- A comma-separated list of tables (formerly referred to as column families) ignored by metrics collection. Example:
metrics_ignored_column_families: ks1.cf1, ks1.cf2, ks2.cf1
- metrics_ignored_solr_cores
- A comma-separated list of Solr cores ignored by metrics collection. Example:
metrics_ignored_solr_cores: ks1.cf1, ks1.cf2, ks2.cf1
- hosts
- The DataStax Enterprise node or nodes responsible for storing OpsCenter data. By default, this is the local node, but it can be configured to store data on a separate cluster. The hosts option accepts an array of strings specifying the IP addresses of the node or nodes, for example ["1.2.3.4"] or ["1.2.3.4", "1.2.3.5"]. Example:
hosts: ["127.0.0.1"]
- cassandra_port
- Port used to connect to the storage Cassandra node. This is the native transport port. Example:
cassandra_port: 9042
- thrift_port
- Port used to connect to storage thrift server. The default setting is 9160. This information will be sent by opscenterd for convenience, but can be configured locally as needed. Example:
thrift_port: 9160
- cassandra_user
- The username used to connect to storage Cassandra when authentication is enabled. Example:
cassandra_user: cassandra
- cassandra_pass
- The password used to connect to storage cassandra when authentication is enabled. Example:
cassandra_pass: cassandra
[This field may be encrypted for additional security.]
- max_reconnect_time
- The maximum time, in milliseconds, that the agent waits between Cassandra reconnect attempts. Example:
max_reconnect_time: 15000
- max_pending_repairs
- The maximum number of repairs that may be pending. Exceeding this number blocks new repairs. Example:
max_pending_repairs: 5
- ssl_keystore
- The SSL keystore location for the storage cluster that agents use to connect to CQL. Example:
ssl_keystore: /etc/dse/conf/.keystore
- ssl_keystore_password
- The SSL keystore password for the storage cluster that agents use to connect to CQL. Example:
ssl_keystore_password: keystore-pass
[This field may be encrypted for additional security.]
- monitored_cassandra_port
- Port used to connect to the monitored Cassandra node. This is the native transport port. Example:
monitored_cassandra_port: 9042
- monitored_thrift_port
- Port used to connect to monitored thrift server. The default setting is 9160. This information will be sent by opscenterd for convenience, but can be configured locally as needed. Example:
monitored_thrift_port: 9160
- monitored_cassandra_user
- The username used to connect to monitored Cassandra when authentication is enabled. Example:
monitored_cassandra_user: cassandra
- monitored_cassandra_pass
- The password used to connect to monitored cassandra when authentication is enabled. Example:
monitored_cassandra_pass: cassandra-pass
[This field may be encrypted for additional security.]
- monitored_ssl_keystore
- The SSL keystore location for the monitored cluster that agents use to connect to CQL. Example:
monitored_ssl_keystore: /etc/dse/conf/.keystore
- monitored_ssl_keystore_password
- The SSL keystore password for the monitored cluster that agents use to connect to CQL. Example:
monitored_ssl_keystore_password: keystore-pass
[This field may be encrypted for additional security.]
- kerberos_service
- The Kerberos service name to use when using Kerberos authentication within DSE. Example:
kerberos_service: cassandra-kerberos
- kerberos_keytab_location
- The Kerberos keytab location when using Kerberos authentication within DSE. Example:
kerberos_keytab_location: /path/to/keytab.keytab
- kerberos_client_principal
- The Kerberos client principal to use when using Kerberos authentication within DSE. Example:
kerberos_client_principal: cassandra@hostname
- storage_keyspace
- The keyspace that the agent will use to store data. Example:
storage_keyspace: OpsCenter
- cassandra_install_location
- The base directory where DataStax Enterprise or Cassandra is installed. When not set, the agent attempts to auto-detect the location but cannot do so in all cases. Example:
cassandra_install_location: /usr/share/dse
- cassandra_log_location
- The directory in which DSE logs reside. This is only used for the diagnostics tarball, and should only be set if these logs are in a location other than the default. Example:
cassandra_log_location: /var/log/cassandra
- cassandra_binary_location
- The directory that holds the Cassandra binaries (cqlsh, nodetool, and sstableloader). When not set, the agent attempts to auto-detect the location. Example:
cassandra_binary_location: /usr/bin
- cassandra_conf_location
- The directory that holds the Cassandra configuration files (cassandra.yaml, cassandra-env.sh). When not set, the agent attempts to auto-detect the location. Example:
cassandra_conf_location: /etc/dse/cassandra
- dse_env_location
- The location of the directory that holds dse-env.sh. When not set, the agent attempts to auto-detect the location. Example:
dse_env_location: /etc/dse
- dse_binary_location
- The location of the directory that holds dsetool. When not set, the agent attempts to auto-detect the location. Example:
dse_binary_location: /usr/bin
- dse_conf_location
- The location of the directory that holds dse.yaml. When not set, the agent attempts to auto-detect the location. Example:
dse_conf_location: /etc/dse
- spark_conf_location
- The location of the directory that holds spark-env.sh. When not set, the agent attempts to auto-detect the location. Example:
spark_conf_location: /etc/dse/spark
- spark_log_location
- The location of the directory that holds Spark logs. When not set, the agent attempts to auto-detect the location. Example:
spark_log_location: /var/log/spark
- solr_log_location
- The location of the directory that holds Solr logs. When not set, the agent attempts to auto-detect the location. Example:
solr_log_location: /var/log/cassandra
- hadoop_conf_location
- The location of the directory that holds hadoop-env.sh. When not set, the agent attempts to auto-detect the location. Example:
hadoop_conf_location: /etc/dse/hadoop
- hadoop_log_location
- The location of the directory that holds Hadoop logs. When not set, the agent attempts to auto-detect the location. Example:
hadoop_log_location: /var/log/hadoop/userlogs
- cassandra_rpc_interface
- When unspecified, the agent will attempt to determine cassandra rpc_address by reading cassandra.yaml for rpc_address. When specified, this agent lookup is skipped and the specified value is used instead. Example:
cassandra_rpc_interface: 172.10.0.2
- api_port
- The port used for the http api endpoint. Example:
api_port: 61621
- runs_sudo
- Whether the DataStax Agent runs using sudo. When set to false, the agent does not use sudo and the agent user runs without elevated privileges. When set to true, the agent uses sudo and runs with elevated privileges. Default: true. Example:
runs_sudo: true
- destinations
- Backup and restore destination definitions. Each destination is an entry in the map with the destination id as the key and a map of options specific to the type of destination. Example:
destinations: {"4798b1cdb3a145f0b4fa8ef7b3e20309" {:throttle_bytes_per_second "0", :path "s3-bucket", :server_side_encryption "False", :provider "s3", :access_key "key", :access_secret "secret"}}
- restore_req_update_period
- The frequency in seconds with which status updates are sent to opscenterd during Restore operations in the Backup Service. Default: 60. Example:
restore_req_update_period: 60
- backup_staging_dir
- The directory used for staging commit logs to be backed up. Example:
backup_staging_dir: /var/lib/datastax-agent/clogs/
- tmp_dir
- The location of the Backup Service staging directory for backups. The default location is /var/lib/datastax-agent/tmp. Example:
tmp_dir: /var/lib/datastax-agent/tmp/
- remote_backup_retries
- The number of attempts to make when file download fails during a restore. Default: 3. Example:
remote_backup_retries: 3
- remote_backup_timeout
- The timeout in milliseconds for the connection used to push backups to remote destinations. Default: 1000. Example:
remote_backup_timeout: 1000
- remote_backup_retry_delay
- The delay in milliseconds between remote backup retries. Default: 5000. Example:
remote_backup_retry_delay: 5000
- remote_verify_initial_delay
- Initial delay in milliseconds to wait before checking if a file was successfully uploaded during a backup operation. This configuration option works in conjunction with the remote_verify_max option to distinguish between broken and tardy backups when cleaning up SSTables. The remote_verify_initial_delay value doubles each time a file transfer validation failure occurs until the value exceeds the remote_verify_max value. Default: 1000 (1 second). Example:
remote_verify_initial_delay: 1000
- remote_verify_max
- The maximum time period, in milliseconds, to wait when a file upload has completed but the file is still unreadable from the remote destination. When this delay is exceeded, the transfer is considered failed. This configuration option works in conjunction with the remote_verify_initial_delay option to distinguish between broken and tardy backups when cleaning up SSTables. Default: 30000 (30 seconds). Example:
remote_verify_max: 300000
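The interaction between remote_verify_initial_delay and remote_verify_max can be illustrated by listing the waits that the doubling rule produces. This is a sketch of the described behavior under stated assumptions, not the agent's implementation: the function name verify_delays is hypothetical, and it assumes that a wait equal to remote_verify_max is still allowed and that the transfer is treated as failed once the next doubled wait would exceed it.

```python
def verify_delays(initial_delay_ms=1000, max_delay_ms=30000):
    """List the successive verification waits, in milliseconds.

    Starts at initial_delay_ms and doubles after each failed validation.
    Stops once the next wait would exceed max_delay_ms (assumed boundary).
    """
    if initial_delay_ms <= 0:
        raise ValueError("initial_delay_ms must be positive")
    delays = []
    delay = initial_delay_ms
    while delay <= max_delay_ms:
        delays.append(delay)
        delay *= 2
    return delays


if __name__ == "__main__":
    # With the documented defaults, five waits fit under the 30000 ms cap.
    print(verify_delays())
    # A larger cap allows more doublings before the transfer is given up on.
    print(verify_delays(1000, 300000))
```

Under these assumptions, raising remote_verify_max adds more doublings before a transfer is considered failed, while raising remote_verify_initial_delay reduces the number of checks within the same cap.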
- restore_on_transfer_failure
- When set to true, a failed file transfer from the remote destination will not halt the restore process. A future restore attempt uses any successfully transferred files. Default: false. Example:
restore_on_transfer_failure: false
- backup_file_queue_max
- The maximum number of files that may be queued for an upload to a remote destination. Increasing this number consumes more memory. Default: 10000. Example:
backup_file_queue_max: 10000
- remote_backup_region
- The AWS region to use for remote backup transfers. Default: us-west-1. Example:
remote_backup_region: us-west-1
- max_file_transfer_attempts
- The maximum number of attempts to upload a file or create a remote destination. Default: 30. Example:
max_file_transfer_attempts: 30
- sstableloader_max_heap_size
- The maximum heap size used by the sstableloader during restore operations. Only supported with DSE 4.8.4+. Default: 256M. Example:
sstableloader_max_heap_size: 256M
- max-seconds-to-sleep
- When stream throttling is configured in Backup Service transfers to or from a remote destination, this setting acts as a cap on how long to sleep when throttling. The cap prevents prematurely closing connections due to inactivity. Default: 25 (seconds). Example:
max-seconds-to-sleep: 25
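The effect of the max-seconds-to-sleep cap can be sketched with a simple rate limiter. This is an illustration only, not the agent's throttling code: the function name throttle_sleep_seconds and its parameters are hypothetical, and both the pacing formula and the treatment of a throttle of 0 as unthrottled are assumptions.

```python
def throttle_sleep_seconds(bytes_sent, elapsed_seconds, throttle_bytes_per_second,
                           max_seconds_to_sleep=25):
    """Return how long a transfer should pause to stay under the throttle.

    The pause is capped at max_seconds_to_sleep so that an idle connection
    is not closed for inactivity while the transfer waits.
    """
    if throttle_bytes_per_second <= 0:
        # Assumption: a non-positive throttle means no throttling.
        return 0.0
    # Time the transfer should have taken at the throttled rate.
    target_elapsed = bytes_sent / throttle_bytes_per_second
    needed = max(0.0, target_elapsed - elapsed_seconds)
    return min(needed, float(max_seconds_to_sleep))


if __name__ == "__main__":
    # 1000 bytes sent in 1 s at 10 B/s would need a 99 s pause; the cap limits it to 25 s.
    print(throttle_sleep_seconds(1000, 1, 10))
    # 100 bytes sent in 1 s at 50 B/s needs only a 1 s pause, below the cap.
    print(throttle_sleep_seconds(100, 1, 50))
```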
- read-buffer-size
- The buffer size to read off the disk. Increasing this number may improve transfer speed but will consume more memory. Example:
read-buffer-size: 10000000
- write-buffer-size
- The buffer size to write to the remote destination. Increasing this number may improve transfer speed, but will limit the ability of the throttler to slow transfers. Example:
write-buffer-size: 100000
- unthrottled-default
- A very large number used for bytes per second if no throttle is selected. Default: 10000000000. Example:
unthrottled-default: 10000000000
- trace_delay
- The time in milliseconds to wait between issuing a query to trace and fetching trace events in the Performance Service Slow Query panel. Default: 300. Example:
trace_delay: 300
- multipart-chunk-size
- The chunk size used for ec2 s3 file transfers in bytes. Example:
multipart-chunk-size: 5000000
- support_shell_timeout
- The number of seconds to wait for a shell process such as nodetool to run before timing out. This setting is only used for generating a diagnostic tarball. Default: 30. Example:
support_shell_timeout: 30
- graphite_host
- Setting graphite_host enables the forwarding of metrics to a Graphite server at the given address. Leaving graphite_host blank disables forwarding metrics to the Graphite server. Example:
graphite_host: graphite.myhost.com
- graphite_port
- Port for graphite's plaintext protocol. Example:
graphite_port: 2003
- graphite_prefix
- A prefix to insert metrics under. Example:
graphite_prefix: opscenter
- slow_query_past
- How far into the past in milliseconds to look for slow queries. Default: 3600000 (1 hour). Example:
slow_query_past: 3600000
- slow_query_refresh
- Time in seconds between slow query refreshes. Default: 5. Example:
slow_query_refresh: 5
- slow_query_fetch_size
- The limit to how many slow queries are fetched. Default: 500. Example:
slow_query_fetch_size: 500
- slow_query_ignore
- A list of keyspaces that the performance service slow query log will ignore. Default: ["OpsCenter" "dse_perf"] Example:
slow_query_ignore: ["OpsCenter" "dse_perf"]
- config_encryption_active
- Specifies whether opscenter should attempt to decrypt sensitive config values. Default: False
- config_encryption_key_name
- Filename to use for the encryption key. If a custom name is not specified, opsc_system_key is used by default. Example:
config_encryption_key_name: opsc_system_key
- config_encryption_key_path
- Path where the encryption key should be located. If unspecified, the directory of address.yaml is used by default. Example:
config_encryption_key_path: /var/lib/datastax-agent/conf/
- running-request-cache-size
- Size of the running requests cache. Example:
running-request-cache-size: 500
- finished-request-cache-size
- Size of the finished requests cache. Example:
finished-request-cache-size: 100
- tcp_response_timeout
- The TCP response timeout used for JMX, specified in milliseconds. Example:
tcp_response_timeout: 120000
- pong_timeout_ms
- The number of milliseconds to wait for a pong reply from opscenterd over STOMP before timing out the ping. Example:
pong_timeout_ms: 5000