Configuration profiles overview

Define the required configuration profiles to prevent configuration drift for DataStax Enterprise clusters. A configuration profile enforces uniform configuration at the cluster, datacenter, or node level.

Purpose of Configuration Profiles 

A configuration profile lets you define and centrally manage consistent configuration settings, which prevents configuration drift. Configuration drift happens over time when changes are made manually rather than through automation and are applied inconsistently. Configuration drift contributes to failures in high availability and disaster recovery efforts. If a configuration change is made outside of the Lifecycle Manager (LCM) application (not recommended), running a configuration job within LCM overwrites the configuration on the job targets, ensuring that the clusters, datacenters, and nodes are running as specified in the applied configuration profiles.

Inheritance and Precedence 

Configuration profiles inherit intelligently within the cluster topology. For example, if a config profile is not explicitly specified at the datacenter or node level, the configuration profile is inherited from the cluster level. When creating the cluster topology model, configuration profiles for initial DSE install jobs or subsequent configure jobs can be applied at the cluster, datacenter, or node level. Config profiles for upgrade jobs are applied only at the datacenter or node level. A configuration profile at the node level takes precedence over datacenter or cluster level profiles. Define configuration profiles that reflect the requirements of the workload node type in a datacenter.

When a configuration job is run, configuration profiles specified at different topology levels are merged in a granular manner. For example, consider a cluster with a config profile defined and applied at the cluster level that specifies:
  • the G1 Garbage Collector (g1gc)
  • a max heap size of 16 G
  • no explicit commit log directory, relying instead on the default value of /var/lib/cassandra/commitlog
The cluster has two datacenters, DC1 and DC2:
  • DC1 has no config profile of its own and therefore inherits its config profile from its cluster.
  • DC2 has a defined config profile at the DC level that specifies a maximum heap size of 32 G, and a commit log directory of /cassandra_data/commitlog.
When the configuration job runs, the resulting configuration of nodes in each datacenter is as follows:
  • All nodes in DC1 inherit the cluster settings: g1gc; 16 G max heap size; and the default commit log directory of /var/lib/cassandra/commitlog.
  • Nodes in DC2 inherit from the cluster and also override cluster settings with the datacenter-level config profile: g1gc (inherited from cluster); 32 G max heap size (DC config profile takes precedence over an explicit setting in a cluster-level config profile); /cassandra_data/commitlog commit log directory (DC config profile takes precedence over an implicit default inherited from a cluster-level config profile).
The inheritance and precedence of config profiles keep a cluster consistent by inheriting as much as possible from a cluster-level config profile, while providing the flexibility to specify only the settings that differ in higher-precedence config profiles applied at the more granular datacenter and node levels.
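The merge in the example above can be sketched in a few lines of Python. This is an illustrative model only, not LCM's implementation: the function name resolve_settings, the setting keys, and the dictionary layout are hypothetical, and the values are taken from the example.

```python
# Illustrative model of profile merging; this is not LCM code.
# The setting keys are hypothetical labels for the values in the example.
DEFAULTS = {"commitlog_directory": "/var/lib/cassandra/commitlog"}


def resolve_settings(*profiles):
    """Merge profiles ordered from lowest to highest precedence.

    Pass the cluster profile first, then datacenter, then node.
    A level without a profile is passed as None and contributes nothing.
    """
    merged = dict(DEFAULTS)
    for profile in profiles:
        if profile:
            merged.update(profile)  # later (more granular) levels override
    return merged


cluster_profile = {"garbage_collector": "g1gc", "max_heap_size": "16G"}
dc2_profile = {
    "max_heap_size": "32G",
    "commitlog_directory": "/cassandra_data/commitlog",
}

dc1_nodes = resolve_settings(cluster_profile, None)         # DC1 has no profile
dc2_nodes = resolve_settings(cluster_profile, dc2_profile)  # DC2 overrides two settings
```

In this model, dc1_nodes carries only cluster settings and the default commit log directory, while dc2_nodes keeps g1gc from the cluster and takes the heap size and commit log directory from the datacenter profile.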
Important: For upgrade jobs, configuration profiles do not inherit settings across DSE versions. Inheritance from a parent level is prevented in an LCM cluster that contains multiple DSE versions. For example, if a config profile is set at the cluster level with dse-version=5.0.2 and max-heap-size=16GB, and another config profile is set at the datacenter level of the cluster with dse-version=5.0.3 and no explicit max-heap-size specified, the nodes in the affected datacenter get the default value of max-heap-size for 5.0.3. Unlike in install or configure jobs, the nodes do not inherit the max-heap-size setting from the cluster config profile during an upgrade job. For upgrade jobs, the config profile at the lowest topology level takes precedence and a warning is posted in the job events.
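The upgrade-job behavior can be modeled the same way. The sketch below is hypothetical and is not LCM code: the function name resolve_for_upgrade, the profile layout, and the placeholder default value are assumptions for illustration. It also assumes that a parent profile with the same DSE version is still inherited from, which the text above implies but does not state.

```python
# Illustrative model only; this is not LCM code. All names are hypothetical.

def resolve_for_upgrade(levels, version_defaults):
    """Resolve settings for an upgrade job.

    levels: profiles ordered cluster, datacenter, node; None where unset.
    version_defaults: maps a DSE version to its default settings
                      (placeholder values, not real DSE defaults).
    Returns a (settings, warnings) tuple.
    """
    defined = [p for p in levels if p is not None]
    lowest = defined[-1]  # the lowest topology level takes precedence
    version = lowest["dse-version"]
    settings = dict(version_defaults.get(version, {}))
    warnings = []
    for parent in defined[:-1]:
        if parent["dse-version"] == version:
            # Assumption: same-version parents are still inherited from.
            settings.update(parent["settings"])
        else:
            warnings.append(
                f"not inheriting from profile with dse-version={parent['dse-version']}"
            )
    settings.update(lowest["settings"])
    return settings, warnings


cluster_profile = {"dse-version": "5.0.2", "settings": {"max-heap-size": "16GB"}}
dc_profile = {"dse-version": "5.0.3", "settings": {}}
version_defaults = {"5.0.3": {"max-heap-size": "<5.0.3 default>"}}  # placeholder

settings, warnings = resolve_for_upgrade(
    [cluster_profile, dc_profile, None], version_defaults
)
```

Here settings["max-heap-size"] resolves to the 5.0.3 placeholder default rather than 16GB, and warnings holds one entry, mirroring the warning posted in the job events.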

Configuration Profile Files 

Each configuration profile is specific to a recent supported version of DataStax Enterprise (5.0 and later). A configuration profile is composed of multiple configuration files for configuring features of DataStax Enterprise clusters.

LCM Config Profile config files for DSE

Configuration profiles allow customizing settings for the following configuration files:
  • Cassandra section:
    Note: When you add a configuration profile, DSE authentication is enabled by default for all supported versions of DataStax Enterprise. DSE Authenticator is enabled in dse.yaml for DSE versions 5.0 and later. For more information, see Managing DSE Security using LCM.

    Every configuration option in cassandra.yaml and dse.yaml is editable, while other configuration files use a template system that exposes only frequently used settings. Contact DataStax Support to request additional configuration options.

  • Spark section:
    • logback-sparkR.xml
    • dse-spark-env.sh
    • logback-spark.xml
    • spark-defaults.conf
    • spark-env.sh
    • logback-spark-executor.xml
    • logback-spark-server.xml
    • hive-site.xml
    • spark-daemon-defaults.conf

    For more information about Spark, see Configuring Spark and Configuring Spark logging options.

  • Lifecycle Manager section:
    • Package Proxy: Accelerate package downloads, or keep DataStax Enterprise clusters isolated from the internet.
    • Java Setup: Automatically manage JRE installs and JCE Policy files.
    Note: Lifecycle Manager does not currently manage the commitlog_archiving.properties file, which configures commit log archiving and point-in-time (PIT) restore for the Backup Service. This file is managed from within the Backup Service instead.
Note: The Lifecycle Manager data (cluster topology models, configuration profiles, credentials, repositories, job history, and so forth) is stored in the lcm.db database. Your organization is responsible for backing up the lcm.db database. You must also configure failover to mirror the lcm.db database.