Configuration Profiles Overview

Purpose of Configuration Profiles

Define the required configuration profiles to prevent configuration drift for DataStax Enterprise clusters. A configuration profile enforces uniform configuration at the cluster, datacenter, or node level.

A configuration profile allows defining and centrally managing consistent configuration settings, which prevents configuration drift. Configuration drift happens over time as changes are made on a manual rather than an automated basis, and the changes are applied in an inconsistent manner. Configuration drift contributes to failures with high availability and disaster recovery efforts. If a configuration change is made outside of the Lifecycle Manager application, running a configuration job within LCM overwrites the configuration on the job targets; ensuring that the clusters, datacenters, and nodes are running as specified in the applied configuration profiles.

Inheritance and Precedence

Configuration profiles inherit intelligently within the cluster topology. For example, if a config profile is not explicitly specified at the datacenter or node level, the configuration profile is inherited from the cluster level. When creating the cluster topology model, defined configuration profiles can be applied at the cluster, datacenter, or node level. A configuration profile at the node level takes precedence over datacenter or cluster level profiles. Define configuration profiles that reflect the requirements of the workload node type in a datacenter.

When a configuration job is run, configuration profiles specified at different topology levels are merged in a granular manner. For example, consider a cluster with a config profile defined and applied at the cluster level that specifies:

use the G1 Garbage Collector (g1gc)
use a max heap size of 16 G
Does not explicitly specify a commitlog directory, instead relying on the default value of /var/lib/cassandra/commitlog.

The cluster has two datacenters; DC1 and DC2:

DC1 has no config profile of its own and therefore inherits its config profile from its cluster.
DC2 has a defined config profile at the DC level that specifies a maximum heap size of 32 G, and a commitlog directory of /cassandra_data/commitlog.

When the configuration job runs, the resulting configuration of nodes in each datacenter is as follows:

All nodes in DC1 inherit cluster settings: g1gc; 16 G max heap size; and uses the default commitlog directory of /var/lib/cassandra/commitlog commitlog directory.
Nodes in DC2 inherit from the cluster and also override cluster settings with the datacenter-level config profile: g1gc (inherited from cluster); 32 G max heap size (DC config profile takes precedence over an explicit setting in a cluster-level config profile); /cassandra_data/commitlog commitlog directory (DC config profile takes precedence takes precedence over an implicit default inherited from a cluster-level config profile).

The inheritance and precedence of config profiles allows keeping a cluster consistent by inheriting as much as possible from a cluster-level config profile, while also providing the flexibility of specifying only the granular settings that differ in higher precedence within config profiles applied at the lower, more granular datacenter and node levels.

Configuration Profile Files

Each configuration profile is specific to a recent version of DataStax Enterprise (4.7 and later). A configuration profile is composed of multiple configuration files for configuring features of DataStax Enterprise clusters:

Configuration profiles allow customizing the following configuration files:

Cassandra section:
- cassandra.yaml
- dse.yaml
- cassandra-env.sh: configure Cassandra environment settings for garbage collection; heap size
- logback.xml
Note: When you add a configuration profile, DSE authentication is enabled by default for all supported versions of DataStax Enterprise. DSE Authenticator is enabled for DSE version 5.0. For more information, see Managing DSE Security using LCM.
DSE Hadoop section:
- hive-site.xml
Spark section:
- dse-spark-env.sh
- logback-spark-executor.xml
- logback-spark-server.xml
- logback-spark.xml
- spark-defaults.conf
- spark-env.sh
Lifecycle Manager section:
- Java Setup: Automatically manages JRE installs and JCE Policy files.
- Package Proxy: Accelerate package downloads or isolate DataStax Enterprise clusters offline from the internet.

Every configuration option in cassandra.yaml and dse.yaml is editable, while other configuration files use a template system that exposes only frequently used settings. Contact DataStax Support to request additional configuration options.

A configuration file explicitly not managed at this time by Lifecycle Manager is commitlog_archiving.properties, which is used for configuring commitlog archive and PIT restore for the Backup Service. This file is managed instead from within the Backup Service.

Note: The data (cluster topology models, configuration profiles, credentials, repositories, job history, and so forth) for Lifecycle Manager is stored in the database. Your organization is responsible for backing up the lcm.db database. You must also configure failover to mirror the lcm.db.

lcm.db

The location of the Lifecycle Manager database lcm.db depends on the type of installation:

Package installations: /var/lib/opscenter/lcm.db
Tarball installations: install_location/lcm.db

Note: The data (cluster topology models, configuration profiles, credentials, repositories, job history, and so forth) for Lifecycle Manager is stored in the lcm.db database. Your organization is responsible for backing up the database. You must also configure failover to mirror the lcm.db.