Configuration profiles overview

Define the required configuration profiles to prevent configuration drift for DataStax Enterprise (DSE) clusters. A configuration profile enforces uniform configuration at the cluster, datacenter, or node level.

Purpose of Configuration Profiles

A configuration profile lets you define and centrally manage consistent configuration settings, which prevents configuration drift. Configuration drift happens over time when changes are made manually rather than through automation and are applied inconsistently. Configuration drift contributes to failures in high availability and disaster recovery efforts. If a configuration change is made outside of the Lifecycle Manager (LCM) application, running a configuration job within LCM overwrites the configuration on the job targets, ensuring that the clusters, datacenters, and nodes are running as specified in the applied configuration profiles.
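The drift-and-overwrite behavior described above can be illustrated with a minimal Python sketch. It is not LCM code: the setting names (max_heap_size, garbage_collector) and the functions find_drift and apply_profile are hypothetical, and plain dictionaries stand in for a profile and for a node's actual configuration.

```python
def find_drift(desired, actual):
    """Return {setting: (desired_value, actual_value)} for each setting that differs."""
    drift = {}
    for key, want in desired.items():
        have = actual.get(key)
        if have != want:
            drift[key] = (want, have)
    return drift


def apply_profile(desired, actual):
    """Overwrite the managed settings with the profile's values (hypothetical job step)."""
    fixed = dict(actual)
    fixed.update(desired)
    return fixed


# Hypothetical setting names, for illustration only.
profile = {"max_heap_size": "16G", "garbage_collector": "g1gc"}
node = {"max_heap_size": "24G", "garbage_collector": "g1gc"}  # heap changed by hand

print(find_drift(profile, node))   # {'max_heap_size': ('16G', '24G')}
node = apply_profile(profile, node)
print(find_drift(profile, node))   # {}
```

After the simulated job runs, the manual change is gone and the node matches the profile again.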

Inheritance and Precedence

Configuration profiles are inherited down the cluster topology. For example, if no configuration profile is explicitly specified at the datacenter or node level, the cluster-level profile applies. When creating the cluster topology model, you can apply defined configuration profiles at the cluster, datacenter, or node level. A node-level profile takes precedence over a datacenter-level profile, which in turn takes precedence over a cluster-level profile. Define configuration profiles that reflect the requirements of the workload type of the nodes in a datacenter.

When a configuration job runs, configuration profiles specified at different topology levels are merged setting by setting. For example, consider a cluster with a configuration profile applied at the cluster level that:
  • Specifies the G1 garbage collector (g1gc).
  • Specifies a maximum heap size of 16 GB.
  • Does not explicitly specify a commit log directory, and instead relies on the default value of /var/lib/cassandra/commitlog.
The cluster has two datacenters, DC1 and DC2:
  • DC1 has no configuration profile of its own and therefore inherits its configuration profile from the cluster.
  • DC2 has a configuration profile defined at the DC level that specifies a maximum heap size of 32 GB and a commit log directory of /cassandra_data/commitlog.
When the configuration job runs, the resulting configuration of the nodes in each datacenter is as follows:
  • All nodes in DC1 inherit the cluster settings: g1gc, a 16 GB maximum heap size, and the default commit log directory of /var/lib/cassandra/commitlog.
  • Nodes in DC2 inherit from the cluster and also override cluster settings with the datacenter-level configuration profile: g1gc (inherited from the cluster); a 32 GB maximum heap size (the DC configuration profile takes precedence over an explicit setting in the cluster-level configuration profile); and a commit log directory of /cassandra_data/commitlog (the DC configuration profile takes precedence over an implicit default inherited from the cluster-level configuration profile).
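
The DC1 and DC2 example can be sketched as a layered dictionary merge in Python. This is a simplified model, not how LCM is implemented: the function effective_config and the key names are hypothetical, and the only default modeled is the commit log directory mentioned above.

```python
# Implicit default used when no profile sets the value (taken from the example above).
DEFAULTS = {"commitlog_directory": "/var/lib/cassandra/commitlog"}


def effective_config(cluster_profile, dc_profile=None, node_profile=None):
    """Merge profiles from lowest to highest precedence: cluster, datacenter, node."""
    merged = dict(DEFAULTS)
    for profile in (cluster_profile, dc_profile, node_profile):
        if profile:  # a missing level contributes nothing, so lower levels show through
            merged.update(profile)
    return merged


# Hypothetical key names, for illustration only.
cluster = {"garbage_collector": "g1gc", "max_heap_size": "16G"}
dc2 = {"max_heap_size": "32G", "commitlog_directory": "/cassandra_data/commitlog"}

dc1_nodes = effective_config(cluster)        # DC1 has no profile of its own
dc2_nodes = effective_config(cluster, dc2)   # DC2 overrides two settings

print(dc1_nodes)
print(dc2_nodes)
```

Only the settings that DC2 names are overridden; the garbage collector still comes from the cluster-level profile.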

Inheritance and precedence keep a cluster consistent by inheriting as much as possible from the cluster-level configuration profile, while leaving the flexibility to specify only the settings that differ in higher-precedence configuration profiles applied at the more granular datacenter and node levels.

Configuration Profile Files

Each configuration profile is specific to a recent version of DSE. A configuration profile is composed of multiple configuration files for configuring features of DSE clusters.

Figure: LCM Config Profile configuration files for DSE

Configuration profiles allow customizing settings for the following configuration files:
  • Cassandra section:
    Note: When you add a configuration profile, DSE authentication is enabled by default for all supported versions of DSE. DSE Authenticator is enabled in dse.yaml for DSE version 5.0 and later. For more information, see Managing DSE Security using LCM.

    Every configuration option in cassandra.yaml and dse.yaml is editable, while other configuration files use a template system that exposes only frequently used settings. Contact DataStax Support to request additional configuration options.

  • Spark section:
    • logback-sparkR.xml
    • dse-spark-env.sh
    • logback-spark.xml
    • spark-defaults.conf
    • spark-env.sh
    • logback-spark-executor.xml
    • logback-spark-server.xml
    • hive-site.xml
    • spark-daemon-defaults.conf

    For more information, see configuring Spark for DSE and configuring Spark logging options in the DSE Administrator documentation.

  • Lifecycle Manager section:
    • Package Proxy: Accelerate package downloads, or keep DataStax Enterprise clusters isolated from the internet.
    • Java Setup: Automatically manage JRE installations and JCE policy files.
    Note: Lifecycle Manager does not currently manage commitlog_archiving.properties, which configures commit log archiving and point-in-time (PIT) restore for the Backup Service. This file is managed from within the Backup Service instead.
Note: The data for Lifecycle Manager (cluster topology models, configuration profiles, credentials, repositories, job history, and so forth) is stored in the lcm.db database. Your organization is responsible for backing up the lcm.db database. You must also configure failover to mirror the lcm.db database.
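
As one possible approach to the backup responsibility noted above, the following Python sketch copies a file to a timestamped backup. It is an illustration under assumptions, not a documented procedure: the function backup_file is hypothetical, the real location of lcm.db is not given here and must be supplied for your installation, and the sketch assumes the file is not being written to while it is copied. The demonstration runs against a throwaway placeholder file.

```python
import shutil
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def backup_file(source, backup_dir):
    """Copy source into backup_dir under a UTC-timestamped name and return the new path."""
    source = Path(source)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    target = backup_dir / f"{source.name}.{stamp}"
    shutil.copy2(source, target)  # copy2 also preserves file timestamps
    return target


# Demonstration only: a placeholder file stands in for the real lcm.db.
with tempfile.TemporaryDirectory() as tmp:
    fake_db = Path(tmp) / "lcm.db"
    fake_db.write_bytes(b"placeholder contents")
    copy = backup_file(fake_db, Path(tmp) / "backups")
    print(copy.name)
```

Scheduling, retention, and copying the backup off the host are left to your organization's own tooling.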