Running Install, Configure, and Upgrade Jobs Overview

Jobs are launched from the Clusters workspace of Lifecycle Manager. Monitor install, configure, upgrade, and import jobs in the Jobs workspace of Lifecycle Manager.

opscenterd.conf

The location of the opscenterd.conf file depends on the type of installation:
  • Package installations: /etc/opscenter/opscenterd.conf
  • Tarball installations: install_location/conf/opscenterd.conf

Lifecycle Manager runs jobs for different clusters concurrently. Jobs for the same cluster run sequentially: a job remains in the pending state while another job for that cluster is running.
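The scheduling rule above can be illustrated with a minimal sketch. This is not LCM code: the function name `next_runnable` and the `(job_id, cluster)` tuples are hypothetical, chosen only to show how jobs for different clusters can start together while later jobs for a busy cluster stay pending.

```python
def next_runnable(pending, running_clusters):
    """Pick the pending jobs that may start now.

    pending: list of (job_id, cluster) tuples, oldest first (hypothetical shape).
    running_clusters: clusters that already have a job in progress.
    At most one job per cluster is selected; the rest remain pending.
    """
    busy = set(running_clusters)
    started = []
    for job_id, cluster in pending:
        if cluster not in busy:
            started.append((job_id, cluster))
            busy.add(cluster)  # later jobs for this cluster must wait
    return started


pending_jobs = [("job-1", "cluster-a"), ("job-2", "cluster-a"), ("job-3", "cluster-b")]
first_wave = next_runnable(pending_jobs, running_clusters=[])
```

In this sketch, `job-1` and `job-3` start together because they target different clusters, and `job-2` waits until `job-1` finishes.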

Importing an unmanaged cluster is also tracked in the Jobs summary and details.

After Lifecycle Manager successfully creates a cluster during an install job, LCM automatically adds the cluster to the OpsCenter workspace for monitoring and management.

Job types 

The primary job types you can run on an entity in the LCM topology model (that is, a cluster, datacenter, or node, depending on the job type) are:
  • Install Job: An Install job downloads, installs, and configures supported DataStax Enterprise versions onto your pre-launched instances. LCM efficiently skips work that is already completed. For example, DSE is not downloaded again if the correct version of DSE is already installed on a target node, but configure steps are performed if necessary. If a DSE package is already installed but is a different version than specified in the install job, the install job fails. Only an upgrade job supports installing a different DSE version. Install jobs are idempotent operations and can be safely rerun, ensuring your cluster continues to operate according to your desired configuration. If a job fails for some transient reason, it can be rerun and LCM efficiently completes the remaining work. Prior to OpsCenter provisioning with LCM, manually removing all traces of DataStax Enterprise packages from the affected nodes was required after a failed provisioning attempt. An install job runs a configure job in addition to installation.
  • Configure Job: A configure job pushes an associated configuration profile to the appointed nodes and restarts the cluster.
    Note: A configure job requires that an install job was previously run. Running a configure job is not allowed until a cluster is converged by a prior install job.
  • Upgrade Job: An upgrade job downloads and installs DataStax Enterprise and applies the values specified in the associated configuration profile. Upgrade jobs are only allowed at the datacenter and node levels. Upgrades cannot be run at the cluster level because there are frequently context-specific requirements about the order in which datacenters must be upgraded, and steps that must be performed outside LCM to prepare a datacenter for upgrade or to recover it afterward. Upgrades to a minor DSE release are supported within a release series; for example, you can upgrade to a higher patch version within the 5.0.x, 5.1.x, or later release series. Skipping patch versions during an upgrade is supported; for instance, you can upgrade directly from v5.0.1 to v5.0.9. Review the DSE upgrade guide to determine which patch version to upgrade to, to plan any configuration changes that must be deployed using LCM before the upgrade, and to learn about any steps that must be performed outside of LCM during the upgrade. When you clone a config profile and select a new DSE version, LCM provides complete details on which config profile settings have been added, removed, or had their default values changed. The notifications also indicate which user-supplied custom values are still valid and preserved.
    Note: Version downgrades of DataStax Enterprise are not supported within Lifecycle Manager.
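The version rules described for install and upgrade jobs can be summarized in a short sketch. The helper names (`parse_version`, `plan_install`, `is_same_series_patch_upgrade`) and the returned strings are hypothetical and are not part of LCM. The sketch assumes three-part numeric version strings such as 5.0.9, and it checks only the upgrade case this section describes (a higher patch version within one release series).

```python
def parse_version(text):
    """Turn a version string such as '5.0.9' into a tuple such as (5, 0, 9)."""
    return tuple(int(part) for part in text.split("."))


def plan_install(installed, requested):
    """Describe what an install job does on one node (illustrative only).

    installed: DSE version already on the node, or None if DSE is absent.
    requested: DSE version specified in the install job.
    """
    if installed is None:
        return "download-and-install"
    if parse_version(installed) == parse_version(requested):
        return "skip-download"  # configure steps still run if necessary
    return "fail"  # only an upgrade job can install a different version


def is_same_series_patch_upgrade(current, target):
    """True when target is a higher patch version in the same release series."""
    cur, tgt = parse_version(current), parse_version(target)
    return cur[:2] == tgt[:2] and tgt > cur


skipping_patches = is_same_series_patch_upgrade("5.0.1", "5.0.9")
```

A `False` result from `is_same_series_patch_upgrade` means only that the pair falls outside the case modeled here; consult the DSE upgrade guide for anything else.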

Supported OS platform check for DSE installs 

As of OpsCenter version 6.1.3, LCM automatically performs a supported OS platform check for the version of DSE being installed. Successes and failures are logged and are also visible when drilling into the Job Details:

Job details showing failed OS platform check for an LCM install job

The OS platform check can be disabled at your own risk by setting the disable_platform_check option to True in the [lifecycle_manager] section of the OpsCenter configuration file, opscenterd.conf.
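As an illustration, the following sketch embeds the relevant fragment of opscenterd.conf as a string and reads it back with Python's standard `configparser` module. The section and option names come from the text above. Reading the file from Python is only a way to check the fragment's shape here and is not how OpsCenter itself consumes the setting.

```python
import configparser

# Fragment as it would appear in opscenterd.conf (other sections omitted).
SAMPLE_CONF = """\
[lifecycle_manager]
disable_platform_check = True
"""

parser = configparser.ConfigParser()
parser.read_string(SAMPLE_CONF)

# Fall back to False when the option is absent, so the check stays enabled
# unless it is explicitly turned off.
platform_check_disabled = parser.getboolean(
    "lifecycle_manager", "disable_platform_check", fallback=False
)
print(platform_check_disabled)
```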

Concurrency levels for running a job 

Jobs run faster when more nodes are deployed concurrently during a job. Because nodes become unavailable at certain times during the deployment process, higher concurrency levels can interrupt service by reducing the cluster's ability to respond to queries. The concurrency level options in the LCM job dialogs let you control this tradeoff between speed and availability. Concurrency levels are available when running jobs at the cluster or datacenter level.

  • Automatic (safest but not always fastest): Default. Lets LCM determine a safe concurrency level to use. Use this option when unsure which other option would be appropriate.
    Note: The Automatic option behaves similarly to the Single node option for nodes in datacenters that have previously been installed by LCM. For nodes in new datacenters where an install job has not yet completed successfully, the Automatic option behaves similarly to the All nodes options. These are presumed to be safe in that case because nodes in a datacenter that has not yet finished being installed should not yet be servicing requests.
  • Single node: Executes job on one node at a time. Use this option when having more than one node offline at a given time would impact availability.
  • One node per DC: Executes job concurrently on at most one node per DC. Use this option if having a single node in each DC offline does not impact availability.
  • Single rack within a DC (might interrupt service): Executes job concurrently on nodes such that at most one rack has nodes down at a time. Use this option if having an entire rack within a DC offline does not impact availability.
  • One rack per DC (might interrupt service): Executes job concurrently on nodes such that at most one rack in each DC has nodes down at a time. Use this option if having an entire rack in each DC offline does not impact availability.
  • All nodes within a DC (interrupts service): Executes job concurrently on all nodes in a DC. Use this option if having all nodes in a DC offline is acceptable.
  • All nodes (interrupts service): Executes a job concurrently on all nodes in a cluster. Use this option if having all nodes in a cluster offline is acceptable.
Tip: Run jobs during off-peak hours if using a concurrency level that potentially or definitely interrupts service.

Hover over a list option to view its tooltip description:

Concurrency level options and tooltips for LCM jobs
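To make the difference between the concurrency levels concrete, the following sketch groups nodes into batches, where every node in a batch could be offline at the same time. It is a simplified model under stated assumptions, not LCM's scheduler: the `batches` function, the level names, and the `(name, dc, rack)` tuples are hypothetical, and the Automatic option is not modeled because LCM chooses its behavior itself.

```python
from collections import defaultdict
from itertools import zip_longest


def batches(nodes, level):
    """Group node names into batches that may be worked on concurrently.

    nodes: list of (name, dc, rack) tuples (hypothetical shape).
    level: one of the illustrative level names handled below.
    """
    by_dc = defaultdict(list)
    by_rack = defaultdict(list)
    for name, dc, rack in nodes:
        by_dc[dc].append(name)
        by_rack[(dc, rack)].append(name)

    if level == "single_node":
        return [[name] for name, _, _ in nodes]
    if level == "one_node_per_dc":
        # Take the i-th node of every DC together; shorter DCs simply run out.
        return [[n for n in group if n is not None]
                for group in zip_longest(*by_dc.values())]
    if level == "single_rack_within_dc":
        return list(by_rack.values())
    if level == "one_rack_per_dc":
        racks_by_dc = defaultdict(list)
        for (dc, _rack), names in by_rack.items():
            racks_by_dc[dc].append(names)
        # Take the i-th rack of every DC together.
        return [[n for rack in group if rack is not None for n in rack]
                for group in zip_longest(*racks_by_dc.values())]
    if level == "all_nodes_within_dc":
        return list(by_dc.values())
    if level == "all_nodes":
        return [[name for name, _, _ in nodes]]
    raise ValueError(f"unknown level: {level}")


example_nodes = [
    ("a1", "dc1", "r1"), ("a2", "dc1", "r1"), ("a3", "dc1", "r2"),
    ("b1", "dc2", "r1"), ("b2", "dc2", "r2"),
]
rack_batches = batches(example_nodes, "one_rack_per_dc")
```

For the five example nodes, the Single node level yields five batches, while the All nodes level yields one batch containing every node, which is why the latter interrupts service.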