Running a DSE installation job using LCM

Submit a DataStax Enterprise install job to run on a cluster, datacenter, or node in Lifecycle Manager.

An install job includes running a configuration job and, by default, installing Java and the Java Cryptography Extension (JCE) Policy files required to enable unlimited-strength encryption.

Lifecycle Manager pushes configuration jobs to one node at a time and restarts DataStax Enterprise on that node. For a newly added cluster, the very first install job runs on several nodes concurrently. The concurrency speeds up the initial install and is safe because the new cluster cannot serve clients until it has been installed for the first time. After that first install, install and configure jobs proceed one node at a time to preserve cluster availability. The job does not progress to the next node until the current node successfully restarts (that is, the node responds on its native_transport_port). By default, the job stops gracefully if it fails on any single node: nodes already running the job are allowed to finish, but the job does not continue on any remaining nodes. Stopping early prevents a potential configuration problem from bringing down multiple nodes, or even the entire cluster. If required, override this default behavior with the Continue on error option, which attempts to continue running the job on all nodes regardless of failures.
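
The restart check amounts to verifying that DataStax Enterprise is again accepting connections on the node's native transport port. The following Python sketch is illustrative only and is not part of LCM; the host value and the default port 9042 (the usual native_transport_port) are assumptions:

    import socket
    import time

    def wait_for_native_transport(host, port=9042, timeout=600, interval=5):
        # Poll until the node accepts TCP connections on the native transport port.
        deadline = time.time() + timeout
        while time.time() < deadline:
            try:
                sock = socket.create_connection((host, port), timeout=interval)
                sock.close()
                return True           # node is responding on native_transport_port
            except socket.error:
                time.sleep(interval)  # not up yet; retry until the deadline expires
        return False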

Install jobs that expand an existing cluster are throttled to one node at a time to prevent too much data from streaming concurrently.

Note: If the DataStax Enterprise version associated with the configuration profile being pushed differs from the installed version, the configuration job fails. To upgrade to a later minor DSE version, clone the configuration profile and run an upgrade job. DSE version downgrades are not supported in Lifecycle Manager.

Prerequisites

  • All credentials (SSH and repositories) must be created, configuration profiles defined, and a cluster topology model built or imported before running any install jobs in Lifecycle Manager.
  • LCM does not create data directories or manage their permissions on your behalf. To use a custom data directory, ensure that it exists and is owned by the cassandra user before running the job (see the pre-check sketch after this list).
  • Python 2.6 or 2.7 must be installed on the target nodes. LCM does not automate installing Python, and install jobs fail if Python is not installed.
  • Ensure that clocks are synchronized across nodes; clock drift can interfere with LCM generating TLS certificates. Check the clock drift rule in the Best Practice Service to confirm that clocks are in sync before proceeding.
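
The data directory and Python prerequisites above can be checked on a target node before submitting the job. This is a minimal, illustrative Python sketch rather than an LCM command; the path /data/cassandra is a hypothetical custom data directory:

    import os
    import pwd
    import sys

    DATA_DIR = "/data/cassandra"  # hypothetical custom data directory; adjust for your layout

    # LCM requires Python 2.6 or 2.7 on target nodes and does not install it.
    if sys.version_info[:2] not in ((2, 6), (2, 7)):
        print("Unsupported Python version: %d.%d" % sys.version_info[:2])

    # LCM does not create custom data directories or manage their permissions.
    if not os.path.isdir(DATA_DIR):
        print("Missing data directory: %s" % DATA_DIR)
    elif pwd.getpwuid(os.stat(DATA_DIR).st_uid).pw_name != "cassandra":
        print("Data directory %s is not owned by the cassandra user" % DATA_DIR)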

Procedure

  1. Click Clusters from the Lifecycle Manager navigation menu.
  2. Select the cluster, datacenter, or node to run an install on.
  3. Click Install from the drop-down menu.
    The Run Installation Job dialog appears.

    Run Install (DSE) job dialog in LCM

  4. To override the default error behavior and continue running the job on subsequent nodes until all nodes are finished, select Continue on error. The job continues running despite encountering errors.
    By default, a job stops running on additional nodes when it encounters an error on any node. Nodes on which the job is already running continue to completion.
  5. Optional: Enter a Description about the job.
  6. Select an option for Auto Bootstrap. To override the LCM smart default, choose True or False as required.
    • LCM Default: Sets the option depending on the action the job performs. When adding a cluster or datacenter, LCM sets auto_bootstrap to False. When adding nodes to an existing datacenter, LCM sets auto_bootstrap to True. A sketch of this default logic follows the procedure.
      Warning: When adding a node to an existing datacenter that has already been converged (that is, an install job has already been run) in LCM, a tooltip warning appears: New nodes that list themselves as seeds will fail to bootstrap and will require immediately running a repair on the node. DataStax recommends designating the node as a seed after the node has bootstrapped into the cluster.

      Adding a node to an existing datacenter seed node warning

    • True: Explicitly sets auto_bootstrap to True.
    • False: Explicitly sets auto_bootstrap to False.
  7. If running an install job on a cluster or datacenter, select a Concurrency Level:
    Note: Concurrency Level is not applicable to node-level jobs.
    • Automatic (safest but not always fastest): Default. Lets LCM determine a safe concurrency level to use. Use this option when unsure which other option would be appropriate.
      Note: The Automatic option behaves like the Single node option for nodes in datacenters that LCM has previously installed. For nodes in new datacenters where an install job has not yet completed successfully, the Automatic strategy behaves like the All nodes option, which is presumed safe because nodes in a datacenter that has not finished installing should not yet be serving requests. The sketch after the procedure illustrates this selection.
    • Single node: Executes job on one node at a time. Use this option when having more than one node offline at a given time would impact availability.
    • One node per DC: Executes job concurrently on at most one node per DC. Use this option if having a single node in each DC offline does not impact availability.
    • Single rack within a DC (might interrupt service): Executes job concurrently on nodes such that at most one rack has nodes down at a time. Use this option if having an entire rack within a DC offline does not impact availability.
    • One rack per DC (might interrupt service): Executes job concurrently on nodes such that at most one rack in each DC has nodes down at a time. Use this option if having an entire rack in each DC offline does not impact availability.
    • All nodes within a DC (interrupts service): Executes job concurrently on all nodes in a DC. Use this option if having all nodes in a DC offline is acceptable.
    • All nodes (interrupts service): Executes a job concurrently on all nodes in a cluster. Use this option if having all nodes in a cluster offline is acceptable.
  8. If running an installation job on a cluster or datacenter, enter a Batch Size if the default (10) is not appropriate for your environment or the selected Concurrency Level setting.
    The batch size is a per-job cap on concurrency that applies only when many nodes are eligible to run the job.
    Note: Batch size takes effect only when a large number of nodes are eligible for concurrent deployment, such as with the All nodes concurrency policies. Batch size has no effect on jobs with the single-node concurrency policy or on node-level jobs.
  9. Click Submit.
    The job is submitted. A dialog informs you the job is in the queue to run.
  10. Click View Job Summary to navigate quickly to the Jobs page to monitor the job progress. Click Close if you do not want to immediately monitor the job and prefer to remain in the Clusters workspace.
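
The LCM Default for Auto Bootstrap (step 6) and the Automatic concurrency level (step 7) follow the selection rules described in the procedure. The following Python sketch only mirrors that documented behavior and is not LCM code; names such as adding_new_cluster_or_dc, dc_previously_converged, eligible_nodes, and batch_size are assumptions:

    def default_auto_bootstrap(adding_new_cluster_or_dc):
        # LCM Default: auto_bootstrap is False when adding a cluster or
        # datacenter, and True when adding nodes to an existing datacenter.
        return not adding_new_cluster_or_dc

    def automatic_concurrency(dc_previously_converged, eligible_nodes, batch_size=10):
        # Automatic: behaves like Single node once a datacenter has been
        # installed by LCM; behaves like All nodes for a brand-new datacenter,
        # which is not yet serving requests. The batch size (default 10) caps
        # concurrency when many nodes are eligible.
        if dc_previously_converged:
            return 1
        return min(len(eligible_nodes), batch_size)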