Running a DSE installation job using LCM
Submit a DataStax Enterprise install job to run on a cluster, datacenter, or node in Lifecycle Manager.
Submit a DataStax Enterprise (DSE) installation job to run on a cluster, datacenter, or node. An installation job performs the work of a configuration job and, by default, also installs Java and the JCE Policy files required to enable unlimited-strength encryption.
The job does not progress to the next node until the current node successfully restarts (that is, until the node is responding on the native_transport_port). By default, the job stops prematurely if it fails on a single node, to avoid propagating a faulty configuration to an entire cluster. Jobs already running on nodes are allowed to finish, but the job does not continue on any remaining nodes. Stopping early prevents a potential configuration problem from bringing down multiple nodes, or even the entire cluster. If required, override this default behavior with the Continue on error option, which attempts to run the job on all nodes regardless of failures.
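The readiness check described above can be approximated with a simple TCP probe against the native transport port. The sketch below is illustrative only, not LCM's actual implementation; 9042 is Cassandra's default native_transport_port, and the host values and retry intervals are placeholders:

```python
import socket
import time

def native_transport_ready(host: str, port: int = 9042, timeout: float = 2.0) -> bool:
    """Return True if the node accepts TCP connections on the native transport port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def wait_for_node(host: str, port: int = 9042, retries: int = 30, delay: float = 10.0) -> bool:
    """Poll until the node responds, mirroring how a job waits before moving on."""
    for _ in range(retries):
        if native_transport_ready(host, port):
            return True
        time.sleep(delay)
    return False
```

A wrapper like `wait_for_node` only confirms that the port accepts connections; LCM's own check is authoritative for whether the restarted node is actually serving requests.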
Install jobs that expand an existing cluster are throttled to one node at a time to prevent too much data from streaming concurrently.
Prerequisites
Complete the following tasks before running any install jobs in Lifecycle Manager.
- Create all SSH credentials and define repositories.
- Define configuration profiles.
- Build the cluster topology model or import an existing model.
- Check the clock drift rule in the Best Practice Service to ensure clocks are in sync before proceeding. Clock drift can interfere with LCM generating TLS certificates.
- Ensure that the SSH server on each node allows file
transfer:
- For OpsCenter 6.0.0-6.0.x, 6.1.0-6.1.x, 6.5.0-6.5.3, and 6.7.0, the SSH server on target nodes must allow SFTP transfers.
- For OpsCenter 6.5.4-6.5.x, the SSH server on the target node must allow file transfer by either SCP or SFTP. LCM tries SFTP first and falls back to SCP.
LCM does not create data directories or manage their permissions. See Creating custom data directories for steps to use a custom data directory.
Procedure
- Click Clusters from the Lifecycle Manager navigation menu.
- Select the cluster, datacenter, or node to run an install job on.
- Click Install from the drop-down menu. The Run Installation Job dialog displays.
- To override the default error behavior and continue running the job on subsequent nodes until all nodes are finished, select Continue On Error. The job then continues running despite encountering errors. By default, a job stops running on additional nodes after encountering an error on any node; nodes that are already running continue to completion.
- Optional: Enter a Description for the job.
- Select an option for Auto Bootstrap. To override the LCM default, choose True or False as required.
  - LCM Default: Following best practices for data integrity, sets auto_bootstrap to True for new nodes, requiring new nodes to be started sequentially. The default job concurrency policy ensures that nodes start sequentially. This default is different from previous OpsCenter versions.
    Warning: When adding a node to an existing datacenter that has already been converged in LCM (that is, an install job has already been run), a tooltip warning appears: new nodes that list themselves as seeds fail to bootstrap and require immediately running a repair on the node. DataStax recommends designating the node as a seed only after the node has bootstrapped into the cluster.
  - True: Explicitly sets auto_bootstrap to True.
  - False: Explicitly sets auto_bootstrap to False.

  For more information, see auto_bootstrap.
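The options above map to the auto_bootstrap setting in the cassandra.yaml that LCM renders on each node. As a sketch of the generated file (a fragment only, not a complete cassandra.yaml), choosing False would produce:

```yaml
# cassandra.yaml (fragment)
# auto_bootstrap: whether a new node streams data from existing
# nodes when it first starts
auto_bootstrap: false
```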
- If running an install job on a cluster or datacenter, select a Concurrency Level:
  Note: Concurrency Level is not applicable to node-level jobs.
  - Automatic (safest but not always fastest): Default. Allows LCM to determine a safe concurrency level to use. Use this option when unsure which other option would be appropriate. Note: The Automatic option executes one job at a time, both for nodes in datacenters that were previously installed by LCM and for nodes in new datacenters where an install job has not yet successfully completed. This behavior mirrors the Single node option.
- Single node: Executes job on one node at a time. Use this option when having more than one node offline at a given time would impact availability.
- One node per DC: Executes job concurrently on at most one node per datacenter (DC). Use this option if having a single node in each DC offline does not impact availability.
- Single rack within a DC (might interrupt service): Executes job concurrently on nodes such that at most one rack has nodes down at a time. Use this option if having an entire rack within a DC offline does not impact availability.
- One rack per DC (might interrupt service): Executes job concurrently on nodes such that at most one rack in each DC has nodes down at a time. Use this option if having an entire rack in each DC offline does not impact availability.
- All nodes within a DC (interrupts service): Executes job concurrently on all nodes in a DC. Use this option if having all nodes in a DC offline is acceptable.
- All nodes (interrupts service): Executes a job concurrently on all nodes in a cluster. Use this option if having all nodes in a cluster offline is acceptable.
- If running an installation job on a cluster or datacenter, enter a Batch Size if the default (10) is not appropriate for your environment or the selected Concurrency Level setting. The batch size is a per-job cap on concurrency that applies only when many nodes are eligible for a job run. Note: Batch size takes effect only when a large number of nodes are eligible for concurrent deployment, such as with the All nodes concurrency policy. Batch size has no effect on jobs with the Single node concurrency policy or on node-level jobs.
- Click Submit to submit the job. A dialog indicates that the job is in the queue to run.
- Click View Job Summary to navigate to the Jobs page to monitor the job progress. Click Close if you do not want to immediately monitor the job and prefer to remain in the Clusters workspace.