Monitoring job status

Lifecycle Manager (LCM) shows the status of currently running, completed, and failed jobs in the Jobs workspace. Drill into the details of a failed job to troubleshoot the root cause of the failure from within the Jobs workspace before resorting to reviewing logs. Investigate any job that failed or that is taking an excessive amount of time to run.

Additionally, LCM includes the following monitoring capabilities:

  • View a summary of install, configure, upgrade, and import jobs.

  • View a summary of the status of all jobs, with details about each job's type, target, and job ID.

  • Monitor the progress of a running job, and abort a job that is taking an excessive amount of time to run.

The Status column of the Jobs workspace indicates the status of each job with an icon and a description. Refer to the following table for details:

Table 1. Job status legend
Status Icon Description

Not Run

lifecycleJobNotRun

The initial install job has not been run. This indicator displays in the Clusters workspace.

Pending

lifecycleJobPending

The job is in the queue waiting to run.

Running

lifecycleJobRunning

The job is currently running.

Success (Completed)

lifecycleJobComplete

The job ran successfully.

Failure

lifecycleJobFailed

The job failed. Investigate the issue by drilling into the job details. Try running the job again.

An ORPHANED status indicates that a job failed because OpsCenter was restarted while the job was running. It is written to the logs at startup for any job that was left in a RUNNING status.

A WILL_FAIL status indicates that a job was marked early in processing as guaranteed to fail, which can be informative when troubleshooting through the API.

The ORPHANED and WILL_FAIL statuses appear only in the logs and do not appear in the UI.

Idle

Idle

A job was actively running, but at least one node has not recently reported progress. An idle job is still running and never fails automatically, because a node could be successfully executing a slow operation. To stop an idle job, you must manually abort it.

To change the defaults for timing out a job and marking it as idle, set the idle timeout configuration options in the [lifecycle_manager] section of opscenterd.conf. The location of this file depends on the type of installation:

  • Package installations: /etc/opscenter/opscenterd.conf

  • Tarball installations: install_location/conf/opscenterd.conf

Aborted

Abort

The job was manually aborted. Aborted jobs appear in logs with a TERMINATED status.
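Because ORPHANED, WILL_FAIL, and TERMINATED show up in the logs rather than in the UI, it can help to count them when reviewing a log file. The following is a minimal sketch, not an OpsCenter tool: the function name is hypothetical, the log line format is not assumed beyond the status word appearing as a whole token, and the mapping of WILL_FAIL to the Failure status is an assumption based on where the table describes it.

```python
import re
from collections import Counter

# Log statuses named in the table above, mapped to the UI status they
# relate to. The WILL_FAIL -> Failure mapping is an assumption.
LOG_STATUS_TO_UI = {
    "ORPHANED": "Failure",
    "WILL_FAIL": "Failure",
    "TERMINATED": "Aborted",
}

# Match each status only as a whole word, so a longer identifier that
# merely contains one of these names is not counted.
_STATUS_RE = re.compile(r"\b(" + "|".join(LOG_STATUS_TO_UI) + r")\b")


def count_log_statuses(lines):
    """Count every occurrence of the log statuses in an iterable of lines."""
    counts = Counter()
    for line in lines:
        for status in _STATUS_RE.findall(line):
            counts[status] += 1
    return counts
```

The function accepts any iterable of strings, so an open log file can be passed directly. Look up each counted status in LOG_STATUS_TO_UI to see which UI status it corresponds to.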
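To check which idle timeout options are currently set, the [lifecycle_manager] section of opscenterd.conf can be read with Python's standard configparser module. This is a minimal sketch under stated assumptions: the function name is hypothetical, the file is assumed to parse as INI, and because the option names are not listed here, the code simply returns every option in the section whose name contains "idle".

```python
import configparser


def idle_timeout_options(conf_path):
    """Return the [lifecycle_manager] options whose names contain 'idle'."""
    # Disable interpolation so values containing '%' are read literally.
    parser = configparser.ConfigParser(interpolation=None)
    # read() silently skips missing files and returns the list of files
    # it parsed, so an empty result means the file was not read.
    if not parser.read(conf_path):
        raise FileNotFoundError(conf_path)
    if not parser.has_section("lifecycle_manager"):
        return {}
    return {
        name: value
        for name, value in parser.items("lifecycle_manager")
        if "idle" in name
    }
```

For a package installation, the call would be idle_timeout_options("/etc/opscenter/opscenterd.conf"). An empty result means no idle timeout option is set explicitly in that section, so the defaults apply.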