Jobs
While the rest of the LCM API simply defines what the cluster should be, jobs are what actually make it so. Jobs can generally be run at the cluster, datacenter, or node level.
Job Type | URL |
---|---|
Install | POST /api/v2/lcm/actions/install |
Configure | POST /api/v2/lcm/actions/configure |
Upgrade | POST /api/v2/lcm/actions/upgrade |
Restart | POST /api/v2/lcm/actions/restart |
Terminate | POST /api/v2/lcm/actions/terminate |
To list the job types and the options each one supports, send a request to the following API route:
- GET /api/v2/lcm/actions/
All job creation is asynchronous and should return a response immediately with the created Job object. This only means that the job has been queued. The job may or may not start immediately depending on whether LCM is already running a job.
- Job
{ "node-id": null, "description": null, "datacenter-name": null, "datacenter-id": null, "cluster-name": "c1", "job-data": { "continue-on-error": false, "job-type": "install", "auto-bootstrap": null }, "job-type": "install", "node-name": null, "id": "dad24ef7-fa09-49d8-9070-0406c7a6c585", "cluster-id": "7064583a-0656-4914-819e-937863793ab5", "status": "PENDING", "dry-run": false }
Property | Description of Values |
---|---|
id | A UUID for the Job. |
job-type | The job type (install, configure, etc.). |
status | The current Job Status. |
cluster-id | A reference to the cluster for this job. |
cluster-name | The name of the cluster (at the time the job was created). |
datacenter-id | The datacenter of the job (only set if the job was run at the datacenter or node level). |
datacenter-name | The name of the datacenter (at the time the job was created). |
node-id | The node for the job (only set if the job was run for a specific node). |
node-name | The name of the node (at the time the job was created). |
description | A verbal description of the job. Optional. |
dry-run | Not currently used. |
- GET /api/v2/lcm/jobs/{id}
Gets a Job by job ID.
Path arguments: id – A Job ID. Returns a Job object.
Example:
curl http://localhost:8888/api/v2/lcm/jobs/27138a41-59b6-487b-892a-3a211201fb51
Output:
{ "node-id": null, "created-on": "2016-06-21T17:21:30.357Z", "description": null, "datacenter-name": null, "type": "job", "datacenter-id": null, "cluster-name": "c1", "job-data": { "job-type": "install", "auto-bootstrap": null }, "related-resources": { "job-events": "http://localhost:8888/api/v2/lcm/jobs/27138a41-59b6-487b-892a-3a211201fb51/job_events/", "cluster": "http://localhost:8888/api/v2/lcm/clusters/7064583a-0656-4914-819e-937863793ab5", "job-nodes": "http://localhost:8888/api/v2/lcm/jobs/27138a41-59b6-487b-892a-3a211201fb51/job_nodes/" }, "modified-by": "system", "job-type": "install", "node-name": null, "modified-on": "2016-06-21T17:21:32.596Z", "id": "27138a41-59b6-487b-892a-3a211201fb51", "href": "http://localhost:8888/api/v2/lcm/jobs/27138a41-59b6-487b-892a-3a211201fb51", "created-by": "system", "cluster-id": "7064583a-0656-4914-819e-937863793ab5", "status": "FAILED", "dry-run": false }
- GET /api/v2/lcm/jobs/
Gets a paginated list of all Job records. See Paginated Results for an overview of the query string parameters that can be used.
Example:
curl http://localhost:8888/api/v2/lcm/jobs/
Output:
{ "next": null, "previous": null, "last": 1, "count": 1, "per-page": 50, "current": 1, "results": [ { "node-id": null, "created-on": "2016-06-21T20:24:21.432Z", "description": null, "datacenter-name": null, "type": "job", "datacenter-id": null, "cluster-name": "cluster01", "related-resources": { "job-events": "http://localhost:8888/api/v2/lcm/jobs/53db1340-8037-47fe-817a-951b9324b04a/job_events/", "cluster": "http://localhost:8888/api/v2/lcm/clusters/39efe91a-86bf-44a9-a5b8-fed6d37b8e21", "job-nodes": "http://localhost:8888/api/v2/lcm/jobs/53db1340-8037-47fe-817a-951b9324b04a/job_nodes/" }, "job-type": "install", "node-name": null, "modified-on": "2016-06-21T20:25:08.787Z", "id": "53db1340-8037-47fe-817a-951b9324b04a", "href": "http://localhost:8888/api/v2/lcm/jobs/53db1340-8037-47fe-817a-951b9324b04a", "cluster-id": "39efe91a-86bf-44a9-a5b8-fed6d37b8e21", "status": "FAILED", "dry-run": false } ] }
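The paginated response above can be walked page by page. The following is a minimal sketch rather than a documented client: `iter_results` and the caller-supplied `fetch_page` function are hypothetical names, and the loop assumes that the `last` field holds the number of the final page. How a page is actually requested (the query string parameters) is left to `fetch_page`, since those parameters are described under Paginated Results.

```python
def iter_results(fetch_page):
    """Yield every record across all pages of a paginated listing.

    fetch_page is a caller-supplied (hypothetical) function that takes a
    1-based page number and returns the decoded paginated response.
    """
    page = 1
    while True:
        body = fetch_page(page)
        yield from body["results"]
        # Assumption: "last" is the number of the final page.
        if page >= body["last"]:
            break
        page += 1
```

Keeping the HTTP call outside the generator means the same loop can serve the jobs, job_nodes, and job_events listings.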
- Job Node
Job Node objects represent the status of the job on a specific node.
{ "job-id": "53db1340-8037-47fe-817a-951b9324b04a", "node-id": "3bfa4fdf-a861-4ac2-aff1-4730f90af28c", "datacenter-name": "datacenter01", "rack-name": "rack1", "seed": true, "node-name": "10.0.3.132", "id": "8a7f8efc-fe35-4541-a570-4d235fcf1438", "status": "FAILED" }
Property | Description of Values |
---|---|
id | A UUID for the Job Node object. |
job-id | The UUID of the job. |
node-id | The UUID of the node. |
node-name | The name of the node (at the time of job creation). |
datacenter-name | The name of the datacenter (at the time of job creation). |
rack-name | The name of the node’s rack (at the time of job creation). |
seed | Boolean indicating if the node is a seed node (at the time of job creation). |
status | The job status for this node. |
- GET /api/v2/lcm/job_nodes/{id}
Gets a specific job node record by ID.
Path arguments: id – A Job Node ID. Returns a Job Node object.
Example:
curl http://localhost:8888/api/v2/lcm/job_nodes/8a7f8efc-fe35-4541-a570-4d235fcf1438
Output:
{ "job-id": "53db1340-8037-47fe-817a-951b9324b04a", "node-id": "3bfa4fdf-a861-4ac2-aff1-4730f90af28c", "created-on": "2016-06-21T20:24:26.419Z", "datacenter-name": "datacenter01", "rack-name": "rack1", "seed": true, "type": "job-node", "related-resources": { "job-events": "http://localhost:8888/api/v2/lcm/job_nodes/8a7f8efc-fe35-4541-a570-4d235fcf1438/job_events/", "node": "http://localhost:8888/api/v2/lcm/nodes/3bfa4fdf-a861-4ac2-aff1-4730f90af28c", "job": "http://localhost:8888/api/v2/lcm/jobs/53db1340-8037-47fe-817a-951b9324b04a" }, "modified-by": "system", "node-name": "10.0.3.132", "modified-on": "2016-06-21T20:25:08.738Z", "id": "8a7f8efc-fe35-4541-a570-4d235fcf1438", "href": "http://localhost:8888/api/v2/lcm/job_nodes/8a7f8efc-fe35-4541-a570-4d235fcf1438", "created-by": "system", "status": "FAILED" }
- GET /api/v2/lcm/job_nodes/
Gets a paginated list of all Job Node records. See Paginated Results for an overview of the query string parameters that can be used.
Example:
curl http://localhost:8888/api/v2/lcm/job_nodes/
Output:
{ "next": null, "previous": null, "last": 1, "count": 1, "per-page": 50, "current": 1, "results": [ { "job-id": "53db1340-8037-47fe-817a-951b9324b04a", "node-id": "3bfa4fdf-a861-4ac2-aff1-4730f90af28c", "created-on": "2016-06-21T20:24:26.419Z", "datacenter-name": "datacenter01", "type": "job-node", "related-resources": { "job-events": "http://localhost:8888/api/v2/lcm/job_nodes/8a7f8efc-fe35-4541-a570-4d235fcf1438/job_events/", "node": "http://localhost:8888/api/v2/lcm/nodes/3bfa4fdf-a861-4ac2-aff1-4730f90af28c", "job": "http://localhost:8888/api/v2/lcm/jobs/53db1340-8037-47fe-817a-951b9324b04a" }, "node-name": "10.0.3.132", "modified-on": "2016-06-21T20:25:08.738Z", "id": "8a7f8efc-fe35-4541-a570-4d235fcf1438", "href": "http://localhost:8888/api/v2/lcm/job_nodes/8a7f8efc-fe35-4541-a570-4d235fcf1438", "status": "FAILED" } ] }
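When a job touches many nodes, it can help to summarize the Job Node records by status. The sketch below groups node names by the `status` field of each record in a `results` list such as the one above; `nodes_by_status` is a hypothetical helper name, not part of the API.

```python
from collections import defaultdict


def nodes_by_status(job_nodes):
    """Group node names by the job status reported for each node."""
    grouped = defaultdict(list)
    for record in job_nodes:
        grouped[record["status"]].append(record["node-name"])
    return dict(grouped)
```

Passing the `results` list from the example response would yield a single `FAILED` group containing `10.0.3.132`.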
Job Status
The status field of Job objects can be one of the following values:
Status | Description |
---|---|
PENDING | The job has been queued for execution, but is not yet running. |
RUNNING | The job is currently running. |
FAILED | The job has failed. |
WILL_FAIL | The job is still running, but we know that it will ultimately fail. |
COMPLETE | The job finished successfully. |
IDLE | The job is running, but hasn’t reported any activity within a configurable timeout. The job may be hung or may simply be running slowly. In the latter case, it will eventually flip back to RUNNING. |
TERMINATED | The job was terminated by request. |
ORPHANED | If the OpsCenter server is restarted while a job is RUNNING, the job will be given this status on startup. |
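Because job creation only queues the job, a client typically polls the Job object until its status stops changing. The following is a minimal sketch under stated assumptions: `wait_for_job` and the caller-supplied `fetch_job` function are hypothetical, and the set of statuses treated as final (COMPLETE, FAILED, TERMINATED, ORPHANED) is an interpretation of the descriptions above rather than something the API states explicitly. WILL_FAIL and IDLE are treated as still running.

```python
import time

# Assumption: a job in one of these statuses will not change state again.
TERMINAL_STATUSES = {"COMPLETE", "FAILED", "TERMINATED", "ORPHANED"}


def wait_for_job(fetch_job, job_id, poll_seconds=5.0, max_polls=120,
                 sleep=time.sleep):
    """Poll a job until it reaches a terminal status or polls run out.

    fetch_job is a caller-supplied (hypothetical) function that takes a
    job ID and returns the decoded Job object, for example the body of
    GET /api/v2/lcm/jobs/{id}.
    """
    job = fetch_job(job_id)
    for _ in range(max_polls):
        if job["status"] in TERMINAL_STATUSES:
            return job
        sleep(poll_seconds)
        job = fetch_job(job_id)
    # Polls exhausted: return the last Job object seen, whatever its status.
    return job
```

The caller should check the returned `status`, since the function also returns when the poll limit is reached.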
- Job Event
Each node will report status and other useful information to LCM as the job executes on the remote host. All the non-status information is captured in a Job Event object. These are used to track changes, milestones, and errors on the node. They are very useful for troubleshooting.
{ "id": <value>, "job-id": <value>, "node-id": <value>, "event-type": <value>, "event-subtype": <value>, "before": {"<key>": <value>}, "changes": <value>, "message": <value>, "event-resource": <value>, "after": {"<key>": <value>}, "traceback": <value> }
Property | Description of Values |
---|---|
id | A UUID for the Job Event. |
job-id | The UUID of the Job. |
node-id | The UUID of the Node. |
event-type | The main type of the event (e.g., “milestone”, “error”). |
event-subtype | The subtype of the event (e.g., “MeldError”, “file-contents”). |
event-resource | The resource related to the event. Often this is a file path, but it can be something else and is sometimes null. |
message | A description of the event. |
changes | A boolean indicating whether a change was made on the remote machine. This is often used to indicate whether a config file was overwritten. |
before | An object sometimes used to indicate some node state before the event happened. |
after | An object sometimes used to indicate some node state after the event happened. |
traceback | Contains a program stacktrace, if there is one. This is only present when event-type is “error”. |
- GET /api/v2/lcm/job_events/{id}
Gets a specific job event record by ID.
Path arguments: id – A Job Event ID. Returns a Job Event object.
Example:
curl http://localhost:8888/api/v2/lcm/job_events/65c12a12-1e71-40db-aad6-2d1bedeb0f98
Output:
{ "job-id": "53db1340-8037-47fe-817a-951b9324b04a", "node-id": "3bfa4fdf-a861-4ac2-aff1-4730f90af28c", "event-subtype": "file-contents", "before": { }, "created-on": "2016-06-21T20:24:28.552Z", "event-type": "check", "changes": false, "type": "job-event", "message": "Checking if the contents of /etc/apt/sources.list.d/opsc.list need to be updated", "related-resources": { "job-node": "http://localhost:8888/api/v2/lcm/job_nodes/8a7f8efc-fe35-4541-a570-4d235fcf1438", "node": "http://localhost:8888/api/v2/lcm/nodes/3bfa4fdf-a861-4ac2-aff1-4730f90af28c", "job": "http://localhost:8888/api/v2/lcm/jobs/53db1340-8037-47fe-817a-951b9324b04a" }, "event-resource": "/etc/apt/sources.list.d/opsc.list", "href": "http://localhost:8888/api/v2/lcm/job_events/65c12a12-1e71-40db-aad6-2d1bedeb0f98", "after": { }, "id": "65c12a12-1e71-40db-aad6-2d1bedeb0f98", "traceback": null, "created-by": "system" }
- GET /api/v2/lcm/job_events/
Gets a paginated list of all Job Event records. See Paginated Results for an overview of the query string parameters that can be used.
Example:
curl http://localhost:8888/api/v2/lcm/job_events/
Output:
{ "next": null, "previous": null, "last": 1, "count": 29, "per-page": 50, "current": 1, "results": [ { "job-id": "dad24ef7-fa09-49d8-9070-0406c7a6c585", "node-id": null, "event-subtype": "start", "created-on": "2016-06-21T16:20:01.848Z", "event-type": "milestone", "type": "job-event", "message": "job started...", "related-resources": { "job": "http://localhost:8888/api/v2/lcm/jobs/dad24ef7-fa09-49d8-9070-0406c7a6c585" }, "href": "http://localhost:8888/api/v2/lcm/job_events/15890710-a1bb-42c6-8d49-738ee9c3f844", "id": "15890710-a1bb-42c6-8d49-738ee9c3f844" } ] }
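For troubleshooting, the events of interest are usually those whose `event-type` is "error". The sketch below filters a list of Job Event records (such as a `results` list) and returns each error's message and traceback. `error_events` is a hypothetical helper name; it uses `.get` because, as the listing example shows, some records omit fields such as `traceback`.

```python
def error_events(events):
    """Return (message, traceback) pairs for events of type "error"."""
    return [
        (event.get("message"), event.get("traceback"))
        for event in events
        if event.get("event-type") == "error"
    ]
```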
Common Job Parameters
The following parameters are available on Install, Configure, and Restart jobs:
Parameter | Description |
---|---|
job-scope | What level this job applies to: cluster, datacenter, or node. |
resource-id | The ID of the cluster, datacenter, or node on which the job will run. |
description | (Optional) A user-defined description of the job. |
continue-on-error | (Optional) Whether LCM should continue processing additional nodes when one node fails. |
concurrency-strategy | (Optional) Determines what nodes are eligible for LCM to deploy to simultaneously. Note that during deployment, nodes are generally restarted and become unavailable for some period of time. Some concurrency strategies will impact cluster availability and are only suitable for use in development clusters or on clusters that have been taken out of service. ‘default’ (or no value specified) behaves similarly to node-per-cluster-at-a-time for nodes in datacenters that have previously been installed by LCM. For nodes in new datacenters, where an install job has not yet been successfully completed, this strategy behaves similarly to cluster-at-a-time, which is presumed to be safe since a datacenter which has not yet finished being installed should not yet be servicing requests. ‘rack-per-dc-at-a-time’ ensures that at most one rack in each datacenter is offline at a time. Depending on cluster replication and token arrangement, this strategy may or may not affect cluster availability. ‘rack-per-cluster-at-a-time’ ensures that at most one rack within a single datacenter in the cluster is offline at a time. Depending on cluster replication and token arrangement, this strategy may or may not affect cluster availability. ‘cluster-at-a-time’ allows for every node in the cluster to be down at the same time. This strategy will affect cluster availability and should not be used on clusters that are actively servicing traffic. ‘dc-at-a-time’ ensures at most one datacenter is offline at a time. This strategy will affect the availability of the selected datacenter, and should not be used for datacenters that are actively servicing traffic. ‘node-per-dc-at-a-time’ ensures at most one node per datacenter is offline at a time. This strategy will not impact cluster availability, and is a good choice for clusters that have not been designed to withstand the loss of a rack without affecting availability. ‘node-per-cluster-at-a-time’ ensures at most one node in the cluster is offline at a time. This strategy will not impact cluster availability, although in most cases node-per-dc-at-a-time is equally safe and will result in jobs completing more quickly. |
batch-size | (Optional) Maximum number of nodes that this job will concurrently execute on. In conjunction with concurrency-strategy, determines how many nodes LCM will deploy to simultaneously. The concurrency-strategy determines the number of nodes eligible to run simultaneously. If the number of eligible nodes is larger than batch-size (as might be the case for cluster-at-a-time), the batch-size will cap the number of nodes being deployed-to simultaneously in order to prevent overloading network bandwidth or the LCM server. If the number of eligible nodes as determined by the concurrency strategy is lower than batch-size (as will be the case for node-per-cluster-at-a-time), then batch-size will have no effect. |
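The common parameters can be assembled into a request body before posting. The following is a minimal client-side sketch: `build_job_payload` and `nodes_deployed_at_once` are hypothetical helper names, and the validation simply mirrors the values listed in the table above (the server remains the authority on what it accepts). The second function illustrates how batch-size caps the number of nodes that the concurrency strategy makes eligible.

```python
JOB_SCOPES = ("cluster", "datacenter", "node")
CONCURRENCY_STRATEGIES = (
    "default",
    "rack-per-dc-at-a-time",
    "rack-per-cluster-at-a-time",
    "cluster-at-a-time",
    "dc-at-a-time",
    "node-per-dc-at-a-time",
    "node-per-cluster-at-a-time",
)


def build_job_payload(job_type, job_scope, resource_id, description=None,
                      continue_on_error=None, concurrency_strategy=None,
                      batch_size=None):
    """Assemble a request body from the common job parameters."""
    if job_scope not in JOB_SCOPES:
        raise ValueError(f"unknown job-scope: {job_scope!r}")
    if (concurrency_strategy is not None
            and concurrency_strategy not in CONCURRENCY_STRATEGIES):
        raise ValueError(
            f"unknown concurrency-strategy: {concurrency_strategy!r}")
    payload = {
        "job-type": job_type,
        "job-scope": job_scope,
        "resource-id": resource_id,
    }
    optional = {
        "description": description,
        "continue-on-error": continue_on_error,
        "concurrency-strategy": concurrency_strategy,
        "batch-size": batch_size,
    }
    # Omit optional parameters that were not supplied.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


def nodes_deployed_at_once(eligible_nodes, batch_size=None):
    """Cap the number of eligible nodes at batch-size, if one is given."""
    if batch_size is None:
        return eligible_nodes
    return min(eligible_nodes, batch_size)
```

For example, if cluster-at-a-time makes 30 nodes eligible and batch-size is 10, at most 10 nodes are deployed to at once; with node-per-cluster-at-a-time only one node is eligible, so the same batch-size has no effect.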
Install
Installs and configures DSE on a set of nodes. Initially this only makes sense at the cluster level, but can be used on an existing cluster to add nodes (re-balancing will be necessary afterward). It is safe to re-run install on a node.
- POST /api/v2/lcm/actions/install
This request will return immediately while the install job runs asynchronously.
The following parameter is available for install jobs in addition to the Common Job Parameters.
Parameter | Description |
---|---|
auto-bootstrap | You can explicitly set the auto_bootstrap value for the nodes being installed. |
Example:
Input:
{ "job-type":"install", "job-scope":"cluster", "resource-id":"7064583a-0656-4914-819e-937863793ab5", "auto-bootstrap":null, "continue-on-error":false, "concurrency-strategy": "default" }
curl -X POST -d '<example input>' http://localhost:8888/api/v2/lcm/actions/install
Output:
{ "node-id": null, "created-on": "2016-06-21T16:20:00.955Z", "description": null, "datacenter-name": null, "type": "job", "datacenter-id": null, "cluster-name": "c1", "job-data": { "continue-on-error": false, "job-type": "install", "auto-bootstrap": null }, "modified-by": "system", "job-type": "install", "node-name": null, "modified-on": "2016-06-21T16:20:00.955Z", "id": "dad24ef7-fa09-49d8-9070-0406c7a6c585", "created-by": "system", "cluster-id": "7064583a-0656-4914-819e-937863793ab5", "status": "PENDING", "dry-run": false }
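The same request can be prepared from Python using only the standard library. This is a sketch, not an official client: `build_action_request` is a hypothetical helper, the base URL is taken from the curl examples, and the Content-Type header is an assumption (the curl examples do not set one). The function only builds the request; nothing is sent until it is passed to `urllib.request.urlopen`.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8888"  # host and port from the curl examples


def build_action_request(job_type, payload):
    """Build (but do not send) a POST request for an LCM action."""
    url = f"{BASE_URL}/api/v2/lcm/actions/{job_type}"
    data = json.dumps(payload).encode("utf-8")
    # Assumption: the server accepts a JSON body with this Content-Type.
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


install_request = build_action_request("install", {
    "job-type": "install",
    "job-scope": "cluster",
    "resource-id": "7064583a-0656-4914-819e-937863793ab5",
    "auto-bootstrap": None,
    "continue-on-error": False,
    "concurrency-strategy": "default",
})
```

Sending it with `urllib.request.urlopen(install_request)` and decoding the response body with `json.load` should give the created Job object, whose `id` can then be used to query the job.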
Configure
The only difference between install and configure is that configure jobs do not attempt to install packages, set the CQL password in DataStax Enterprise, or register the cluster with OpsCenter. A configure job pushes the current configs to the managed nodes and restarts them.
For more information on all the parameters available and a description of each, make a GET request to /api/v2/lcm/actions/configure
- POST /api/v2/lcm/actions/configure
The parameters that are available for Configure jobs are described at Common Job Parameters.
Example:
Input:
{ "job-type":"configure", "job-scope":"cluster", "resource-id":"7064583a-0656-4914-819e-937863793ab5", "concurrency-strategy": "default" }
curl -X POST -d '<example input>' http://localhost:8888/api/v2/lcm/actions/configure
The output is similar to that of POST /api/v2/lcm/actions/install.
Upgrade
Upgrade jobs allow a cluster to be upgraded to a newer version of DSE. However, to avoid downtime, this job is not allowed to be run at the cluster level. You must provide a job-scope of ‘datacenter’ or ‘node’. Also be aware that profile cloning (see Cloning Configuration Profiles) should be done prior to this, and the datacenter or node being upgraded will need to be assigned the new profile. Please be sure to read the OpsCenter documentation on how to Upgrade DSE through LCM. There are many potential pitfalls if you do not perform the proper steps to prepare for upgrade.
For more information on all the parameters available and a description of each, make a GET request to /api/v2/lcm/actions/upgrade
- POST /api/v2/lcm/actions/upgrade
Example:
Input:
{ "job-type": "upgrade", "job-scope": "datacenter", "resource-id": "7064583a-0656-4914-819e-937863793ab5" }
curl -X POST -d '<example input>' http://localhost:8888/api/v2/lcm/actions/upgrade
The output is similar to that of POST /api/v2/lcm/actions/install.
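Since upgrade jobs cannot be run at the cluster level, a client can reject that scope before sending anything. The sketch below is a hypothetical client-side check (`build_upgrade_payload` is not part of the API) that builds the request body shown in the example and raises an error for any job-scope other than datacenter or node.

```python
UPGRADE_SCOPES = ("datacenter", "node")


def build_upgrade_payload(job_scope, resource_id):
    """Build an upgrade request body, refusing cluster-level scope."""
    if job_scope not in UPGRADE_SCOPES:
        raise ValueError(
            "upgrade jobs require a job-scope of 'datacenter' or 'node', "
            f"got {job_scope!r}"
        )
    return {
        "job-type": "upgrade",
        "job-scope": job_scope,
        "resource-id": resource_id,
    }
```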
Restart
Does a rolling restart of the nodes specified.
- POST /api/v2/lcm/actions/restart
The parameters that are available for Restart jobs are described at Common Job Parameters.
Example:
Input:
{ "job-type":"restart", "job-scope":"cluster", "resource-id":"7064583a-0656-4914-819e-937863793ab5" }
curl -X POST -d '<example input>' http://localhost:8888/api/v2/lcm/actions/restart
The output is similar to that of POST /api/v2/lcm/actions/install.
Terminate
Technically not a job, this action is for killing jobs that are pending or running.
- POST /api/v2/lcm/actions/terminate
The following parameters are available for the terminate action:
Parameter | Description |
---|---|
job-id | The UUID of the job to terminate. |
reason | An optional description of why the job is being terminated. |
Example:
curl -X POST -d '{"job-id":"27138a41-59b6-487b-892a-3a211201fb51"}' http://localhost:8888/api/v2/lcm/actions/terminate
Output:
"Termination in progress"