Performing Cluster Operations¶
Cluster operations include initiating administrative actions (such as garbage collection) on nodes in a Cassandra or DSE cluster, rebalancing a cluster, and managing API requests sent to the cluster.
Node Administration Methods¶
- POST /{cluster_id}/ops¶
Initiate a bulk set of operations on one or more nodes.
Body: A JSON dictionary with the following keys:
- ips: List of IPs that represent the nodes the operations will run on.
- action: The operation that should be performed on each node. Valid values are cleanup, compact, flush, perform_gc, repair, restart, start, and stop.
- is_rolling: Whether the jobs run in a rolling or parallel fashion.
- sleep: Seconds between each grouping of jobs. Defaults to 60.
- args: A list of arguments to pass to each operation.
Returns a Request ID.
Example:
curl -X POST http://127.0.0.1:8888/Test_Cluster/ops -d '{"ips":["127.0.0.1"],"action":"cleanup", "is_rolling": true, "sleep": 1, "args":["OpsCenter", "events"]}'
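The same bulk-operation call can be sketched in Python. This is a minimal illustration, not an official client: the helper name `build_bulk_ops_request` and the `VALID_ACTIONS` set are assumptions introduced here, and the function only builds the request rather than sending it.

```python
import json
from urllib.request import Request

# Actions listed above for POST /{cluster_id}/ops.
VALID_ACTIONS = {"cleanup", "compact", "flush", "perform_gc",
                 "repair", "restart", "start", "stop"}


def build_bulk_ops_request(base_url, cluster_id, ips, action,
                           is_rolling=True, sleep=60, args=None):
    """Build (but do not send) a bulk-operation request (hypothetical helper)."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    body = {
        "ips": list(ips),
        "action": action,
        "is_rolling": is_rolling,
        "sleep": sleep,
        "args": list(args or []),
    }
    return Request(
        url=f"{base_url}/{cluster_id}/ops",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
    )


# Mirrors the curl example above.
req = build_bulk_ops_request("http://127.0.0.1:8888", "Test_Cluster",
                             ["127.0.0.1"], "cleanup", sleep=1,
                             args=["OpsCenter", "events"])
```

Passing `req` to `urllib.request.urlopen()` would send it; the response body is expected to be the JSON-encoded Request ID.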
- GET /{cluster_id}/ops/gc/{node_ip}¶
Initiate JVM garbage collection on a Node.
Path arguments:
- cluster_id – The ID of a cluster returned from GET /cluster-configs.
- node_ip – IP address of the target Node.
Returns null.
Example:
curl -X GET http://127.0.0.1:8888/Test_Cluster/ops/gc/1.2.3.4
- PUT /{cluster_id}/ops/move/{node_ip}¶
Assign a new token to the node.
Path arguments:
- cluster_id – The ID of a cluster returned from GET /cluster-configs.
- node_ip – Node to be assigned a new token.
Body: New token to assign to node.
Returns a Request ID.
Example:
curl -X PUT http://127.0.0.1:8888/Test_Cluster/ops/move/10.11.12.72 -d '"85070591730234615865843651857942052864"'
Output:
"72ff69b2-9cf5-4777-a600-9173b3fe7e6a"
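Note that the body in the example above is a JSON string (the token wrapped in double quotes), not a bare number. A minimal Python sketch of building that request follows; `build_move_request` is a hypothetical helper name, and the request is only constructed, not sent.

```python
import json
from urllib.request import Request


def build_move_request(base_url, cluster_id, node_ip, token):
    """Build (but do not send) a PUT that assigns `token` to `node_ip`."""
    # json.dumps on a str yields a quoted JSON string, matching the curl example.
    body = json.dumps(str(token))
    return Request(
        url=f"{base_url}/{cluster_id}/ops/move/{node_ip}",
        data=body.encode("utf-8"),
        method="PUT",
    )


move_req = build_move_request("http://127.0.0.1:8888", "Test_Cluster",
                              "10.11.12.72",
                              85070591730234615865843651857942052864)
```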
- GET /{cluster_id}/ops/drain/{node_ip}¶
Initiate a drain operation to flush all memtables from the node.
Path arguments:
- cluster_id – The ID of a cluster returned from GET /cluster-configs.
- node_ip – Node to be flushed of memtables.
Returns null.
Example:
curl -X GET http://127.0.0.1:8888/Test_Cluster/ops/drain/1.2.3.4
- POST /{cluster_id}/ops/decommission/{node_ip}¶
Initiate decommissioning of a node.
Path arguments:
- cluster_id – The ID of a cluster returned from GET /cluster-configs.
- node_ip – Node to be decommissioned.
Returns null.
Example:
curl -X POST http://127.0.0.1:8888/Test_Cluster/ops/decommission/1.2.3.4
- POST /{cluster_id}/ops/cleanup/{node_ip}/{ks_name}¶
Initiate a cleanup operation for the specified keyspace.
Path arguments:
- cluster_id – The ID of a cluster returned from GET /cluster-configs.
- node_ip – Node that initiates cleaning of the keyspace.
- ks_name – Name of the keyspace to be cleaned.
Body: List of column families to clean up. If empty, all column families will be cleaned up.
Returns null.
Example
curl -X POST http://127.0.0.1:8888/Test_Cluster/ops/cleanup/1.2.3.4/Keyspace1 -d '["ColFam1", "ColFam2"]'
- POST /{cluster_id}/ops/flush/{node_ip}/{ks_name}¶
Flush memtables for a keyspace.
Path arguments:
- cluster_id – The ID of a cluster returned from GET /cluster-configs.
- node_ip – Node to be flushed of memtables for a keyspace.
- ks_name – Keyspace of the memtables to be flushed.
Body: List of column families to flush. If empty, all column families will be flushed.
Returns null.
Example
curl -X POST http://127.0.0.1:8888/Test_Cluster/ops/flush/1.2.3.4/Keyspace1 -d '["ColFam1", "ColFam2"]'
- POST /{cluster_id}/ops/repair/{node_ip}/{ks_name}¶
Initiates repair of a keyspace.
Path arguments:
- cluster_id – The ID of a cluster returned from GET /cluster-configs.
- node_ip – Node that initiates repair.
- ks_name – Keyspace to be repaired.
Body: A JSON dictionary with the following keys:
- is_sequential: Required for Cassandra 1.1 and up. Will throw an error if used with earlier versions. A boolean (0 or 1) indicating whether to run the repair sequentially.
- is_local: Required for Cassandra 1.2 and up. Will throw an error if used with earlier versions. A boolean (0 or 1) indicating whether to use only nodes in the same data center during the repair. Defaults to False.
- primary_range: A boolean (0 or 1) indicating whether to repair just the primary range for that node; otherwise all ranges are repaired. Defaults to False.
- cfs: List of column families to repair. If this is empty, all column families will be repaired.
Returns null.
Example
curl -X POST http://127.0.0.1:8888/Test_Cluster/ops/repair/1.2.3.4/Keyspace1 -d '{"is_sequential": 1, "cfs":["ColFam1", "ColFam2"]}'
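Because is_sequential and is_local throw an error on Cassandra versions that predate them, a client may want to include those keys only when they are explicitly set. The following Python sketch shows one way to assemble the body; `build_repair_body` is a hypothetical helper, and encoding the booleans as 0 or 1 follows the description above.

```python
import json


def build_repair_body(cfs=(), is_sequential=None, is_local=None,
                      primary_range=False):
    """Return the JSON body for POST /{cluster_id}/ops/repair/{node_ip}/{ks_name}."""
    body = {
        "primary_range": int(bool(primary_range)),
        "cfs": list(cfs),  # empty list means "repair all column families"
    }
    # Version-dependent keys are omitted unless the caller sets them.
    if is_sequential is not None:
        body["is_sequential"] = int(bool(is_sequential))
    if is_local is not None:
        body["is_local"] = int(bool(is_local))
    return json.dumps(body)


repair_body = build_repair_body(["ColFam1", "ColFam2"], is_sequential=True)
```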
- POST /{cluster_id}/ops/compact/{node_ip}/{ks_name}¶
Initiates a major compaction on a keyspace.
Path arguments:
- cluster_id – The ID of a cluster returned from GET /cluster-configs.
- node_ip – Node that initiates the compaction.
- ks_name – Keyspace to be compacted.
Body: List of column families to compact. If this is empty, all column families will be compacted.
Returns null.
Example
curl -X POST http://127.0.0.1:8888/Test_Cluster/ops/compact/1.2.3.4/Keyspace1 -d '["ColFam1", "ColFam2"]'
Process Management Methods¶
- POST /{cluster_id}/ops/start/{node_ip}¶
Start the Cassandra/DSE process on a single node.
Path arguments:
- cluster_id – The ID of a cluster returned from GET /cluster-configs.
- node_ip – Node to be started.
Returns a Request ID.
Example:
curl -X POST http://127.0.0.1:8888/Test_Cluster/ops/start/10.11.12.72
Output:
"a34814a6-4896-11e2-a563-e0b9a54a6d93"
- POST /{cluster_id}/ops/stop/{node_ip}¶
Stop the Cassandra/DSE process on a single node.
Path arguments:
- cluster_id – The ID of a cluster returned from GET /cluster-configs.
- node_ip – Node to be stopped.
Body: A JSON dictionary with an optional key:
- drain_first: A boolean to first perform a drain operation before stopping a node.
Returns a Request ID.
Example:
curl -X POST http://127.0.0.1:8888/Test_Cluster/ops/stop/10.11.12.72 -d '{"drain_first": true}'
Output:
"c0d81d54-4896-11e2-a563-e0b9a54a6d93"
- POST /{cluster_id}/ops/restart/{node_ip}¶
Restart the Cassandra/DSE process on a single node.
Path arguments:
- cluster_id – The ID of a cluster returned from GET /cluster-configs.
- node_ip – Node to be restarted.
Body: A JSON dictionary with two optional keys:
- wait_for_thrift: A boolean to wait for thrift to start on restart.
- drain_first: A boolean to first perform a drain operation before stopping a node.
Returns a Request ID.
Example:
curl -X POST http://127.0.0.1:8888/Test_Cluster/ops/restart/10.11.12.72 -d '{"wait_for_thrift": true, "drain_first": true}'
Output:
"e2212500-4896-11e2-a563-e0b9a54a6d93"
- POST /{cluster_id}/ops/restart¶
Perform a rolling restart of the entire cluster or a select list of nodes.
Path arguments:
- cluster_id – The ID of a cluster returned from GET /cluster-configs.
Body: A JSON dictionary with three optional keys:
- sleep: Amount of time in seconds to sleep between restarting each node. Defaults to 60.
- ips: A list of ips to restart. If left empty, all nodes will be restarted (this is the default behavior).
- drain_first: A boolean to first perform a drain operation before stopping a node.
Returns a Request ID.
Example:
curl -X POST http://127.0.0.1:8888/Test_Cluster/ops/restart
Output:
"e2212500-4896-11e2-a563-e0b9a54a6d93"
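All three body keys are optional, and the curl example above sends no body at all. A small Python sketch of that behavior follows; `build_rolling_restart_body` is a hypothetical helper that returns None when no option is set, so the caller can omit the body just as the example does.

```python
import json


def build_rolling_restart_body(ips=None, sleep=None, drain_first=None):
    """Return the JSON body for POST /{cluster_id}/ops/restart, or None."""
    body = {}
    if ips:  # an empty or missing list means "restart all nodes"
        body["ips"] = list(ips)
    if sleep is not None:
        body["sleep"] = sleep
    if drain_first is not None:
        body["drain_first"] = bool(drain_first)
    # No options set: send no body, as in the curl example.
    return json.dumps(body) if body else None


restart_body = build_rolling_restart_body(
    ips=["10.11.12.72", "10.11.12.152"], sleep=120, drain_first=True)
```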
Cluster Rebalancing Methods¶
- GET /{cluster_id}/ops/rebalance¶
Return a list of proposed moves to run to balance a cluster. Will throw an error if called on a cluster using vnodes.
Path arguments: cluster_id – The ID of a cluster returned from GET /cluster-configs.
Returns a list of moves, where each move is a token and the IP address of its assigned node. The result of this call is passed to POST /{cluster_id}/ops/rebalance.
Example
curl http://127.0.0.1:8888/Test_Cluster/ops/rebalance
Output:
[ [ "85070591730234615865843651857942052864", "10.11.12.152" ] ]
- POST /{cluster_id}/ops/rebalance¶
Run the specified list of moves to balance a cluster. Will throw an error if called on a cluster using vnodes.
Path arguments: cluster_id – The ID of a cluster returned from GET /cluster-configs.
Opt. params: sleep – An optional number of seconds to wait between each move.
Body: A list of moves to run to balance this cluster. This is typically the result of GET /{cluster_id}/ops/rebalance.
Returns a Request ID for determining the status of, or cancelling, a running rebalance.
Example
curl -X POST http://127.0.0.1:8888/Test_Cluster/ops/rebalance -d '[ [ "85070591730234615865843651857942052864", "10.11.12.152" ] ]'
Output:
"e330b179-1b9f-40c2-a2f5-d2f3d24aa85c"
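The two rebalance calls are meant to be chained: the list returned by the GET is posted back unchanged. The Python sketch below builds the POST from such a list. The helper name `build_rebalance_request` is hypothetical, and passing sleep as a query-string parameter is an assumption, since the text above does not say how the optional parameter is transmitted.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_rebalance_request(base_url, cluster_id, moves, sleep=None):
    """Build (but do not send) the POST that runs a list of proposed moves."""
    url = f"{base_url}/{cluster_id}/ops/rebalance"
    if sleep is not None:
        # Assumption: the optional sleep parameter goes in the query string.
        url += "?" + urlencode({"sleep": sleep})
    return Request(url, data=json.dumps(moves).encode("utf-8"), method="POST")


# `proposed` stands in for the parsed output of GET /{cluster_id}/ops/rebalance.
proposed = [["85070591730234615865843651857942052864", "10.11.12.152"]]
rebalance_req = build_rebalance_request("http://127.0.0.1:8888",
                                        "Test_Cluster", proposed, sleep=30)
```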
Cluster Services¶
- GET /{cluster_id}/services¶
Get the status of cluster services.
Returns a dictionary with service names as keys and the status, parameters, and associated activity or progress of the service as the values.
Example
curl "http://localhost:8888/Test_Cluster/services"
{ "repair": { "progress": { "completed": 26, "total": 256 }, "status": { "parameters": { "time_to_completion": 100000 }, "status": true } } }
Cluster Repair Service¶
- POST /{cluster_id}/services/repair¶
Run the cluster repair service with the given parameters.
Body: A dictionary of repair service parameters.
- time_to_completion: The time in seconds to complete one full cycle of the repair service; i.e., the time to complete a repair of the entire cluster.
- DELETE /{cluster_id}/services/repair¶
Stop the cluster repair service.
- GET /{cluster_id}/services/repair¶
Get the status of the repair service.
Returns a dictionary describing the status and parameters of the service.
Example
curl "http://127.0.0.1:8888/Test_Cluster/services/repair"
{ "status": true, "parameters": {"time_to_completion": 100000} }
- GET /{cluster_id}/services/repair/progress¶
Get the progress of the repair service.
Returns a dictionary containing the total number of repair jobs in the current repair cycle and the number of completed jobs.
Example
curl "http://127.0.0.1:8888/Test_Cluster/services/repair/progress"
{ "completed": 76, "total": 256 }
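As a small illustration, the progress payload can be turned into a percentage. The function name `repair_progress_percent` is hypothetical, and treating a total of zero as 0% is a choice made for this sketch.

```python
import json


def repair_progress_percent(payload):
    """Convert a repair progress JSON document into percent complete."""
    progress = json.loads(payload)
    total = progress["total"]
    if total == 0:
        return 0.0  # no jobs in the current cycle; avoid dividing by zero
    return 100.0 * progress["completed"] / total


pct = repair_progress_percent('{ "completed": 76, "total": 256 }')
print(f"{pct:.1f}% of repair jobs completed")
```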
- GET /{cluster_id}/services/repair/invalid_keyspaces¶
Get the list of invalid keyspaces for the repair service.
Returns a list of the names of keyspaces that are considered invalid for the repair service: any keyspace that uses SimpleStrategy when the cluster spans multiple datacenters. If the cluster does not have multiple datacenters, an empty list is returned.
Example
curl "http://127.0.0.1:8888/Test_Cluster/services/repair/invalid_keyspaces"
[ "Keyspace1", "Keyspace2" ]
Request Management Methods¶
- Request¶
Requests are the method that OpsCenter uses to track potentially long-running requests that must be completed asynchronously. When these potentially long-running API calls are made, opscenterd will immediately return a Request ID that can be used to look up the status of the request.
Once a Request is started, you can fetch the status information for it until opscenterd is restarted or a large number of Requests have been started.
A Request status takes the following form:
{ "id": ID, "state": STATE, "started": STARTED, "finished": FINISHED, "cluster_id": CLUSTER_ID, "details": DETAILS }
Data:
- ID (string) – The unique UUID for this Request. When an operation is potentially long-running, opscenterd will return this ID immediately.
- STATE (string) – Either “running”, “success”, or “error”.
- STARTED (int) – A unix timestamp representing when the Request started.
- FINISHED (int) – A unix timestamp representing when the Request finished, or null if it has not finished yet.
- CLUSTER_ID (string) – The name of the cluster that the Request is operating on.
- DETAILS – Typically a string containing a status or error message, but may be a dictionary in the form {<subrequest_id>: <Request>} when the Request holds a collection of subrequests.
Content Types: - JSON
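Because DETAILS may nest further Requests, a client that inspects a status has to walk it recursively. The sketch below shows one way to do that in Python; the function names and the sample data are hypothetical and only follow the form described above.

```python
def is_finished(request):
    """True once the Request and every nested subrequest has stopped running."""
    if request["state"] == "running":
        return False
    details = request.get("details")
    if isinstance(details, dict):  # a collection of subrequests
        return all(is_finished(sub) for sub in details.values())
    return True


def collect_errors(request):
    """Return (request id, message) pairs for every failed leaf Request."""
    details = request.get("details")
    if isinstance(details, dict):
        errors = []
        for sub in details.values():
            errors.extend(collect_errors(sub))
        return errors
    if request["state"] == "error":
        return [(request["id"], details)]
    return []


# Hypothetical sample shaped like the Request form above.
sample = {
    "id": "parent", "state": "error", "started": 1334856122,
    "finished": 1334856200, "cluster_id": "Test_Cluster",
    "details": {
        "sub-1": {"id": "sub-1", "state": "success", "started": 1334856122,
                  "finished": 1334856150, "cluster_id": "Test_Cluster",
                  "details": "ok"},
        "sub-2": {"id": "sub-2", "state": "error", "started": 1334856150,
                  "finished": 1334856200, "cluster_id": "Test_Cluster",
                  "details": "node unreachable"},
    },
}
```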
- GET /request/{request_id}/status¶
Check the status of an asynchronous request sent to OpsCenter.
Path arguments:
- request_id – The ID returned by the API call that triggered the request.
Returns a dictionary describing the status of the request.
Example
curl http://127.0.0.1:8888/request/6b6b15aa-df8a-43f1-aab3-efce6b8589e4/status
{ "status": "running", "started": 1334856122, "error_message": null, "finished": null, "moves": [ { "status": null, "ip": "10.100.100.100", "old": "2", "new": "85070591730234615865843651857942052864" } ], "id": "6b6b15aa-df8a-43f1-aab3-efce6b8589e4" }
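A typical client polls this endpoint until the request leaves the running state. The following Python sketch is one way to structure that loop, under stated assumptions: `fetch_status` and `wait_for_request` are hypothetical names, and because the Request form uses a "state" key while the example output above uses "status", the loop checks both.

```python
import json
import time
from urllib.request import urlopen


def fetch_status(base_url, request_id):
    """GET /request/{request_id}/status and decode the JSON response."""
    with urlopen(f"{base_url}/request/{request_id}/status") as resp:
        return json.load(resp)


def wait_for_request(fetch, interval=5.0, timeout=600.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Call fetch() until the request is no longer running, then return it."""
    deadline = clock() + timeout
    while True:
        status = fetch()
        # Accept either key name; see the note above about "state" vs "status".
        state = status.get("state", status.get("status"))
        if state != "running":
            return status
        if clock() >= deadline:
            raise TimeoutError("request still running after timeout")
        sleep(interval)
```

In use, `fetch` would be something like `lambda: fetch_status("http://127.0.0.1:8888", request_id)`. Keeping it injectable lets the loop be exercised without a live opscenterd.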
- POST /request/{request_id}/cancel¶
Cancel an asynchronous request sent to OpsCenter. Not all requests can be cancelled.
Path arguments:
- request_id – The ID returned by the API call that triggered the request.
Returns null.
Example
curl -X POST http://127.0.0.1:8888/request/6b6b15aa-df8a-43f1-aab3-efce6b8589e4/cancel
The request is canceled.
- GET /{cluster_id}/request/{request_type}¶
List requests of a particular type. Defaults to the latest request of that type.
Path arguments:
- cluster_id – The ID of a cluster returned from GET /cluster-configs.
- request_type – Either “rolling-restart”, “restore”, or “bulk-operations”.
Query params: list_all – A boolean (0 or 1) indicating whether all of the requests should be returned or just the latest. Defaults to 0 (false).
Returns a Request ID, or an array of Request IDs if list_all is true.
Example
curl -X GET http://127.0.0.1:8888/Test_Cluster/request/rolling-restart
"8f4f71e7-65d3-41a7-bb1a-789af07dbd73"
curl -X GET "http://127.0.0.1:8888/Test_Cluster/request/rolling-restart?list_all=1"
[ "35dd37a5-4170-4694-9253-faa9532d47b6", "8f4f71e7-65d3-41a7-bb1a-789af07dbd73" ]
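The two URL variants in the examples above can be produced by one small helper. This is a sketch only; `request_list_url` and `REQUEST_TYPES` are hypothetical names introduced here.

```python
from urllib.parse import urlencode

# Request types listed above for GET /{cluster_id}/request/{request_type}.
REQUEST_TYPES = {"rolling-restart", "restore", "bulk-operations"}


def request_list_url(base_url, cluster_id, request_type, list_all=False):
    """Build the URL that returns the latest Request ID, or all of them."""
    if request_type not in REQUEST_TYPES:
        raise ValueError(f"unknown request type: {request_type!r}")
    url = f"{base_url}/{cluster_id}/request/{request_type}"
    if list_all:
        url += "?" + urlencode({"list_all": 1})
    return url


latest_url = request_list_url("http://127.0.0.1:8888", "Test_Cluster",
                              "rolling-restart")
all_url = request_list_url("http://127.0.0.1:8888", "Test_Cluster",
                           "rolling-restart", list_all=True)
```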