Performing Cluster Operations
Cluster operations include initiating administrative actions (such as garbage collection) on nodes in a Cassandra or DSE cluster, rebalancing a cluster, and managing API requests sent to the cluster.
Node Administration Methods
- POST /{cluster_id}/ops
Initiate a bulk set of operations on one or more nodes.
Body: A JSON dictionary with the following keys:
- ips: List of IPs of the nodes the operations will run on.
- action: The operation to perform on each node. Valid values are cleanup, compact, flush, perform_gc, repair, restart, start, and stop.
- is_rolling: Whether the jobs run in a rolling fashion rather than in parallel.
- sleep: Seconds to wait between each grouping of jobs. Default is 60.
- args: A list of arguments to pass to each operation.
Returns a Request ID.
Example:
curl -X POST http://127.0.0.1:8888/Test_Cluster/ops -d '{"ips":["127.0.0.1"],"action":"cleanup", "is_rolling": true, "sleep": 1, "args":["OpsCenter", "events"]}'
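The same bulk request can be assembled programmatically. The following is a minimal Python sketch, not an official client: the helper name build_bulk_ops_request and the default base URL are assumptions made for illustration, and the request is only built, not sent.

```python
import json
import urllib.request

# Action names as listed in the body description above.
VALID_ACTIONS = {"cleanup", "compact", "flush", "perform_gc",
                 "repair", "restart", "start", "stop"}


def build_bulk_ops_request(cluster_id, ips, action, is_rolling=True,
                           sleep=60, args=None,
                           base_url="http://127.0.0.1:8888"):
    """Build (but do not send) a POST /{cluster_id}/ops request.

    Hypothetical helper; base_url is an assumed opscenterd address.
    """
    if action not in VALID_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    body = {
        "ips": list(ips),
        "action": action,
        "is_rolling": is_rolling,
        "sleep": sleep,
        "args": list(args or []),
    }
    return urllib.request.Request(
        f"{base_url}/{cluster_id}/ops",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
    )


# Mirrors the curl example above.
req = build_bulk_ops_request("Test_Cluster", ["127.0.0.1"], "cleanup",
                             sleep=1, args=["OpsCenter", "events"])
```

Passing req to urllib.request.urlopen would send it; the response body is the Request ID described above.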
- GET /{cluster_id}/ops/gc/{node_ip}
Initiate JVM garbage collection on a node.
Path arguments:
- cluster_id – The ID of a cluster returned from GET /cluster-configs.
- node_ip – IP address of the target node.
Returns null.
Example:
curl -X GET http://127.0.0.1:8888/Test_Cluster/ops/gc/1.2.3.4
- PUT /{cluster_id}/ops/move/{node_ip}
Assign a new token to the node.
Path arguments:
- cluster_id – The ID of a cluster returned from GET /cluster-configs.
- node_ip – Node to be assigned a new token.
Body: New token to assign to node.
Returns a Request ID.
Example:
curl -X PUT http://127.0.0.1:8888/Test_Cluster/ops/move/10.11.12.72 -d '"85070591730234615865843651857942052864"'
Output:
"72ff69b2-9cf5-4777-a600-9173b3fe7e6a"
- GET /{cluster_id}/ops/drain/{node_ip}
Initiate a drain operation to flush all memtables from the node.
Path arguments:
- cluster_id – The ID of a cluster returned from GET /cluster-configs.
- node_ip – Node to be flushed of memtables.
Returns null.
Example:
curl -X GET http://127.0.0.1:8888/Test_Cluster/ops/drain/1.2.3.4
- POST /{cluster_id}/ops/decommission/{node_ip}
Initiate decommissioning of a node.
Path arguments:
- cluster_id – The ID of a cluster returned from GET /cluster-configs.
- node_ip – Node to be decommissioned.
Returns null.
Example:
curl -X POST http://127.0.0.1:8888/Test_Cluster/ops/decommission/1.2.3.4
- POST /{cluster_id}/ops/cleanup/{node_ip}/{ks_name}
Initiate a cleanup operation for the specified keyspace.
Path arguments:
- cluster_id – The ID of a cluster returned from GET /cluster-configs.
- node_ip – Node that initiates cleaning of the keyspace.
- ks_name – Name of the keyspace to be cleaned. If empty, all keyspaces will be cleaned up.
Body: List of tables to clean up. If empty, all tables will be cleaned up.
Returns null.
Example
curl -X POST http://127.0.0.1:8888/Test_Cluster/ops/cleanup/1.2.3.4/Keyspace1 -d '["ColFam1", "ColFam2"]'
- POST /{cluster_id}/ops/flush/{node_ip}/{ks_name}
Flush memtables for a keyspace.
Path arguments:
- cluster_id – The ID of a cluster returned from GET /cluster-configs.
- node_ip – Node to be flushed of memtables for a keyspace.
- ks_name – Keyspace of the memtables to be flushed. If empty, all keyspaces will be flushed.
Body: List of tables to flush. If empty, all tables will be flushed.
Returns null.
Example
curl -X POST http://127.0.0.1:8888/Test_Cluster/ops/flush/1.2.3.4/Keyspace1 -d '["ColFam1", "ColFam2"]'
- POST /{cluster_id}/ops/repair/{node_ip}/{ks_name}
Initiate a repair of a keyspace.
Path arguments:
- cluster_id – The ID of a cluster returned from GET /cluster-configs.
- node_ip – Node that initiates repair.
- ks_name – Keyspace to be repaired.
Body: A JSON dictionary with the following keys:
- is_sequential: A boolean indicating whether to run the repair sequentially. Default is true.
- is_local: A boolean indicating whether to use only nodes in the same datacenter during the repair. Default is false.
- primary_range: A boolean indicating whether to repair only the primary range for that node. If false, all ranges are repaired. Default is false.
- cfs: List of tables (column families) to repair. If empty, all tables will be repaired.
Returns null.
Example
curl -X POST http://127.0.0.1:8888/Test_Cluster/ops/repair/1.2.3.4/Keyspace1 -d '{"is_sequential": false, "cfs":["ColFam1", "ColFam2"]}'
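Hand-written JSON bodies are easy to get wrong, so it can help to generate the repair body from a function. This is a small Python sketch; the function name repair_body is hypothetical, and its defaults simply restate the defaults documented above.

```python
import json


def repair_body(cfs=(), is_sequential=True, is_local=False,
                primary_range=False):
    """Return the body dictionary for a keyspace repair request.

    Hypothetical helper; the keys and defaults follow the list above.
    """
    return {
        "is_sequential": bool(is_sequential),
        "is_local": bool(is_local),
        "primary_range": bool(primary_range),
        # An empty list means "repair all tables".
        "cfs": list(cfs),
    }


# Serialize with json.dumps so quoting and booleans are always valid JSON.
payload = json.dumps(repair_body(cfs=["ColFam1", "ColFam2"],
                                 is_sequential=False))
```

The resulting payload string can be passed to curl with -d or sent with any HTTP client.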
- POST /{cluster_id}/ops/compact/{node_ip}/{ks_name}
Initiate a major compaction on a keyspace.
Path arguments:
- cluster_id – The ID of a cluster returned from GET /cluster-configs.
- node_ip – Node that initiates the compaction.
- ks_name – Keyspace to be compacted. If empty, all keyspaces will be compacted.
Body: List of tables to compact. If empty, all tables will be compacted.
Returns null.
Example
curl -X POST http://127.0.0.1:8888/Test_Cluster/ops/compact/1.2.3.4/Keyspace1 -d '["ColFam1", "ColFam2"]'
Process Management Methods
- POST /{cluster_id}/ops/start/{node_ip}
Start the Cassandra/DSE process on a single node.
Path arguments:
- cluster_id – The ID of a cluster returned from GET /cluster-configs.
- node_ip – Node to be started.
Returns a Request ID.
Example:
curl -X POST http://127.0.0.1:8888/Test_Cluster/ops/start/10.11.12.72
Output:
"a34814a6-4896-11e2-a563-e0b9a54a6d93"
- POST /{cluster_id}/ops/stop/{node_ip}
Stop the Cassandra/DSE process on a single node.
Path arguments:
- cluster_id – The ID of a cluster returned from GET /cluster-configs.
- node_ip – Node to be stopped.
Body: A JSON dictionary with an optional key:
- drain_first: A boolean indicating whether to perform a drain operation before stopping the node.
Returns a Request ID.
Example:
curl -X POST http://127.0.0.1:8888/Test_Cluster/ops/stop/10.11.12.72 -d '{"drain_first": true}'
Output:
"c0d81d54-4896-11e2-a563-e0b9a54a6d93"
- POST /{cluster_id}/ops/restart/{node_ip}
Restart the Cassandra/DSE process on a single node.
Path arguments:
- cluster_id – The ID of a cluster returned from GET /cluster-configs.
- node_ip – Node to be restarted.
Body: A JSON dictionary with two optional keys:
- wait_for_cassandra: A boolean indicating whether to wait until DSE has fully started before the asynchronous request completes.
- drain_first: A boolean indicating whether to perform a drain operation before stopping the node.
Returns a Request ID.
Example:
curl -X POST http://127.0.0.1:8888/Test_Cluster/ops/restart/10.11.12.72 -d '{"wait_for_cassandra": true, "drain_first": true}'
Output:
"e2212500-4896-11e2-a563-e0b9a54a6d93"
- POST /{cluster_id}/ops/restart
Perform a rolling restart of the entire cluster or a select list of nodes.
Path arguments:
- cluster_id – The ID of a cluster returned from GET /cluster-configs.
Body: A JSON dictionary with three optional keys:
- sleep: Time in seconds to sleep between restarting each node. Default is 60.
- ips: A list of IPs to restart. If left empty, all nodes will be restarted (this is the default behavior).
- drain_first: A boolean indicating whether to perform a drain operation before stopping each node.
Returns a Request ID.
Example:
curl -X POST http://127.0.0.1:8888/Test_Cluster/ops/restart
Output:
"e2212500-4896-11e2-a563-e0b9a54a6d93"
Cluster Rebalancing Methods
- GET /{cluster_id}/ops/rebalance
Return a list of proposed moves to run to balance a cluster. Throws an error if called on a cluster using vnodes.
Path arguments:
- cluster_id – The ID of a cluster returned from GET /cluster-configs.
Returns a list of moves, where each move is a token and the IP address of its assigned node. The result of this call is passed to POST /{cluster_id}/ops/rebalance.
Example
curl http://127.0.0.1:8888/Test_Cluster/ops/rebalance
Output:
[ [ "85070591730234615865843651857942052864", "10.11.12.152" ] ]
- POST /{cluster_id}/ops/rebalance
Run the specified list of moves to balance a cluster. Throws an error if called on a cluster using vnodes.
Path arguments:
- cluster_id – The ID of a cluster returned from GET /cluster-configs.
Opt. params:
- sleep – An optional number of seconds to wait between each move.
Body: A list of moves to run to balance this cluster. This is typically the result of GET /{cluster_id}/ops/rebalance.
Returns a Request ID for determining the status of, or cancelling, a running rebalance.
Example
curl -X POST http://127.0.0.1:8888/Test_Cluster/ops/rebalance -d '[ [ "85070591730234615865843651857942052864", "10.11.12.152" ] ]'
Output:
"e330b179-1b9f-40c2-a2f5-d2f3d24aa85c"
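Because the GET result feeds directly into the POST body, the two rebalance calls chain naturally. The Python sketch below shows one way to turn a proposed-moves response into the follow-up request. The helper name build_rebalance_request and the base URL are assumptions, the optional sleep parameter is omitted, and the request is built without being sent.

```python
import json
import urllib.request


def build_rebalance_request(cluster_id, moves,
                            base_url="http://127.0.0.1:8888"):
    """Build (but do not send) a POST /{cluster_id}/ops/rebalance request.

    Hypothetical helper. `moves` is the decoded JSON list returned by
    GET /{cluster_id}/ops/rebalance: a list of [token, node_ip] pairs.
    """
    for move in moves:
        if len(move) != 2:
            raise ValueError(f"expected [token, node_ip], got {move!r}")
    return urllib.request.Request(
        f"{base_url}/{cluster_id}/ops/rebalance",
        data=json.dumps(moves).encode("utf-8"),
        method="POST",
    )


# The GET example output above, decoded from JSON.
proposed = json.loads(
    '[ [ "85070591730234615865843651857942052864", "10.11.12.152" ] ]'
)
req = build_rebalance_request("Test_Cluster", proposed)
```

Keeping tokens as strings, as the API returns them, avoids any numeric precision concerns with such large values.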
Cluster Services
- GET /{cluster_id}/services
Get the status of cluster services.
Returns a dictionary with service names as keys and the status, parameters, and associated activity or progress of the service as the values.
Example
curl "http://localhost:8888/Test_Cluster/services"
{ "repair": { "progress": { "completed": 26, "total": 256 }, "status": { "parameters": { "time_to_completion": 100000 }, "status": true } } }
Cluster Repair Service
- POST /{cluster_id}/services/repair
Start the cluster repair service with the given parameters.
Body: A dictionary of repair service parameters.
- time_to_completion: The time in seconds to complete a repair cycle of the entire cluster. For example, 864000 (10 days).
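Since time_to_completion is expressed in seconds, converting from days in code avoids arithmetic slips. The following Python sketch builds the start request for a cycle length given in days. The helper name and base URL are assumptions for illustration, and nothing is sent.

```python
import json
import urllib.request

SECONDS_PER_DAY = 24 * 60 * 60


def build_start_repair_service_request(cluster_id, days,
                                       base_url="http://127.0.0.1:8888"):
    """Build (but do not send) a POST /{cluster_id}/services/repair request.

    Hypothetical helper; `days` is converted to the seconds value that
    time_to_completion expects.
    """
    if days <= 0:
        raise ValueError("days must be positive")
    body = {"time_to_completion": int(days * SECONDS_PER_DAY)}
    return urllib.request.Request(
        f"{base_url}/{cluster_id}/services/repair",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
    )


# Ten days, matching the 864000-second example above.
req = build_start_repair_service_request("Test_Cluster", 10)
```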
- DELETE /{cluster_id}/services/repair
Stop the cluster repair service.
- GET /{cluster_id}/services/repair
Get the status of the repair service.
Returns a dictionary describing the status and parameters of the service.
Example
curl "http://127.0.0.1:8888/Test_Cluster/services/repair"
{ "status": true, "parameters": {"time_to_completion": 100000} }
- GET /{cluster_id}/repair-status
Get a status summary of the repair service progress.
Returns a progress summary for the current repair cycle, including totals for pending, in-progress, failed, and completed repairs.
Example
curl "http://127.0.0.1:8888/Test_Cluster/repair-status"
{ "config": { "cluster_stabilization_period": "30", "error_logging_window": "86400", "ignore_keyspaces": "" }, "status": "active", "time_to_completion": 777600, "overview": { "completed": 36, "failed": 0, "in_progress": 1, "remaining": 19, "repair_times": { "50": 1, "75": 1, "90": 1, "99": 5, "average": 1.3611111111111112, "max": 7, "min": 1 }, "total": 56 }, "incremental": { "completed": 8, "completed_bytes": 40000, "estimated_time": 0, "job_state": "success", "last_repair_ts": 0, "remaining": 0, "remaining_bytes": 0, "throughput": 1.0, "throughput_bytes": 5000, "total": 8, "total_bytes": 40000, "ttc_remaining": 777329 }, "subrange": { "completed": 28, "completed_bytes": 445648829, "estimated_time": 190, "job_state": "running", "last_repair_ts": 0, "remaining": 19, "remaining_bytes": 164736194, "throughput": 0.6829268292682927, "throughput_bytes": 11141095, "total": 48, "total_bytes": 610390023, "ttc_remaining": 777329 }, "details": { "OpsCenter.backup_reports": { "attempts": 0, "average_time": 0, "state": { "aborted": 0, "failure": 0, "pending": 4, "running": 0, "success": 0 }, "time": 0, "type": "incremental" } } }
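The overview block is usually the part a monitoring script cares about. As an illustration, this Python sketch reduces a decoded repair-status response to a few headline numbers. The function name summarize_overview and the shape of its return value are assumptions; only the input keys come from the example output above.

```python
def summarize_overview(status):
    """Summarize the "overview" section of a repair-status response.

    Hypothetical helper; `status` is the decoded JSON dictionary.
    """
    overview = status["overview"]
    total = overview["total"]
    # Guard against a cycle with no repairs scheduled yet.
    percent = 100.0 * overview["completed"] / total if total else 0.0
    return {
        "state": status.get("status"),
        "percent_complete": percent,
        "failed": overview["failed"],
        "outstanding": overview["in_progress"] + overview["remaining"],
    }


# Values taken from the example output above.
sample = {
    "status": "active",
    "overview": {"completed": 36, "failed": 0, "in_progress": 1,
                 "remaining": 19, "total": 56},
}
summary = summarize_overview(sample)
```

For the sample values, 36 of 56 repairs are complete, which is roughly 64 percent, with 20 repairs still outstanding.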
- GET /{cluster_id}/repair-details
Get a detailed list of the current cycle's repairs.
Opt. params:
- keyspace – Limits results to only the specified keyspace.
- table – Limits results to only the specified table.
Returns a detailed list of every repair and its present status in the current repair cycle.
Example
curl "http://127.0.0.1:8888/Test_Cluster/repair-details?keyspace=myks&table=mytable"
[ { "attempts": 0, "executing": false, "ksname": "blackhat", "last_error": "", "node": "127.0.0.4", "repair_range": [ "0", "4611686018427387904" ], "size": 281480, "start_ts": 1492099955.84, "tables": [ "cc" ], "time": 1, "type": "subrange" }, { "attempts": 0, "executing": false, "ksname": "OpsCenter", "last_error": "", "node": "127.0.0.2", "size": 5000, "start_ts": 1492099102.985, "table": "settings", "time": 1, "type": "incremental" } ]
Request Management Methods
- Request
A Request is how OpsCenter tracks a potentially long-running operation that must be completed asynchronously. When such an API call is made, opscenterd immediately returns a Request ID that can be used to look up the status of the request.
Once a Request is started, you can fetch its status information until opscenterd is restarted or a large number of Requests have been started.
A Request status takes the following form:
{ "id": ID, "state": STATE, "started": STARTED, "finished": FINISHED, "cluster_id": CLUSTER_ID, "details": DETAILS }
Data:
- ID (string) – The unique UUID for this Request. When an operation is potentially long-running, opscenterd will return this ID immediately.
- STATE (string) – Either "running", "success", or "error".
- STARTED (int) – A Unix timestamp representing when the Request started.
- FINISHED (int) – A Unix timestamp representing when the Request finished, or null if it has not finished yet.
- CLUSTER_ID (string) – The name of the cluster that the Request is operating on.
- DETAILS – Typically a string containing a status or error message, but may be a dictionary in the form {<subrequest_id>: <Request>} when the Request holds a collection of subrequests.
Content Types:
- JSON
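A typical client polls the status until the state leaves "running". The Python sketch below shows one way to do that. It is not part of OpsCenter: wait_for_request is a hypothetical helper, and fetch_status is a caller-supplied function (an assumption) that returns the decoded status dictionary for a Request ID, for example by calling GET /request/{request_id}/status. Some responses carry the state under a "status" key instead of "state", so the sketch checks both.

```python
import time


def wait_for_request(fetch_status, request_id, interval=5.0, timeout=600.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll a Request until it is no longer running, then return its status.

    fetch_status(request_id) must return the decoded status dictionary.
    Raises TimeoutError if the Request is still running after `timeout`
    seconds. `sleep` and `clock` are injectable to simplify testing.
    """
    deadline = clock() + timeout
    while True:
        status = fetch_status(request_id)
        # Accept either key name for the state field.
        state = status.get("state", status.get("status"))
        if state != "running":
            return status
        if clock() >= deadline:
            raise TimeoutError(f"request {request_id} still running")
        sleep(interval)


# Demonstration with canned responses instead of real HTTP calls.
canned = iter([
    {"id": "abc", "state": "running", "finished": None},
    {"id": "abc", "state": "success", "finished": 1334856200},
])
result = wait_for_request(lambda request_id: next(canned), "abc",
                          interval=0, sleep=lambda seconds: None)
```

Injecting fetch_status keeps the polling logic independent of any particular HTTP library and makes it straightforward to test.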
- GET /request/{request_id}/status
Check the status of an asynchronous request sent to OpsCenter.
Path arguments:
- request_id – The ID returned by the API call that triggered the request.
Returns a dictionary describing the status of the request.
Example
curl http://127.0.0.1:8888/request/6b6b15aa-df8a-43f1-aab3-efce6b8589e4/status
{ "status": "running", "started": 1334856122, "error_message": null, "finished": null, "moves": [ { "status": null, "ip": "10.100.100.100", "old": "2", "new": "85070591730234615865843651857942052864" } ], "id": "6b6b15aa-df8a-43f1-aab3-efce6b8589e4" }
- POST /request/{request_id}/cancel
Cancel an asynchronous request sent to OpsCenter. Not all requests can be cancelled.
Path arguments:
- request_id – The ID returned by the API call that triggered the request.
Returns null.
Example
curl -X POST http://127.0.0.1:8888/request/6b6b15aa-df8a-43f1-aab3-efce6b8589e4/cancel
The request is canceled.
- GET /{cluster_id}/request/{request_type}
List requests of a particular type. By default, only the latest request of that type is returned.
Path arguments:
- cluster_id – The ID of a cluster returned from GET /cluster-configs.
- request_type – Either "rolling-restart", "restore", or "bulk-operations".
Query params:
- list_all – A boolean (0 or 1) indicating whether all of the requests should be returned or just the latest. Default is 0 (false).
Returns a Request ID. If list_all is true, returns an array of Request IDs.
Example
curl -X GET http://127.0.0.1:8888/Test_Cluster/request/rolling-restart
"8f4f71e7-65d3-41a7-bb1a-789af07dbd73"
curl -X GET http://127.0.0.1:8888/Test_Cluster/request/rolling-restart?list_all=1
[ "35dd37a5-4170-4694-9253-faa9532d47b6", "8f4f71e7-65d3-41a7-bb1a-789af07dbd73" ]