Managing Events and Alerts
Using these methods, you can get information about log events, such as node compactions and repairs triggered through OpsCenter, and configure alert thresholds for a number of database metrics.
Event and Alert Methods | URL |
---|---|
Retrieve OpsCenter events. | GET /{cluster_id}/events |
Alert Methods | |
Retrieve configured alert rules. | GET /{cluster_id}/alert-rules |
Retrieve a specific alert rule. | GET /{cluster_id}/alert-rules/{alert_id} |
Create a new alert rule. | POST /{cluster_id}/alert-rules |
Update an alert rule. | PUT /{cluster_id}/alert-rules/{alert_id} |
Delete an alert rule. | DELETE /{cluster_id}/alert-rules/{alert_id} |
Retrieve active alerts. | GET /{cluster_id}/alerts/fired |
Event Methods
- GET /{cluster_id}/events
Retrieve historical events logged by OpsCenter.
Path arguments:
- cluster_id – The ID of a cluster returned from GET /cluster-configs.
Query params:
- count – The number of events to return. Defaults to 10.
- timestamp – A timestamp specifying the point in time to start retrieving events. Specified as a unix timestamp in microseconds. Defaults to the current time.
- reverse – A boolean (0 or 1) indicating whether to retrieve events in reverse order. Defaults to 1 (true). Events are retrieved starting from the time specified by the timestamp and going backward in time until ‘count’ events are found or there are no more events to retrieve.
Returns a list of dictionaries where each dictionary represents an event. An event dictionary contains properties describing that event.
Example
curl "http://127.0.0.1:8888/Test_Cluster/events?count=1"
Output:
{
  "action": 28,
  "api_source_ip": "192.168.1.12",
  "event_source": "OpsCenter",
  "level": 1,
  "level_str": "INFO",
  "message": "Restarting node 192.168.100.3",
  "source_node": "192.168.100.3",
  "success": null,
  "target_node": null,
  "time": "1334768517145625",
  "user": "joe"
}
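The query parameters described above can also be assembled in code. The following is a minimal Python sketch, not part of the OpsCenter API: the helper name `events_url` and the base URL are illustrative assumptions. It converts a timezone-aware `datetime` into the microsecond unix timestamp that `timestamp` expects and encodes `reverse` as 0 or 1.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional
from urllib.parse import quote, urlencode

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def events_url(base_url: str, cluster_id: str, count: int = 10,
               start: Optional[datetime] = None, reverse: bool = True) -> str:
    """Build a GET /{cluster_id}/events URL (hypothetical helper).

    `start` must be timezone-aware; when omitted, the timestamp parameter
    is left out so the server falls back to the current time.
    """
    params = {"count": count, "reverse": 1 if reverse else 0}
    if start is not None:
        # Exact integer number of microseconds since the unix epoch.
        params["timestamp"] = (start - EPOCH) // timedelta(microseconds=1)
    return f"{base_url}/{quote(cluster_id, safe='')}/events?{urlencode(params)}"
```

For instance, `events_url("http://127.0.0.1:8888", "Test_Cluster", count=1)` produces the same request as the curl example, with the default `reverse=1` spelled out explicitly.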
Alert Methods
- GET /{cluster_id}/alert-rules
Retrieve a list of configured alert rules in OpsCenter.
Path arguments:
- cluster_id – The ID of a cluster returned from GET /cluster-configs.
Returns a list of AlertRule objects.
- AlertRule
{
  "id": <value>,
  "type": <value>,
  "threshold": <value>,
  "comparator": <value>,
  "duration": <value>,
  "notify_interval": <value>,
  "enabled": <value>,
  "metric": <value>,
  "cf": <value>,
  "item": <value>,
  "dc": <value>
}
This table describes the property values of an AlertRule object:
Property | Type | Description of Values |
---|---|---|
id | String | A unique ID that references an alert rule. Use only for retrieving alert rules. |
type | String | The event or metric aggregation that triggers an alert. Accepted values include rolling-avg, cluster-balance, and node-down. This field is not editable. |
threshold | Float | The metric boundary that triggers an alert when the threshold is crossed. Applicable only when the type is rolling-avg. |
comparator | String | Optional. Values are < or >. |
duration | Int | How long (in minutes) the problem continues before firing the alert. |
notify_interval | Int | How often (in minutes) to repeat the alert. Use 0 for a single notification. |
enabled | Int | The state of the alert. Values are 0 (disabled) or 1 (enabled). |
metric | String | A key from the list of metrics. This field is only valid if the type is rolling-avg. |
cf | String | Optional. The table to monitor if the metric property is one of the Table Metrics Keys. |
item | String | Optional. The device to monitor if the metric is one of the Operating System Metrics Keys. |
dc | String | Optional. The name of the data center that contains nodes to be monitored. If omitted, all nodes will be monitored. |
Example
curl http://127.0.0.1:8888/Test_Cluster/alert-rules
Output:
[
  {
    "comparator": ">",
    "dc": "us-east",
    "duration": 1.0,
    "enabled": 1,
    "id": "e0c356c7-62ff-4aa8-9b17-e305f101b69a",
    "metric": "write-latency",
    "notify_interval": 1.0,
    "threshold": 10000.0,
    "type": "rolling-avg"
  },
  ...
]
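The constraints listed in the AlertRule table can be checked on the client before a rule is sent. The sketch below is one possible way to do that in Python. The function name `check_alert_rule` is hypothetical, the check for negative minute values is an assumption rather than a documented rule, and the `type` value itself is not validated because the table lists accepted values only as examples.

```python
from typing import Any, Dict, List


def check_alert_rule(rule: Dict[str, Any]) -> List[str]:
    """Return human-readable problems found in a complete AlertRule-style dict."""
    problems: List[str] = []
    if "enabled" in rule and rule["enabled"] not in (0, 1):
        problems.append("enabled must be 0 or 1")
    if "comparator" in rule and rule["comparator"] not in ("<", ">"):
        problems.append("comparator must be '<' or '>'")
    for key in ("duration", "notify_interval"):
        # Assumption: a negative number of minutes is never meaningful.
        if key in rule and rule[key] < 0:
            problems.append(f"{key} must not be negative")
    if rule.get("type") != "rolling-avg":
        # The table marks these two fields as valid only for rolling-avg.
        for key in ("threshold", "metric"):
            if key in rule:
                problems.append(f"{key} applies only to rolling-avg rules")
    return problems
```

An empty list means none of these checks failed; it does not guarantee that the server will accept the rule.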
- GET /{cluster_id}/alert-rules/{alert_id}
Retrieve a specific alert rule.
Path arguments:
- cluster_id – The ID of a cluster returned from GET /cluster-configs.
- alert_id – A UUID that identifies a specific alert rule and has the value of an id property returned by GET /{cluster_id}/alert-rules.
Returns an AlertRule.
Example
curl http://127.0.0.1:8888/Test_Cluster/alert-rules/e0c356c7-62ff-4aa8-9b17-e305f101b69a
Output:
{
  "comparator": ">",
  "dc": "us-east",
  "duration": 1.0,
  "enabled": 1,
  "id": "e0c356c7-62ff-4aa8-9b17-e305f101b69a",
  "metric": "write-latency",
  "notify_interval": 1.0,
  "threshold": 10000.0,
  "type": "rolling-avg"
}
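Because alert_id is a UUID, a client can reject malformed IDs before making a request. A minimal sketch, assuming a hypothetical helper named `alert_rule_url`; it relies only on Python's standard `uuid` and `urllib.parse` modules.

```python
import uuid
from urllib.parse import quote


def alert_rule_url(base_url: str, cluster_id: str, alert_id: str) -> str:
    """Build the URL of a single alert rule (hypothetical helper)."""
    # uuid.UUID raises ValueError when alert_id is not a well-formed UUID;
    # str() then yields the canonical lowercase, hyphenated form.
    canonical_id = str(uuid.UUID(alert_id))
    return f"{base_url}/{quote(cluster_id, safe='')}/alert-rules/{canonical_id}"
```

The same URL is used by the GET, PUT, and DELETE methods for a specific alert rule.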
- POST /{cluster_id}/alert-rules
Create a new alert rule.
Path arguments:
- cluster_id – The ID of a cluster returned from GET /cluster-configs.
Body: A dictionary in the format of AlertRule describing the alert to create.
Responses: 201 – Alert rule was created successfully
Returns the ID of the newly created alert.
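The request described above could be prepared from Python roughly as follows. This is a sketch under stated assumptions, not an official client: the helper names `build_create_rule_request` and `parse_created_id` are hypothetical, and the code only constructs the request and decodes a response body without sending anything.

```python
import json
import urllib.request
from typing import Any, Dict
from urllib.parse import quote


def build_create_rule_request(base_url: str, cluster_id: str,
                              rule: Dict[str, Any]) -> urllib.request.Request:
    """Prepare a POST /{cluster_id}/alert-rules request (not sent here)."""
    url = f"{base_url}/{quote(cluster_id, safe='')}/alert-rules"
    body = json.dumps(rule).encode("utf-8")
    # No Content-Type header is set explicitly, mirroring a plain `curl -d` call.
    return urllib.request.Request(url, data=body, method="POST")


def parse_created_id(response_body: bytes) -> str:
    """Decode the response body, a JSON string holding the new rule's ID."""
    return json.loads(response_body)
```

Passing the prepared request to `urllib.request.urlopen` would send it; a 201 status then indicates that the rule was created.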
Example:
curl -X POST http://127.0.0.1:8888/Test_Cluster/alert-rules -d '{ "comparator": ">", "dc": "", "duration": 60.0, "enabled": 1, "metric": "heap-used", "notify_interval": 5.0, "threshold": 6291456000.0, "type": "rolling-avg" }'
Output:
"b375fd3e-3908-4be5-ae37-d8f3b8699a9f"
- PUT /{cluster_id}/alert-rules/{alert_id}
Update an existing alert rule.
Path arguments:
- cluster_id – The ID of a cluster returned from GET /cluster-configs.
- alert_id – A UUID that identifies a specific alert rule and has the value of an id property returned by GET /{cluster_id}/alert-rules.
Body: A dictionary of fields from AlertRule to update.
Responses: 200 – Alert rule updated successfully
Example:
curl -X PUT http://127.0.0.1:8888/Test_Cluster/alert-rules/b375fd3e-3908-4be5-ae37-d8f3b8699a9f -d '{"duration": 120.0}'
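Since the body only needs the fields being updated, a client can compute that subset from the rule it previously retrieved. The sketch below shows one way to do so; the helper names are hypothetical, and treating `id` as read-only alongside the non-editable `type` is an assumption based on the AlertRule table's note that `id` is used only for retrieval.

```python
import json
import urllib.request
from typing import Any, Dict

# Assumption: "id" is treated as read-only; "type" is documented as not editable.
READ_ONLY_FIELDS = {"id", "type"}


def changed_fields(current: Dict[str, Any], desired: Dict[str, Any]) -> Dict[str, Any]:
    """Return only the editable fields whose values differ from `current`."""
    return {key: value for key, value in desired.items()
            if key not in READ_ONLY_FIELDS and current.get(key) != value}


def build_update_request(rule_url: str, current: Dict[str, Any],
                         desired: Dict[str, Any]) -> urllib.request.Request:
    """Prepare a PUT request carrying only the changed fields (not sent here)."""
    body = json.dumps(changed_fields(current, desired)).encode("utf-8")
    return urllib.request.Request(rule_url, data=body, method="PUT")
```

With a current duration of 60.0 and a desired duration of 120.0, the prepared body is `{"duration": 120.0}`, matching the curl example.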
- DELETE /{cluster_id}/alert-rules/{alert_id}
Delete an existing alert rule.
Path arguments:
- cluster_id – The ID of a cluster returned from GET /cluster-configs.
- alert_id – A UUID that identifies a specific alert rule and has the value of an id property returned by GET /{cluster_id}/alert-rules.
Responses: 200 – Alert rule removed successfully
Example:
curl -X DELETE http://127.0.0.1:8888/Test_Cluster/alert-rules/b375fd3e-3908-4be5-ae37-d8f3b8699a9f
- GET /{cluster_id}/alerts/fired
Retrieve all alerts that have fired and are currently active.
Path arguments:
- cluster_id – The ID of a cluster returned from GET /cluster-configs.
Returns a list of alerts that have been triggered. Each item in the list is a dictionary describing the triggered alert.
Example:
curl http://127.0.0.1:8888/Test_Cluster/alerts/fired
Output:
[
  {
    "alert_rule_id": "ca4cf071-03bd-486a-a8be-428e6cd7218a",
    "current_value": 31676303.333333332,
    "first_fired": 1336669233,
    "node": "10.11.12.150"
  },
  {
    "alert_rule_id": "ca4cf071-03bd-486a-a8be-428e6cd7218a",
    "current_value": 28380117.5,
    "first_fired": 1336669233,
    "node": "10.11.12.152"
  }
]
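A response like the one above is often easier to read when grouped by rule. The following Python sketch does that; the function name `group_fired_alerts` is hypothetical, and interpreting `first_fired` as a unix timestamp in seconds is an assumption drawn from the magnitude of the example values, not something the method description states.

```python
import json
from collections import defaultdict
from datetime import datetime, timezone
from typing import Dict, List, Tuple

FiredAlert = Tuple[str, float, datetime]


def group_fired_alerts(response_body: str) -> Dict[str, List[FiredAlert]]:
    """Map each alert_rule_id to (node, current_value, first_fired) tuples."""
    grouped: Dict[str, List[FiredAlert]] = defaultdict(list)
    for alert in json.loads(response_body):
        # Assumption: first_fired counts seconds since the unix epoch.
        fired_at = datetime.fromtimestamp(alert["first_fired"], tz=timezone.utc)
        grouped[alert["alert_rule_id"]].append(
            (alert["node"], alert["current_value"], fired_at))
    return dict(grouped)
```

Applied to the example output, this yields a single key, the rule ID `ca4cf071-03bd-486a-a8be-428e6cd7218a`, with one entry per affected node.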