Managing Events and Alerts
Using these methods, you can get information about log events, such as node compactions and repairs triggered through OpsCenter, and configure alert thresholds for a number of database metrics.
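Event and Alert Methods | URL |
---|---|
Retrieve OpsCenter events. | GET /{cluster_id}/events |
Retrieve configured alert rules. | GET /{cluster_id}/alert-rules |
Retrieve a specific alert rule. | GET /{cluster_id}/alert-rules/{alert_id} |
Create a new alert rule. | POST /{cluster_id}/alert-rules |
Update an alert rule. | PUT /{cluster_id}/alert-rules/{alert_id} |
Delete an alert rule. | DELETE /{cluster_id}/alert-rules/{alert_id} |
Retrieve active alerts. | GET /{cluster_id}/alerts/fired |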
Event Methods
GET /{cluster_id}/events
Retrieve historical events logged by OpsCenter.
Path arguments:
- cluster_id: The ID of a cluster returned from GET /cluster-configs.
Query params:
- count: The number of events to return. Defaults to 10.
- timestamp: A timestamp specifying the point in time to start retrieving events. Specified as a unix timestamp in microseconds. Defaults to the current time.
- reverse: A boolean (0 or 1) indicating whether to retrieve events in reverse order. Defaults to 1 (true). Events are retrieved starting from the time specified by the timestamp and going backward in time until count events are found or there are no more events to retrieve.
Returns a list of dictionaries where each dictionary represents an event. An event dictionary contains properties describing that event.
Example:
curl 'http://127.0.0.1:8888/Test_Cluster/events?count=1'
Output:
{
"action": 28,
"api_source_ip": "192.168.1.12",
"event_source": "OpsCenter",
"level": 1,
"level_str": "INFO",
"message": "Restarting node 192.168.100.3",
"source_node": "192.168.100.3",
"success": null,
"target_node": null,
"time": "1334768517145625",
"user": "joe"
}
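The query parameters and the microsecond `time` field can be handled as in the following minimal Python sketch. It assumes the OpsCenter address used in the curl example; `BASE_URL`, `events_url`, and `event_time` are illustrative names, not part of the OpsCenter API.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

# Assumption: OpsCenter listens at the address used in the curl example.
BASE_URL = "http://127.0.0.1:8888"


def events_url(cluster_id, count=10, timestamp_us=None, reverse=True):
    """Build the GET /{cluster_id}/events URL with its query parameters."""
    params = {"count": count, "reverse": 1 if reverse else 0}
    if timestamp_us is not None:
        # Unix timestamp in microseconds, as the API expects.
        params["timestamp"] = timestamp_us
    return f"{BASE_URL}/{cluster_id}/events?{urlencode(params)}"


def event_time(event):
    """Convert an event's "time" value (microseconds, as a string) to UTC."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    # Integer arithmetic avoids floating-point rounding of the microseconds.
    return epoch + timedelta(microseconds=int(event["time"]))


print(events_url("Test_Cluster", count=1))
print(event_time({"time": "1334768517145625"}).isoformat())
```

The sketch only builds the URL and parses the timestamp; fetching the events is left to whichever HTTP client you already use.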
Alert Methods
GET /{cluster_id}/alert-rules
Retrieve a list of configured alert rules in OpsCenter.
Path arguments:
- cluster_id: The ID of a cluster returned from GET /cluster-configs.
Returns a list of AlertRule objects.
AlertRule
{
"id": <value>,
"type": <value>,
"threshold": <value>,
"comparator": <value>,
"duration": <value>,
"notify_interval": <value>,
"enabled": <value>,
"metric": <value>,
"cf": <value>,
"item": <value>,
"dc": <value>
}
This table describes the property values of an AlertRule object:
Property | Type | Description of Values |
---|---|---|
id | String | A unique ID that references an alert rule. Use only for retrieving alert rules. |
type | String | The event or metric aggregation that triggers an alert. Accepted values include rolling-avg, cluster-balance, and node-down. This field is not editable. |
threshold | Float | The metric boundary that triggers an alert when the threshold is crossed. Applicable only when the type is |
comparator | String (optional) | Values are |
duration | Int | How long (in minutes) the problem continues before firing the alert. |
notify_interval | Int | How often (in minutes) to repeat the alert. Use |
enabled | Int | The state of the alert. Values are |
metric | String | A key from |
cf | String (optional) | The table to monitor if the metric property is one of the |
item | String (optional) | The device to monitor if the metric is one of the |
dc | String (optional) | The name of the data center that contains nodes to be monitored. If omitted, all nodes are monitored. |
Example:
curl http://127.0.0.1:8888/Test_Cluster/alert-rules
Output:
[
{
"comparator": ">",
"dc": "us-east",
"duration": 1.0,
"enabled": 1,
"id": "e0c356c7-62ff-4aa8-9b17-e305f101b69a",
"metric": "write-latency",
"notify_interval": 1.0,
"threshold": 10000.0,
"type": "rolling-avg"
},
...
]
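Once the list of rules has been retrieved, it can be filtered on the client side. The following is a minimal sketch that assumes the response has already been parsed from JSON into a list of dictionaries; `enabled_rules` is a hypothetical helper, and treating `enabled == 1` as "on" is an assumption based on the example output.

```python
def enabled_rules(rules, metric=None):
    """Return rules whose "enabled" value is 1, optionally for one metric."""
    return [
        rule
        for rule in rules
        # Assumption: 1 means the rule is active, as in the example output.
        if rule.get("enabled") == 1
        and (metric is None or rule.get("metric") == metric)
    ]


# Sample data shaped like the example response.
rules = [
    {"id": "e0c356c7-62ff-4aa8-9b17-e305f101b69a", "enabled": 1,
     "metric": "write-latency", "type": "rolling-avg"},
    {"id": "b375fd3e-3908-4be5-ae37-d8f3b8699a9f", "enabled": 0,
     "metric": "heap-used", "type": "rolling-avg"},
]
print([rule["id"] for rule in enabled_rules(rules)])
```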
GET /{cluster_id}/alert-rules/{alert_id}
Retrieve a specific alert rule.
Path arguments:
- cluster_id: The ID of a cluster returned from GET /cluster-configs.
- alert_id: A UUID that identifies a specific alert rule and has the value of an id property returned by GET /{cluster_id}/alert-rules.
Returns an AlertRule.
Example:
curl http://127.0.0.1:8888/Test_Cluster/alert-rules/e0c356c7-62ff-4aa8-9b17-e305f101b69a
Output:
{
"comparator": ">",
"dc": "us-east",
"duration": 1.0,
"enabled": 1,
"id": "e0c356c7-62ff-4aa8-9b17-e305f101b69a",
"metric": "write-latency",
"notify_interval": 1.0,
"threshold": 10000.0,
"type": "rolling-avg"
}
POST /{cluster_id}/alert-rules
Create a new alert rule.
Path arguments:
- cluster_id: The ID of a cluster returned from GET /cluster-configs.
Body: A dictionary in the format of AlertRule describing the alert to create.
Response: 201: Alert rule was created successfully.
Returns the ID of the newly created alert.
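For callers working in Python rather than curl, the request can be assembled with the standard library as in the sketch below. `BASE_URL` and `create_alert_rule_request` are illustrative names, and the `Content-Type` header is an assumption, since this reference does not state which content type OpsCenter expects.

```python
import json
from urllib.request import Request

# Assumption: OpsCenter listens at the address used in the curl examples.
BASE_URL = "http://127.0.0.1:8888"


def create_alert_rule_request(cluster_id, rule):
    """Build (but do not send) a POST /{cluster_id}/alert-rules request."""
    return Request(
        f"{BASE_URL}/{cluster_id}/alert-rules",
        data=json.dumps(rule).encode("utf-8"),
        method="POST",
        # Assumption: the content type is not specified in this reference.
        headers={"Content-Type": "application/json"},
    )


request = create_alert_rule_request(
    "Test_Cluster",
    {"type": "rolling-avg", "metric": "heap-used", "comparator": ">",
     "threshold": 6291456000.0, "duration": 60.0, "notify_interval": 5.0,
     "enabled": 1},
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; on success the response body should contain the new rule's ID as a JSON string.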
Example:
curl -X POST \
  http://127.0.0.1:8888/Test_Cluster/alert-rules \
  -d '{
"comparator": ">",
"dc": "",
"duration": 60.0,
"enabled": 1,
"metric": "heap-used",
"notify_interval": 5.0,
"threshold": 6291456000.0,
"type": "rolling-avg"
}'
Output:
"b375fd3e-3908-4be5-ae37-d8f3b8699a9f"
PUT /{cluster_id}/alert-rules/{alert_id}
Update an existing alert rule.
Path arguments:
- cluster_id: The ID of a cluster returned from GET /cluster-configs.
- alert_id: A UUID that identifies a specific alert rule and has the value of an id property returned by GET /{cluster_id}/alert-rules.
Body: A dictionary of fields from AlertRule to update.
Response: 200: Alert rule updated successfully.
Example:
curl -X PUT \
  http://127.0.0.1:8888/Test_Cluster/alert-rules/b375fd3e-3908-4be5-ae37-d8f3b8699a9f \
  -d '{"duration": 120.0}'
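A partial update can be prepared in Python along these lines. This is a sketch only: `update_alert_rule_request` is a hypothetical helper, the `Content-Type` header is an assumption, and rejecting `id` and `type` on the client side simply mirrors the property table, which describes `type` as not editable and `id` as used only for retrieval.

```python
import json
from urllib.request import Request

# Assumption: OpsCenter listens at the address used in the curl examples.
BASE_URL = "http://127.0.0.1:8888"

# Per the property table: "type" is not editable, "id" is for retrieval only.
READ_ONLY_FIELDS = {"id", "type"}


def update_alert_rule_request(cluster_id, alert_id, changes):
    """Build (but do not send) a PUT request carrying only the changed fields."""
    blocked = READ_ONLY_FIELDS & set(changes)
    if blocked:
        raise ValueError(f"fields cannot be updated: {sorted(blocked)}")
    return Request(
        f"{BASE_URL}/{cluster_id}/alert-rules/{alert_id}",
        data=json.dumps(changes).encode("utf-8"),
        method="PUT",
        # Assumption: the content type is not specified in this reference.
        headers={"Content-Type": "application/json"},
    )


request = update_alert_rule_request(
    "Test_Cluster", "b375fd3e-3908-4be5-ae37-d8f3b8699a9f", {"duration": 120.0}
)
print(request.get_method(), request.full_url)
```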
DELETE /{cluster_id}/alert-rules/{alert_id}
Delete an existing alert rule.
Path arguments:
- cluster_id: The ID of a cluster returned from GET /cluster-configs.
- alert_id: A UUID that identifies a specific alert rule and has the value of an id property returned by GET /{cluster_id}/alert-rules.
Response: 200: Alert rule removed successfully.
Example:
curl -X DELETE \
  http://127.0.0.1:8888/Test_Cluster/alert-rules/b375fd3e-3908-4be5-ae37-d8f3b8699a9f
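Because alert_id must be a UUID, it can be worth validating the value before issuing a destructive call. The sketch below uses Python's standard `uuid` module for that check; `delete_alert_rule_request` and `BASE_URL` are illustrative names, not part of OpsCenter.

```python
import uuid
from urllib.request import Request

# Assumption: OpsCenter listens at the address used in the curl examples.
BASE_URL = "http://127.0.0.1:8888"


def delete_alert_rule_request(cluster_id, alert_id):
    """Build (but do not send) a DELETE request for a validated alert_id."""
    # uuid.UUID raises ValueError for malformed input, such as an ID
    # containing a stray space.
    rule_id = str(uuid.UUID(alert_id))
    return Request(
        f"{BASE_URL}/{cluster_id}/alert-rules/{rule_id}", method="DELETE"
    )


request = delete_alert_rule_request(
    "Test_Cluster", "b375fd3e-3908-4be5-ae37-d8f3b8699a9f"
)
print(request.get_method(), request.full_url)
```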
GET /{cluster_id}/alerts/fired
Retrieve all alerts that have fired and are still active.
Path arguments:
- cluster_id: The ID of a cluster returned from GET /cluster-configs.
Returns a list of alerts that have been triggered. Each item in the list is a dictionary describing the triggered alert.
Example:
curl http://127.0.0.1:8888/Test_Cluster/alerts/fired
Output:
[
{
"alert_rule_id": "ca4cf071-03bd-486a-a8be-428e6cd7218a",
"current_value": 31676303.333333332,
"first_fired": 1336669233,
"node": "10.11.12.150"
},
{
"alert_rule_id": "ca4cf071-03bd-486a-a8be-428e6cd7218a",
"current_value": 28380117.5,
"first_fired": 1336669233,
"node": "10.11.12.152"
}
]
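Since one rule can fire on several nodes, as in the output above, it is often convenient to group the result by rule. A minimal sketch, assuming the response has already been parsed into a list of dictionaries; `nodes_by_rule` is a hypothetical helper name.

```python
from collections import defaultdict


def nodes_by_rule(fired_alerts):
    """Map each alert_rule_id to the list of nodes on which it has fired."""
    grouped = defaultdict(list)
    for alert in fired_alerts:
        grouped[alert["alert_rule_id"]].append(alert["node"])
    return dict(grouped)


# Sample data taken from the example output.
fired = [
    {"alert_rule_id": "ca4cf071-03bd-486a-a8be-428e6cd7218a",
     "current_value": 31676303.333333332, "first_fired": 1336669233,
     "node": "10.11.12.150"},
    {"alert_rule_id": "ca4cf071-03bd-486a-a8be-428e6cd7218a",
     "current_value": 28380117.5, "first_fired": 1336669233,
     "node": "10.11.12.152"},
]
print(nodes_by_rule(fired))
```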