Export Astra DB metrics to a third party
Enterprises depend on the ability to view database health metrics in centralized systems along with their other software metrics. The Astra DB Metrics feature lets you forward Astra DB database health metrics to an external third-party metrics system. We refer to the recipient of the exported metrics as the destination system.
The functionality provided by the Astra DB Metrics feature is often referred to as:
- Observability
- External monitoring
- Third-party metrics
- Prometheus monitoring integration
At this time, Astra DB Metrics supports exporting health metrics from Astra DB serverless databases to Prometheus. You can also use Grafana or Grafana Cloud as a visualization tool.
You’ll configure the export of Astra DB health metrics in the payload of the following DevOps v2 API call:
POST /v2/databases/{databaseId}/telemetry/metrics
Benefits
The Astra DB Metrics feature allows you to take full control of forwarding Astra DB database health metrics to your preferred observability system. The functionality is intended for developers, site reliability engineers (SREs), IT managers, and product owners.
Ingesting database health metrics into your system gives you the ability to craft your own alerting actions and dashboards based on your service level objectives and retention requirements. While you can continue to view metrics displayed in Astra DB console via each database’s Health tab, forwarding metrics to a third-party app gives you a more complete view of all metrics being tracked, across all your products.
This enhanced capability can provide your team with broader insights into historical performance, issues, and areas for improvement.
The exported Astra DB health metrics are nearly real-time when consumed externally. You can find the source-of-truth view of your metric values in the Astra DB console’s Health dashboard.
Prerequisites
- If you haven’t already, create a serverless database using the Astra DB console.
  Keep track of your databaseId. You’ll specify it in the DevOps POST API call for /v2/databases/{databaseId}/telemetry/metrics. You can find the databaseId on the Astra DB console’s dashboard.
- Generate an application token so you can authenticate your account in the DevOps API.
  If you don’t have a current token, see Manage application tokens.
  When using the DevOps API, pass the token’s value in the call’s Authorization header.
- Ensure you have permission to use the DevOps v2 API for enabling third-party metrics. See Roles and permissions in this topic.
You’ll need an existing destination system to receive the forwarded metrics. Currently, Prometheus and Grafana / Grafana Cloud are supported.
Pricing
With an Astra DB pay-as-you-go (PAYG) or Enterprise plan, there is no additional cost for using Astra DB Metrics, outside of standard data transfer charges. Exporting third-party metrics is not available on the Astra DB Free Tier.
Metrics monitoring may incur costs at the destination system. Consult the destination system’s documentation for its pricing information.
Roles and permissions
The following Astra DB roles can export third-party metrics:
- Organization Administrator (recommended)
- Database Administrator
- Service Account Administrator
- User Administrator
The required db-manage-thirdpartymetrics permission is automatically assigned to those roles.
If you create a custom role in Astra DB, be sure to assign the db-manage-thirdpartymetrics permission to the custom role.
Prometheus setup
For information about setting up Prometheus as the destination of the forwarded Astra DB database metrics, see the Prometheus Getting Started documentation.
After completing those steps in your Prometheus environment, verify the setup by sending a POST request to the remote write endpoint. For an example test client, which also verifies that ingress is set up properly, see promremote, a Prometheus remote write client written in Go.
Database metrics forwarded by Astra DB
Here’s a list of database metrics forwarded by the Astra DB Metrics feature.
- rate_limited_requests_total: A counter of the number of operations that failed due to an Astra DB rate limit. You can request that rate limits be increased for your Astra DB databases. Take a rate over a window, such as 5 minutes (5m), and alert if the value is > 0.
- read_requests_failures_total: A counter of the number of reads that failed. Cassandra drivers retry failed operations, but significant failures can be problematic. Take a rate, such as 5m, and alert if the value is > 0. Use a Warn alert for low amounts and a High alert for larger amounts; consider setting the threshold as a percentage of read throughput.
- read_requests_timeouts_total: Timeouts happen when operations against the database take longer than the server-side timeout. Take a rate, such as 5m, and alert if the value is > 0.
- read_requests_unavailables_total: Unavailable errors occur when the service is not available to complete a specific request. Take a rate, such as 5m, and alert if the value is > 0.
- write_requests_failures_total: A counter of the number of writes that failed. Cassandra drivers retry failed operations, but significant failures can be problematic. Take a rate, such as 5m, and alert if the value is > 0. Use a Warn alert for low amounts and a High alert for larger amounts; consider setting the threshold as a percentage of write throughput.
- write_requests_timeouts_total: Timeouts occur when operations take longer than the server-side timeout. Take a rate, such as 5m, and compare with write_requests_failures_total.
- write_requests_unavailables_total: Unavailable errors occur when the service is not available to service a particular request. Take a rate, such as 5m, and compare with write_requests_failures_total.
- range_requests_failures_total: A counter of the number of range reads that failed. Cassandra drivers retry failed operations, but significant failures can be problematic. Take a rate, such as 5m, and alert if the value is > 0. Use a Warn alert for low amounts and a High alert for larger amounts; consider setting the threshold as a percentage of range read throughput.
- range_requests_timeouts_total: Timeouts are a subset of total failures. Use this metric to understand if failures are due to timeouts. Take a rate, such as 5m, and compare with range_requests_failures_total.
- range_requests_unavailables_total: Unavailable errors are a subset of total failures. Use this metric to understand if failures are due to unavailable errors. Take a rate, such as 5m, and compare with range_requests_failures_total.
- write_latency_seconds_count: Take the rate for write throughput. Alert based on your application’s Service Level Objective (business requirement).
- write_latency_seconds_bucket: Take percentiles for write latency. Alert based on your application’s Service Level Objective (business requirement).
- write_requests_mutation_size_bytes_bucket: Take percentiles to see how big your writes are over time.
- read_latency_seconds_count: Take the rate for read throughput. Alert based on your application’s Service Level Objective (business requirement).
- read_latency_seconds_bucket: Take percentiles for read latency. Alert based on your application’s Service Level Objective (business requirement).
- range_latency_seconds_count: Take the rate for range read throughput. Alert based on your application’s Service Level Objective (business requirement).
- range_latency_seconds_bucket: Take percentiles for range read latency. Alert based on your application’s Service Level Objective (business requirement).
For more information about Prometheus metric types, see the Prometheus documentation.
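The "take a rate and alert" guidance in the list above can be illustrated with a small Python sketch. It is a simplification, not how Prometheus itself evaluates rate(): it computes a per-second increase between two counter samples and then classifies a failure rate against throughput. The function names, the sample values, and the high_ratio threshold are hypothetical; choose thresholds that match your own service level objectives.

```python
def counter_rate(prev_value, curr_value, elapsed_seconds):
    """Per-second increase of a counter between two samples."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    # A counter that went down was reset; count from zero after the reset.
    increase = curr_value - prev_value if curr_value >= prev_value else curr_value
    return increase / elapsed_seconds


def classify_failures(failure_rate, throughput_rate, high_ratio=0.01):
    """Return "ok", "warn", or "high" for a failure rate.

    high_ratio is a hypothetical threshold: the share of throughput at
    which failures are treated as "high" rather than "warn".
    """
    if failure_rate <= 0:
        return "ok"
    if throughput_rate > 0 and failure_rate / throughput_rate >= high_ratio:
        return "high"
    return "warn"


WINDOW_SECONDS = 300  # a 5m window

# Hypothetical samples taken five minutes apart, for example from
# read_requests_failures_total and read_latency_seconds_count.
failures = counter_rate(120, 126, WINDOW_SECONDS)
throughput = counter_rate(50_000, 80_000, WINDOW_SECONDS)
print(classify_failures(failures, throughput))
```

In a real deployment you would normally express the same idea as an alerting rule in the destination system rather than in application code; the sketch only shows the arithmetic behind the guidance.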
POST configuration payload
The configuration payload (JSON) for the POST /v2/databases/{databaseId}/telemetry/metrics call depends on which destination you’ll use. Currently, Prometheus remote_write is supported.
To ensure that metrics are enabled for your destination app, provide the relevant properties.
Prometheus
With a required top-level key of prometheus_remote, the POST payload has the following structure:
- prometheus_remote
  - endpoint
  - auth_strategy
  - token
  - user
  - password
For auth_strategy, specify basic or bearer, depending on your Prometheus remote_write auth type.
- If you specified "auth_strategy": "bearer", provide your Prometheus token. Do not include user or password in the POST request payload.
- If you specified "auth_strategy": "basic", provide your Prometheus user and password. Do not include token.
Example payloads:
{
  "prometheus_remote": {
    "endpoint": "https://prometheus.example.com/api/prom/push",
    "auth_strategy": "bearer",
    "token": "lSAYp9oLtdAa9ajasoNNS999"
  }
}
Or:
{
  "prometheus_remote": {
    "endpoint": "https://prometheus.example.com/api/prom/push",
    "auth_strategy": "basic",
    "password": "myPromPassword",
    "user": "myPromUsername"
  }
}
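As a rough illustration of the auth_strategy rules above, the following Python sketch builds the payload and rejects combinations that this topic says to avoid. The function name build_prometheus_remote_payload is hypothetical and is not part of any Astra DB client library; the validation reflects only the rules stated in this section.

```python
import json


def build_prometheus_remote_payload(endpoint, auth_strategy, token=None,
                                    user=None, password=None):
    """Build the prometheus_remote payload, enforcing the auth_strategy rules."""
    if not endpoint.startswith(("http://", "https://")):
        raise ValueError("endpoint must be a full HTTP or HTTPS address")
    remote = {"endpoint": endpoint, "auth_strategy": auth_strategy}
    if auth_strategy == "bearer":
        # bearer: token only, no user or password
        if not token or user or password:
            raise ValueError("bearer requires token and no user or password")
        remote["token"] = token
    elif auth_strategy == "basic":
        # basic: user and password only, no token
        if not (user and password) or token:
            raise ValueError("basic requires user and password and no token")
        remote["user"] = user
        remote["password"] = password
    else:
        raise ValueError("auth_strategy must be 'basic' or 'bearer'")
    return {"prometheus_remote": remote}


payload = build_prometheus_remote_payload(
    "https://prometheus.example.com/api/prom/push",
    "bearer",
    token="lSAYp9oLtdAa9ajasoNNS999",
)
print(json.dumps(payload, indent=2))
```

Validating before sending keeps a malformed payload from reaching the API, where it would presumably be rejected with a 400 response.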
In Configure third-party metrics with DevOps API, see the JSON example shown with the POST call’s --data parameter.
Configure third-party metrics with DevOps API
Use the following POST to export metrics to a third-party app such as Prometheus.
POST /v2/databases/{databaseId}/telemetry/metrics
See the following sections for curl examples. If you prefer, use Postman with raw JSON in the body.
- In the --header, use Bearer and specify your application token to authenticate with the DevOps v2 API. If you don’t have a current token, see Manage application tokens.
- Specify your database ID. See the Astra DB console Dashboard for its value. You can define a variable such as $DB_ID, set it to your databaseId value, and then use the variable in a curl command.
- In the request payload, specify the destination’s --data properties. Only specify the properties that are relevant to your environment, based on your Prometheus auth strategy; omit the other properties.
  With a required top-level key of prometheus_remote, the POST payload has the following structure:
  - prometheus_remote
    - endpoint
    - auth_strategy
    - token
    - user
    - password
  For auth_strategy, specify basic or bearer, depending on your Prometheus remote_write auth type.
  - If you specified "auth_strategy": "bearer", provide your Prometheus token. Do not include user or password in the POST request payload.
  - If you specified "auth_strategy": "basic", provide your Prometheus user and password. Do not include token.
Use the following curl template:
# The URL is in double quotes so that the shell expands $DB_ID.
curl --request POST \
  --url "https://api.astra.datastax.com/v2/databases/$DB_ID/telemetry/metrics" \
  --header 'Accept: application/json' \
  --header 'Authorization: Bearer <application_token>' \
  --data '{
    "prometheus_remote": {
      "endpoint": "Enter a full HTTP or HTTPS address and path for the Prometheus endpoint",
      "auth_strategy": "bearer or basic",
      "token": "If auth_strategy is bearer, enter the Prometheus remote write auth token",
      "user": "If auth_strategy is basic, enter the Prometheus username",
      "password": "If auth_strategy is basic, enter the Prometheus password"
    }
  }'
Examples with POST request payload:
curl --request POST \
  --url "https://api.astra.datastax.com/v2/databases/$DB_ID/telemetry/metrics" \
  --header 'Accept: application/json' \
  --header 'Authorization: Bearer <application_token>' \
  --data '{
    "prometheus_remote": {
      "endpoint": "https://prometheus.example.com/api/prom/push",
      "auth_strategy": "bearer",
      "token": "lSAYp9oLtdAa9ajasoNNS999"
    }
  }'
Or:
curl --request POST \
  --url "https://api.astra.datastax.com/v2/databases/$DB_ID/telemetry/metrics" \
  --header 'Accept: application/json' \
  --header 'Authorization: Bearer <application_token>' \
  --data '{
    "prometheus_remote": {
      "endpoint": "https://prometheus.example.com/api/prom/push",
      "auth_strategy": "basic",
      "user": "myPromUsername",
      "password": "myPromPassword"
    }
  }'
On success, the call returns:
202 Accepted
Otherwise, it returns one of the following:
- 400 Bad request.
- 403 The user is forbidden to perform the operation.
- 404 The specified resource was not found.
- 500 A server error occurred.
Example error response:
{
  "errors": [
    {
      "description": "The name of the environment must be provided",
      "internalCode": "a1012",
      "internalTxId": "103B-A018-3898-0ABF"
    }
  ]
}
In the JSON template shown earlier in this section, replace the placeholder strings with your values.
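If you would rather script the call than use curl or Postman, the same request can be sketched with Python’s standard library. The names API_BASE and build_metrics_request are hypothetical, and the database ID and token below are placeholders. The explicit Content-Type header is an assumption of this sketch (the body is JSON); the curl examples in this topic do not set it.

```python
import json
import urllib.request

API_BASE = "https://api.astra.datastax.com"


def build_metrics_request(database_id, application_token, payload):
    """Build (but do not send) the POST request for the telemetry endpoint."""
    url = f"{API_BASE}/v2/databases/{database_id}/telemetry/metrics"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Accept": "application/json",
            "Content-Type": "application/json",  # assumption, see lead-in
            "Authorization": f"Bearer {application_token}",
        },
        method="POST",
    )


# Placeholder values; substitute your own database ID and application token.
request = build_metrics_request(
    "your-database-id",
    "your-application-token",
    {
        "prometheus_remote": {
            "endpoint": "https://prometheus.example.com/api/prom/push",
            "auth_strategy": "bearer",
            "token": "lSAYp9oLtdAa9ajasoNNS999",
        }
    },
)
print(request.get_method(), request.full_url)
```

To send the request, pass it to urllib.request.urlopen(request) and check that the response status is 202. Note that urlopen raises urllib.error.HTTPError for 4xx and 5xx responses, so wrap the call in a try/except block if you want to inspect error bodies.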
Get metrics configuration for an Astra DB database
Retrieve third-party metrics configuration for an Astra DB database:
curl --request GET \
  --url "https://api.astra.datastax.com/v2/databases/$DB_ID/telemetry/metrics" \
  --header 'Accept: application/json' \
  --header 'Authorization: Bearer <application_token>'
On success, the call returns:
200 OK
Example:
{
  "prometheus_remote": {
    "endpoint": "https://prometheus.example.com/api/prom/push",
    "auth_strategy": "basic",
    "user": "myPromUsername",
    "password": "myPromPassword"
  }
}
Otherwise, it returns one of the following:
- 400 Bad request.
- 403 The user is forbidden to perform the operation.
- 404 The specified resource was not found.
- 500 A server error occurred.
Example error response:
{
  "errors": [
    {
      "description": "The name of the environment must be provided",
      "internalCode": "a1012",
      "internalTxId": "103B-A018-3898-0ABF"
    }
  ]
}
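When scripting these calls, it can help to turn a status code and error body into a single readable line. The sketch below assumes the error body follows the shape of the example in this topic (an errors list whose entries carry description and internalCode); summarize_response and STATUS_TEXT are hypothetical names, and 202 is labeled with its standard HTTP reason phrase, Accepted.

```python
import json

# Status descriptions as listed in this topic.
STATUS_TEXT = {
    200: "OK",
    202: "Accepted",
    400: "Bad request.",
    403: "The user is forbidden to perform the operation.",
    404: "The specified resource was not found.",
    500: "A server error occurred.",
}


def summarize_response(status, body_text=""):
    """Return a one-line summary of a DevOps API response."""
    summary = f"{status} {STATUS_TEXT.get(status, 'Unexpected status')}"
    if status in (200, 202) or not body_text:
        return summary
    try:
        errors = json.loads(body_text)["errors"]
        details = "; ".join(
            f"{err['description']} (code {err['internalCode']})" for err in errors
        )
    except (ValueError, KeyError, TypeError):
        # Body is not JSON or does not follow the assumed error shape.
        return summary
    return f"{summary} {details}" if details else summary


error_body = """
{
  "errors": [
    {
      "description": "The name of the environment must be provided",
      "internalCode": "a1012",
      "internalTxId": "103B-A018-3898-0ABF"
    }
  ]
}
"""
print(summarize_response(400, error_body))
```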
Visualize exported Astra DB metrics with Grafana Cloud
This section explains how to configure Grafana Cloud to consume Astra DB (serverless) health metrics.
You’ll need a Grafana Cloud account. Grafana Cloud offers a Free plan with 14-day retention. For details, see Grafana pricing.
Initial steps in Grafana Cloud
Complete the following initial steps before submitting the POST /v2/databases/{databaseId}/telemetry/metrics payload described previously in this topic.
- On login to Grafana Cloud, select + Connect data from the home page.
- Select the Custom Prometheus metrics section that includes the Prometheus icon.
- You can accept the default selections, or make edits as needed. Provide a name for the API Key (such as AstraDB_PS) and click Create API Key.
- The config file is generated. Here’s an example; your values will be different:
cat << EOF > ./agent-config.yaml
global:
  scrape_interval: 60s
scrape_configs:
  - job_name: node
    static_configs:
      - targets: ['localhost:9100']
remote_write:
  - url: https://prometheus-prod-10-prod-us-central-0.grafana.net/api/prom/push
    basic_auth:
      username: 412XXX
      password: eyJrIjoiMmE1ZTY4YWRhY2ZmNmZlMjllZmY3ZjczYWQ0NzRiZjNlNTE1NTVkMCIsIm4iOiJBc3RyYURCX1BTIiwiaWQiOjYzOTQXXX=
EOF
DevOps config via Postman & Grafana Cloud followup
To configure and publish metrics from Astra DB using the DevOps API, follow these steps. This example uses Postman with a bearer token configured for authorization.
To publish metrics, create a POST request in Postman:
https://api.astra.datastax.com/v2/databases/{databaseId}/telemetry/metrics
In the Body, set the parameters to the values that you retrieved from Grafana Cloud. Example:
{
  "prometheus_remote": {
    "endpoint": "https://prometheus-prod-10-prod-us-central-0.grafana.net/api/prom/push",
    "auth_strategy": "basic",
    "user": "412XXX",
    "password": "eyJrIjoiMmE1ZTY4YWRhY2ZmNmZlMjllZmY3ZjczYWQ0NzRiZjNlNTE1NTVkMCIsIm4iOiJBc3RyYURCX1BTIiwiaWQiOjYzOTQXXX="
  }
}
The POST response should return a 202 on success.
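The mapping from the generated Grafana Cloud config to the request body can be sketched in Python as follows. The sketch assumes you have already parsed one remote_write entry of the config file into a dictionary (for example, with a YAML library of your choice); payload_from_remote_write is a hypothetical helper name.

```python
def payload_from_remote_write(remote_write_entry):
    """Map one remote_write entry (already parsed from YAML) to the POST body."""
    basic_auth = remote_write_entry["basic_auth"]
    return {
        "prometheus_remote": {
            "endpoint": remote_write_entry["url"],
            "auth_strategy": "basic",
            # A YAML parser may read an all-digit username as an integer.
            "user": str(basic_auth["username"]),
            "password": basic_auth["password"],
        }
    }


# Values taken from the example config file; yours will be different.
entry = {
    "url": "https://prometheus-prod-10-prod-us-central-0.grafana.net/api/prom/push",
    "basic_auth": {"username": "412XXX", "password": "<your API key>"},
}
print(payload_from_remote_write(entry)["prometheus_remote"]["endpoint"])
```

The Grafana Cloud url becomes endpoint, and the basic_auth username and password become user and password, with auth_strategy set to basic.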
Now, switch back to Grafana Cloud:
- Select the option to Create a New Dashboard.
- Select Add a new panel and select the Data Source as grafanacloud-<YourUserId>-prom.
- If configured correctly, you should see the Astra DB Metrics under the Metrics Browser in Grafana Cloud.
- Now you can select the metrics that you want to visualize in Grafana Cloud. The Dashboard panel displays the charts.
Alternative approach: import from Astra DB Health to Grafana Cloud
This alternative approach imports a dashboard from the Astra DB Health tab into your Grafana Cloud instance. You still need to complete the steps listed above. Then continue with the steps below.
- Log in to the Astra DB console.
- Select the database you want to monitor in Grafana Cloud, and navigate to its Health tab.
- Click DSE Cluster Condensed.
- Click the Share icon.
- Select the Export tab.
- Click View JSON and then Copy to Clipboard.
- Make the following edits to the copied JSON.
  Replace all references to coordinator_…{tenant= … } with astra_coordinator and remove the tenant references. For example, in the following expression:
  "expr": "histogram_quantile(.99, sum(rate(coordinator_write_requests_mutation_size_bytes_bucket{tenant='${__user.login}'}[$__rate_interval])) by (le))",
  You would replace that expression with:
  "expr": "histogram_quantile(.99, sum(rate(astra_coordinator_write_requests_mutation_size_bytes_bucket{}[$__rate_interval])) by (le))",
- Now switch over to your Grafana Cloud instance. Click the Create option, and then click Import from the menu.
- Upload or paste in your edited JSON.
- You can change the name.
- Once imported, all your Astra DB health charts will auto-populate in Grafana Cloud.
Now you can use your own Grafana Cloud instance to monitor the Astra DB database’s health via its metrics.
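The JSON edit in the import steps above can be tedious to do by hand across many panels. The following Python sketch applies it with a regular expression. It assumes every tenant selector looks like the one in the example expression, that is, a single tenant label in single quotes; rewrite_expressions is a hypothetical helper, so review the result before importing it.

```python
import re

# Matches coordinator_<metric>{tenant='...'} as in the example expression.
# The leading \b keeps names that are already prefixed with astra_ unchanged.
TENANT_SELECTOR = re.compile(r"\bcoordinator_(\w+)\{tenant='[^']*'\}")


def rewrite_expressions(dashboard_json_text):
    """Prefix coordinator_ metrics with astra_ and drop the tenant selector."""
    return TENANT_SELECTOR.sub(r"astra_coordinator_\1{}", dashboard_json_text)


original = (
    "histogram_quantile(.99, sum(rate("
    "coordinator_write_requests_mutation_size_bytes_bucket"
    "{tenant='${__user.login}'}[$__rate_interval])) by (le))"
)
print(rewrite_expressions(original))
```

Because the function works on the raw text, you can pass it the whole copied dashboard JSON and paste the returned string into the Grafana Cloud import dialog.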