Monitoring Starlight for Kafka
Starlight for Kafka exposes the following metrics in Prometheus format. You can monitor your clusters with these metrics.
The following types of metrics are available:
- Counter: a cumulative metric that represents a single monotonically increasing counter. Its value only increases, except that it returns to zero when it is reset, for example when you restart your cluster.
- Gauge: a metric that represents a single numerical value that can arbitrarily go up and down.
- Histogram: a histogram samples observations (usually things like request durations or response sizes) and counts them in configurable buckets.
- Summary: similar to a histogram, a summary samples observations (usually things like request durations and response sizes). While it also provides a total count of observations and a sum of all observed values, it calculates configurable quantiles over a sliding time window.
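
Because a counter returns to zero on a restart, code that consumes raw counter samples usually has to tolerate resets. The following is a minimal sketch of that idea, not part of Starlight for Kafka; the function name and the sample readings are hypothetical.

```python
def counter_increase(samples):
    """Total increase across consecutive counter samples, tolerating resets.

    A counter only goes up, so a sample lower than its predecessor is treated
    as a restart after which the counter began again from zero.
    """
    total = 0.0
    for prev, curr in zip(samples, samples[1:]):
        if curr >= prev:
            total += curr - prev
        else:
            # Reset detected: everything counted since the restart is new.
            total += curr
    return total


# Hypothetical readings of a byte counter; the drop from 900 to 120 is a restart.
readings = [100.0, 400.0, 900.0, 120.0, 300.0]
print(counter_increase(readings))
```

Dividing the result by the time between the first and last sample gives an average rate. Prometheus applies a similar correction itself when you query a counter with `rate()` or `increase()`.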
Starlight for Kafka metrics
The Starlight for Kafka metrics are exposed under `/metrics` at port `8000` along with Pulsar metrics. To use a different port, configure the `stats_server_port` system property.
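
Since the endpoint also serves Pulsar metrics, it can help to filter the scrape down to the `kop_server_` series. The sketch below assumes the broker is reachable at `localhost` and that samples carry no trailing timestamp. The helper names `parse_samples` and `scrape` are hypothetical.

```python
from urllib.request import urlopen


def parse_samples(text, prefix="kop_server_"):
    """Return (series, value) pairs for samples whose name starts with prefix."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # The value is the last space-separated token on the line
        # (assumption: no trailing timestamp).
        series, _, value = line.rpartition(" ")
        if series.startswith(prefix):
            samples.append((series, float(value)))
    return samples


def scrape(url="http://localhost:8000/metrics"):
    """Fetch and parse the metrics endpoint. The host is an assumption."""
    with urlopen(url, timeout=5) as resp:
        return parse_samples(resp.read().decode("utf-8"))
```

Calling `scrape()` against a running broker returns the filtered samples as a list. In production you would normally let a Prometheus server scrape the endpoint instead.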
Channel metrics
Name | Type | Description |
---|---|---|
kop_server_ALIVE_CHANNEL_COUNT | Gauge | The number of alive request channels |
kop_server_ACTIVE_CHANNEL_COUNT | Gauge | The number of active request channels |
Request metrics
Name | Type | Description |
---|---|---|
kop_server_REQUEST_QUEUE_SIZE | Gauge | The number of requests in the S4K request processing queue across all request channels |
kop_server_REQUEST_QUEUED_LATENCY | Summary | The time requests spend queued, in milliseconds |
kop_server_REQUEST_PARSE_LATENCY | Summary | The requests parse latency from |
kop_server_REQUEST_LATENCY | Summary | The total request processing latency for all Kafka APIs |
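
In the Prometheus text format, a summary conventionally comes with two companion series suffixed `_sum` and `_count`. Assuming these are present for the summaries above, an average latency over an interval can be derived from two scrapes. This is a sketch with hypothetical values, and the unit is whatever the summary itself records.

```python
def mean_latency(sum_before, count_before, sum_after, count_after):
    """Average observed value between two scrapes of a summary.

    Returns None when no new observations arrived in the interval
    (or when the count went backwards because of a restart).
    """
    observed = count_after - count_before
    if observed <= 0:
        return None
    return (sum_after - sum_before) / observed


# Hypothetical scrapes of kop_server_REQUEST_LATENCY_sum and
# kop_server_REQUEST_LATENCY_count taken some seconds apart.
print(mean_latency(1200.0, 400, 1500.0, 500))
```

The quantile series of a summary are computed inside the broker and cannot be meaningfully averaged across brokers, so the sum and count are usually the safer inputs for aggregation.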
Response metrics
Name | Type | Description |
---|---|---|
kop_server_RESPONSE_BLOCKED_TIMES | Counter | The number of times a response was blocked while waiting for processing to complete |
kop_server_RESPONSE_BLOCKED_LATENCY | Summary | The response blocked latency, in milliseconds |
Producer metrics
Name | Type | Description |
---|---|---|
kop_server_PRODUCE_ENCODE | Summary | The memory record encode latency |
kop_server_MESSAGE_PUBLISH | Summary | The latency of publishing messages to the Pulsar ManagedLedger |
kop_server_MESSAGE_QUEUED_LATENCY | Summary | The time messages spend in the S4K message publish queue |
kop_server_BYTES_IN | Counter | The number of bytes received from producers |
kop_server_MESSAGE_IN | Counter | The number of messages received from producers |
kop_server_BATCH_COUNT_PER_MEMORYRECORDS | Gauge | The number of batches in each set of memory records |
kop_server_PRODUCE_MESSAGE_CONVERSIONS | Counter | The number of producer message conversions |
Consumer metrics
Name | Type | Description |
---|---|---|
kop_server_PREPARE_METADATA | Summary | The latency, in milliseconds, of preparing metadata before a fetch from the Pulsar ManagedLedger starts |
kop_server_TOTAL_MESSAGE_READ | Summary | The total message read latency, in milliseconds, for a fetch request |
kop_server_MESSAGE_READ | Summary | The message read latency, in milliseconds, for one cursor read entry request |
kop_server_FETCH_DECODE | Summary | The message decode latency, in milliseconds |
kop_server_BYTES_OUT | Counter | The number of bytes sent out to consumers |
kop_server_MESSAGE_OUT | Counter | The number of messages sent out to consumers |
kop_server_ENTRIES_OUT | Counter | The number of entries sent out to consumers |
kop_server_CONSUME_MESSAGE_CONVERSIONS | Counter | The number of consumer message conversions |
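
The three "out" counters can be combined into rough per-interval averages, such as bytes per message or messages per entry. The sketch below takes two snapshots of the counters as plain dictionaries. The function name and the numbers are hypothetical, and it does not handle counter resets between the snapshots.

```python
def consumer_ratios(before, after):
    """Derive per-interval averages from two snapshots of the consumer counters."""
    def delta(name):
        return after[name] - before[name]

    messages = delta("kop_server_MESSAGE_OUT")
    entries = delta("kop_server_ENTRIES_OUT")
    return {
        # None means there was no traffic to average over in the interval.
        "bytes_per_message": delta("kop_server_BYTES_OUT") / messages if messages else None,
        "messages_per_entry": messages / entries if entries else None,
    }


# Hypothetical snapshots taken one scrape interval apart.
before = {"kop_server_BYTES_OUT": 1000, "kop_server_MESSAGE_OUT": 10, "kop_server_ENTRIES_OUT": 5}
after = {"kop_server_BYTES_OUT": 5000, "kop_server_MESSAGE_OUT": 50, "kop_server_ENTRIES_OUT": 15}
print(consumer_ratios(before, after))
```

The same pattern applies to the producer side with `kop_server_BYTES_IN` and `kop_server_MESSAGE_IN`.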
S4K event metrics
Name | Type | Description |
---|---|---|
kop_server_KOP_EVENT_QUEUE_SIZE | Gauge | The total number of events in the S4K event processing queue |
kop_server_KOP_EVENT_QUEUED_LATENCY | Summary | The time events spend queued, in milliseconds |
kop_server_KOP_EVENT_LATENCY | Summary | The total event processing latency for all S4K event types |
What’s next?
For more on Starlight for Kafka, see: