Monitoring Starlight for Kafka
Starlight for Kafka exposes the following metrics in Prometheus format. You can monitor your clusters with these metrics.
The following types of metrics are available:
- Counter: a cumulative metric that represents a single monotonically increasing counter. Its value can only increase or be reset to zero, for example when the cluster restarts.
- Gauge: a metric that represents a single numerical value that can arbitrarily go up and down.
- Histogram: a histogram samples observations (usually things like request durations or response sizes) and counts them in configurable buckets.
- Summary: similar to a histogram, a summary samples observations (usually things like request durations and response sizes). It provides a total count of observations and a sum of all observed values, and it also calculates configurable quantiles over a sliding time window.
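Because a counter can drop back to zero after a restart, a monitoring script should not simply subtract the first scraped value from the last. The following is a minimal sketch of reset-tolerant arithmetic; the function name `counter_increase` and the sample values are illustrative and are not part of Starlight for Kafka.

```python
from typing import Sequence


def counter_increase(samples: Sequence[float]) -> float:
    """Return the total increase of a counter across consecutive scrapes.

    A drop between two samples is treated as a reset to zero (for example
    after a restart), so the whole new value is counted as increase.
    """
    total = 0.0
    for previous, current in zip(samples, samples[1:]):
        if current >= previous:
            total += current - previous
        else:
            # The counter went backwards: assume it restarted from zero.
            total += current
    return total


# Hypothetical scrapes of one counter; the drop from 150 to 20 is a restart.
print(counter_increase([100, 150, 20, 60]))  # prints 110.0
```

Prometheus applies similar reset handling in its `rate()` and `increase()` functions, so this kind of logic is mainly useful in ad hoc scripts that read the endpoint directly.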
Starlight for Kafka metrics
The Starlight for Kafka metrics are exposed under `/metrics` at port 8000 along with Pulsar metrics. Use a different port by configuring the `stats_server_port` system property.
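As a quick check that the endpoint is reachable, the text output can be fetched and filtered for Starlight for Kafka series. The sketch below uses only the Python standard library. The host `localhost`, the helper names, and the sample text are assumptions for illustration, and the parser is deliberately simplified rather than a full implementation of the Prometheus text format.

```python
import urllib.request
from typing import Iterator, Tuple

# Assumed URL: adjust the host and port to match your deployment.
METRICS_URL = "http://localhost:8000/metrics"


def fetch_metrics(url: str = METRICS_URL, timeout: float = 5.0) -> str:
    """Download the raw Prometheus text exposition from the endpoint."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def kop_samples(text: str, prefix: str = "kop_server_") -> Iterator[Tuple[str, float]]:
    """Yield (series, value) pairs for sample lines that start with prefix."""
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line.startswith(prefix):
            continue  # skips blank lines, comments, and other metrics
        closing = line.rfind("}")
        if closing != -1:
            # Labels present: the value follows the closing brace.
            series, rest = line[: closing + 1], line[closing + 1:]
        else:
            parts = line.split(None, 1)
            series, rest = parts[0], (parts[1] if len(parts) > 1 else "")
        fields = rest.split()
        if fields:
            # The first field is the value; an optional timestamp is ignored.
            yield series, float(fields[0])


# Made-up sample text, so the parser can be tried without a running cluster.
SAMPLE = """\
# TYPE kop_server_REQUEST_QUEUE_SIZE gauge
kop_server_REQUEST_QUEUE_SIZE 3
kop_server_ALIVE_CHANNEL_COUNT{cluster="demo"} 12
pulsar_topics_count 4
"""

for series, value in kop_samples(SAMPLE):
    print(series, value)
```

Against a live cluster, replace `SAMPLE` with the result of `fetch_metrics()`.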
Request metrics
Name | Type | Description |
---|---|---|
kop_server_ALIVE_CHANNEL_COUNT | Gauge | The number of alive request channels. |
kop_server_ACTIVE_CHANNEL_COUNT | Gauge | The number of active request channels. |
Request metrics
Name | Type | Description |
---|---|---|
kop_server_REQUEST_QUEUE_SIZE | Gauge | The total number of requests in the S4K request processing queue across all request channels. |
kop_server_REQUEST_QUEUED_LATENCY | Summary | The time requests spend queued, in milliseconds. Available labels: request (ApiVersions, Metadata, Produce, FindCoordinator, ListOffsets, OffsetFetch, OffsetCommit, Fetch, JoinGroup, SyncGroup, Heartbeat, LeaveGroup, DescribeGroups, ListGroups, DeleteGroups, SaslHandshake, SaslAuthenticate, CreateTopics, InitProducerId, AddPartitionsToTxn, AddOffsetsToTxn, TxnOffsetCommit, EndTxn, WriteTxnMarkers, DescribeConfigs, DeleteTopics). |
kop_server_REQUEST_PARSE_LATENCY | Summary | The requests parse latency from |
kop_server_REQUEST_LATENCY | Summary | The total request processing latency for all Kafka APIs. Available labels: request (ApiVersions, Metadata, Produce, FindCoordinator, ListOffsets, OffsetFetch, OffsetCommit, Fetch, JoinGroup, SyncGroup, Heartbeat, LeaveGroup, DescribeGroups, ListGroups, DeleteGroups, SaslHandshake, SaslAuthenticate, CreateTopics, InitProducerId, AddPartitionsToTxn, AddOffsetsToTxn, TxnOffsetCommit, EndTxn, WriteTxnMarkers, DescribeConfigs, DeleteTopics). |
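In the Prometheus text format, a Summary is normally exposed with a `_sum` and a `_count` series next to its quantiles. Dividing the change in sum by the change in count between two scrapes gives the mean latency over that interval. The sketch below assumes the latency summaries listed here follow that convention; the function name and the numbers are illustrative only.

```python
from typing import Optional


def mean_latency_ms(
    sum_before: float,
    count_before: float,
    sum_after: float,
    count_after: float,
) -> Optional[float]:
    """Mean observed latency between two scrapes of a Summary.

    Returns None when no new observations were recorded, or when the
    count went backwards (which suggests the metric was reset).
    """
    observed = count_after - count_before
    if observed <= 0:
        return None
    return (sum_after - sum_before) / observed


# Hypothetical _sum and _count values for one request label, one minute apart.
print(mean_latency_ms(1200.0, 400, 1500.0, 500))  # prints 3.0
```

The mean complements the quantiles rather than replacing them: a low mean can hide a long tail that only the upper quantiles reveal.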
Response metrics
Name | Type | Description |
---|---|---|
kop_server_RESPONSE_BLOCKED_TIMES | Counter | The number of times a response was blocked while waiting for processing to complete. |
kop_server_RESPONSE_BLOCKED_LATENCY | Summary | The response blocked latency, in milliseconds. |
Producer metrics
Name | Type | Description |
---|---|---|
kop_server_PRODUCE_ENCODE | Summary | The memory record encode latency. |
kop_server_MESSAGE_PUBLISH | Summary | The message publish latency to Pulsar ManagedLedger. |
kop_server_MESSAGE_QUEUED_LATENCY | Summary | The message queued latency in S4K message publish queue. |
kop_server_BYTES_IN | Counter | The producer bytes in stats. Available labels: topic, partition. |
kop_server_MESSAGE_IN | Counter | The producer message in stats. Available labels: topic, partition. |
kop_server_BATCH_COUNT_PER_MEMORYRECORDS | Gauge | The number of batches in each memory records. |
kop_server_PRODUCE_MESSAGE_CONVERSIONS | Counter | The producer message conversions in stats. Available labels: topic, partition. |
Consumer metrics
Name | Type | Description |
---|---|---|
kop_server_PREPARE_METADATA | Summary | The prepare metadata latency in milliseconds before starting fetch from Pulsar ManagedLedger. |
kop_server_TOTAL_MESSAGE_READ | Summary | The total message read latency in milliseconds in this fetch request. |
kop_server_MESSAGE_READ | Summary | The message read latency in milliseconds for one cursor read entry request. |
kop_server_FETCH_DECODE | Summary | The message decode latency in milliseconds. |
kop_server_BYTES_OUT | Counter | The consumer bytes out stats. Available labels: topic, partition, group. |
kop_server_MESSAGE_OUT | Counter | The consumer message out stats. Available labels: topic, partition, group. |
kop_server_ENTRIES_OUT | Counter | The consumer entries out stats. Available labels: topic, partition, group. |
kop_server_CONSUME_MESSAGE_CONVERSIONS | Counter | The consumer message conversions in stats. Available labels: topic, partition. |
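The topic, partition, and group labels make it possible to roll the consumer counters up along any one dimension, for example total bytes out per consumer group. The sketch below assumes the samples have already been parsed into label dictionaries and values; the helper name `sum_by_label` and the sample data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping, Tuple


def sum_by_label(
    samples: Iterable[Tuple[Mapping[str, str], float]],
    label: str,
) -> Dict[str, float]:
    """Sum sample values grouped by the value of a single label.

    Samples that lack the label are grouped under the empty string.
    """
    totals: Dict[str, float] = defaultdict(float)
    for labels, value in samples:
        totals[labels.get(label, "")] += value
    return dict(totals)


# Made-up samples for a bytes-out counter with topic, partition, and group labels.
bytes_out = [
    ({"topic": "orders", "partition": "0", "group": "billing"}, 1024.0),
    ({"topic": "orders", "partition": "1", "group": "billing"}, 2048.0),
    ({"topic": "orders", "partition": "0", "group": "audit"}, 512.0),
]

print(sum_by_label(bytes_out, "group"))  # prints {'billing': 3072.0, 'audit': 512.0}
```

Passing `"topic"` or `"partition"` as the label groups the same samples along a different dimension.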
S4K event metrics
Name | Type | Description |
---|---|---|
kop_server_KOP_EVENT_QUEUE_SIZE | Gauge | The total number of events in S4K event processing queue. |
kop_server_KOP_EVENT_QUEUED_LATENCY | Summary | The events queued latency calculated in milliseconds. Available labels: event (DeleteTopicsEvent, BrokersChangeEvent, ShutdownEventThread). |
kop_server_KOP_EVENT_LATENCY | Summary | The events processing total latency for all S4K event types. Available labels: event (DeleteTopicsEvent, BrokersChangeEvent, ShutdownEventThread). |