Merge metrics MBean

The merge metrics MBean tracks the time that Apache Solr™/Apache Lucene® spend merging the segments that accumulate on disk. Segments are files that store new documents, and each segment is a self-contained index. When data is deleted, Lucene does not remove it right away; instead, it marks the affected documents as deleted. During a merge, Lucene copies the data from many segment files, for example 100, into a single new file. Documents that are marked as deleted are not copied into the new segment file. Lucene then removes the 100 old segment files, and the single new file holds the index on disk.

After segments are written to disk, they are immutable.
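The merge behavior described above can be illustrated with a toy model. This is only a sketch: the `Doc` class and the `merge` helper are hypothetical names invented for illustration, segments are modeled as in-memory lists rather than files, and none of this is Lucene's actual API or on-disk format.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class SegmentMergeSketch {

    /** Hypothetical document: a delete only sets the marker, it does not remove data. */
    static final class Doc {
        final String id;
        final boolean deleted;

        Doc(String id, boolean deleted) {
            this.id = id;
            this.deleted = deleted;
        }
    }

    /** Copies every live document from the old segments into one new, read-only segment. */
    static List<Doc> merge(List<List<Doc>> oldSegments) {
        List<Doc> merged = new ArrayList<>();
        for (List<Doc> segment : oldSegments) {
            for (Doc doc : segment) {
                if (!doc.deleted) {
                    merged.add(doc); // documents marked as deleted are left behind
                }
            }
        }
        // Segments are immutable once written, so the result is exposed read-only.
        return Collections.unmodifiableList(merged);
    }

    public static void main(String[] args) {
        List<List<Doc>> oldSegments = Arrays.asList(
                Arrays.asList(new Doc("a", false), new Doc("b", true)),
                Arrays.asList(new Doc("c", false)),
                Arrays.asList(new Doc("d", true), new Doc("e", false)));
        List<Doc> merged = merge(oldSegments);
        System.out.println(merged.size() + " live documents in the merged segment");
    }
}
```

In this model, the old segments can be discarded after the merge because the new segment already contains every document that is still live.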

In a high-throughput environment, a single segment file is rare. Typically, there are several files, and Lucene runs merge operations concurrently with inserts and updates of the data, as directed by a merge policy and a merge scheduler.

Merge operations are costly and can degrade the performance of CQL queries. A large merge operation can cause a sudden increase in query execution time.

Main operational phases

The main phases of a merge operation on the index are:

INIT

How long it takes to initialize the merge process.

EXECUTE

How long it takes to execute the merge process.

WARM

How long it takes to warm up segments to speed up cold queries.

To get merge metrics, invoke one of the MBean operations and supply the phase of the merge operation that you want to measure, such as EXECUTE.

WARM time is part of EXECUTE time:

EXECUTE time = WARM time + other operations

For example, if the EXECUTE phase is 340 ms, and the WARM phase is 120 ms, then other operations account for the remaining 220 ms (340 ms - 120 ms = 220 ms).
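The same arithmetic can be written as a small helper. This is a minimal sketch; the class and method names are hypothetical, and the inputs are assumed to be phase latencies in milliseconds that were read from the MBean.

```java
final class MergePhaseBreakdown {

    /** Returns the part of EXECUTE time that is not WARM time, in milliseconds. */
    static long otherOperationsMs(long executeMs, long warmMs) {
        // WARM is part of EXECUTE, so it can never be negative or exceed EXECUTE.
        if (warmMs < 0 || warmMs > executeMs) {
            throw new IllegalArgumentException("WARM time must be between 0 and EXECUTE time");
        }
        return executeMs - warmMs;
    }

    public static void main(String[] args) {
        long other = otherOperationsMs(340, 120);
        System.out.println("Other operations: " + other + " ms"); // prints "Other operations: 220 ms"
    }
}
```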

The merge metrics MBean operations are:

  • getRecordedLatencyCount

  • getLatencyPercentile

  • getAverageLatency

  • resetLatency

  • resetLatencies

JConsole showing the merge metrics MBean operations.
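Outside JConsole, operations like these can also be invoked programmatically through JMX. The sketch below shows the general `MBeanServer.invoke` pattern only. Everything specific in it is an assumption: this page does not give the MBean's object name or the operation signatures, so the code registers a hypothetical in-memory stand-in (`MergeMetricsStandIn`) under a made-up object name and assumes that each operation takes the phase as a single String argument. Only three of the listed operation names are exercised.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.StandardMBean;

final class MergeMetricsJmxSketch {

    /** Hypothetical management interface; the real MBean's signatures may differ. */
    public interface MergeMetricsStandIn {
        long getRecordedLatencyCount(String phase);
        double getAverageLatency(String phase);
        void resetLatency(String phase);
    }

    /** In-memory stand-in that keeps latency samples, in milliseconds, per phase. */
    public static final class InMemoryMergeMetrics implements MergeMetricsStandIn {
        private final Map<String, List<Long>> samples = new HashMap<>();

        synchronized void record(String phase, long latencyMs) {
            samples.computeIfAbsent(phase, p -> new ArrayList<>()).add(latencyMs);
        }

        private List<Long> samplesFor(String phase) {
            return samples.getOrDefault(phase, Collections.<Long>emptyList());
        }

        @Override
        public synchronized long getRecordedLatencyCount(String phase) {
            return samplesFor(phase).size();
        }

        @Override
        public synchronized double getAverageLatency(String phase) {
            return samplesFor(phase).stream().mapToLong(Long::longValue).average().orElse(0.0);
        }

        @Override
        public synchronized void resetLatency(String phase) {
            samples.remove(phase);
        }
    }

    /** Invokes a one-argument operation by name, passing the phase as a String. */
    static Object invoke(MBeanServer server, ObjectName name, String operation, String phase)
            throws Exception {
        return server.invoke(name, operation,
                new Object[] {phase}, new String[] {String.class.getName()});
    }

    /** Registers the stand-in, reads EXECUTE metrics, resets them, and returns the readings. */
    static Object[] runDemo() throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        // Made-up object name for the stand-in; not the product's real MBean name.
        ObjectName name = new ObjectName("example.sketch:type=MergeMetricsStandIn");
        InMemoryMergeMetrics metrics = new InMemoryMergeMetrics();
        metrics.record("EXECUTE", 340);
        metrics.record("EXECUTE", 300);
        server.registerMBean(new StandardMBean(metrics, MergeMetricsStandIn.class), name);
        try {
            Object count = invoke(server, name, "getRecordedLatencyCount", "EXECUTE");
            Object average = invoke(server, name, "getAverageLatency", "EXECUTE");
            invoke(server, name, "resetLatency", "EXECUTE");
            Object countAfterReset = invoke(server, name, "getRecordedLatencyCount", "EXECUTE");
            return new Object[] {count, average, countAfterReset};
        } finally {
            server.unregisterMBean(name);
        }
    }

    public static void main(String[] args) throws Exception {
        Object[] readings = runDemo();
        System.out.println("count=" + readings[0] + " average=" + readings[1]
                + " countAfterReset=" + readings[2]);
    }
}
```

Against a running node, the same `invoke` call would presumably be made through a remote `MBeanServerConnection` with the real object name and signatures, which can be read from the MBean's operation list in JConsole.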

