Garbage collection pauses
Troubleshooting GC pauses of more than a second, or multiple pauses within a second that add up to a large fraction of that second.
Garbage collection (GC) is the process by which Java removes data that is no longer needed from memory. A garbage collection pause, also known as a stop-the-world event, happens when a region of memory is full and the JVM requires space to continue. During a pause, all operations are suspended. Because a pause affects networking, the node can appear to be down to other nodes in the cluster. In addition, any SELECT and INSERT statements wait until the pause ends, which increases read and write latencies. Avoid any pause of more than a second, or multiple pauses within a second that add up to a large fraction of that second. The underlying cause of the problem is that data is stored in memory faster than it can be removed.
The two log messages that most commonly indicate excessive pausing are:
INFO [ScheduledTasks:1] 2013-03-07 18:44:46,795 GCInspector.java (line 122) GC for ConcurrentMarkSweep: 1835 ms for 3 collections, 2606015656 used; max is 10611589120
INFO [ScheduledTasks:1] 2013-03-07 19:45:08,029 GCInspector.java (line 122) GC for ParNew: 9866 ms for 8 collections, 2910124308 used; max is 6358564864
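To find messages like these in a large log, a small script can help. The following is a minimal sketch, assuming the log lines follow exactly the GCInspector format shown above (the format may differ between versions). The names `parse_gc_line`, `long_pauses`, and `PAUSE_THRESHOLD_MS` are hypothetical helpers for this illustration, not part of any product. The time in each message is the total for all collections listed, so treat the result as a coarse filter rather than a measurement of a single pause.

```python
import re

# Assumed format, taken from the sample messages above:
# "GC for ParNew: 9866 ms for 8 collections, 2910124308 used; max is 6358564864"
GC_LINE = re.compile(
    r"GC for (?P<collector>\w+): (?P<ms>\d+) ms for (?P<count>\d+) collections, "
    r"(?P<used>\d+) used; max is (?P<max>\d+)"
)

# Hypothetical threshold: "more than a second", per the guidance in this topic.
PAUSE_THRESHOLD_MS = 1000


def parse_gc_line(line):
    """Return a dict of GC fields, or None if the line is not a GCInspector message."""
    match = GC_LINE.search(line)
    if match is None:
        return None
    return {
        "collector": match.group("collector"),
        "ms": int(match.group("ms")),  # total time for all collections in the message
        "collections": int(match.group("count")),
        "heap_used_fraction": int(match.group("used")) / int(match.group("max")),
    }


def long_pauses(lines, threshold_ms=PAUSE_THRESHOLD_MS):
    """Return parsed GC events whose reported time exceeds threshold_ms."""
    events = (parse_gc_line(line) for line in lines)
    return [e for e in events if e is not None and e["ms"] > threshold_ms]


sample = [
    "INFO [ScheduledTasks:1] 2013-03-07 18:44:46,795 GCInspector.java (line 122) "
    "GC for ConcurrentMarkSweep: 1835 ms for 3 collections, 2606015656 used; max is 10611589120",
    "INFO [ScheduledTasks:1] 2013-03-07 19:45:08,029 GCInspector.java (line 122) "
    "GC for ParNew: 9866 ms for 8 collections, 2910124308 used; max is 6358564864",
]

for event in long_pauses(sample):
    print(event["collector"], event["ms"], "ms over", event["collections"], "collections")
```

In practice, the `sample` list would be replaced by the lines of the node's system log. Both sample messages exceed the one-second threshold, so both are reported.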
Application-side factors include:
- Recent application changes: if the problem is recent, check for these first.
- Excessive tombstone activity: often caused by heavy delete workloads.
- Large row updates or large batch updates: keep each individual write below 1 MB at most.
- Extremely wide rows: these manifest as problems in repairs, selects, caching, and elsewhere.
Server-side factors include:
- Missing or unusual JVM parameters. Compare the parameters in use with the default settings shipped with the latest product version.
- JNA not found.
- Swap enabled.
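The swap check in the list above can be scripted on Linux. The following is a minimal sketch, assuming the node runs Linux and exposes `/proc/meminfo` with a `SwapTotal:` line reported in kB. The function names `swap_total_kb` and `swap_enabled` are hypothetical and exist only for this illustration.

```python
def swap_total_kb(meminfo_text):
    """Return the SwapTotal value in kB from /proc/meminfo-style text, or None if absent."""
    for line in meminfo_text.splitlines():
        if line.startswith("SwapTotal:"):
            # Expected shape: "SwapTotal:       2097148 kB"
            return int(line.split()[1])
    return None


def swap_enabled(meminfo_text):
    """Return True when the text reports a non-zero amount of configured swap."""
    total = swap_total_kb(meminfo_text)
    return total is not None and total > 0


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            print("swap enabled:", swap_enabled(f.read()))
    except OSError:
        # Assumption: on non-Linux systems this file does not exist.
        print("/proc/meminfo not readable; this check is Linux-only")
```

The functions take the file contents as a string so that the logic can be checked against sample text without depending on the machine it runs on.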