Tuesday, January 6, 2015

sun.nio.ch.Util managed direct buffers and thread reuse

Background

As I mentioned in a previous post, there was a memory fragmentation problem in our application. Tuning the JVM parameters resolved it, but that doesn't answer the question: who is creating so many direct buffers and using them only briefly? Doing so goes against the recommendation in the Javadoc of ByteBuffer.allocateDirect():
It is therefore recommended that direct buffers be allocated primarily for large, long-lived buffers that are subject to the underlying system's native I/O operations.

Analysis

To answer this question, I wrote a simple program that monitors the callers of ByteBuffer.allocateDirect(int) through JDI, and got the following output (some stack frames are omitted):

java.nio.ByteBuffer.allocateDirect(int): count=124, size=13463718
  sun.nio.ch.Util.getTemporaryDirectBuffer(int): count=124, size=13463718
    sun.nio.ch.IOUtil.write(java.io.FileDescriptor, java.nio.ByteBuffer, long, sun.nio.ch.NativeDispatcher): count=1, size=447850
      sun.nio.ch.SocketChannelImpl.write(java.nio.ByteBuffer): count=1, size=447850
        org.eclipse.jetty.io.ChannelEndPoint.flush(java.nio.ByteBuffer[]): count=1, size=447850
    sun.nio.ch.IOUtil.write(java.io.FileDescriptor, java.nio.ByteBuffer[], int, int, sun.nio.ch.NativeDispatcher): count=102, size=12819260
      sun.nio.ch.SocketChannelImpl.write(java.nio.ByteBuffer[], int, int): count=102, size=12819260
        org.eclipse.jetty.io.ChannelEndPoint.flush(java.nio.ByteBuffer[]): count=102, size=12819260
    sun.nio.ch.IOUtil.read(java.io.FileDescriptor, java.nio.ByteBuffer, long, sun.nio.ch.NativeDispatcher): count=21, size=196608
      sun.nio.ch.SocketChannelImpl.read(java.nio.ByteBuffer): count=21, size=196608
        org.eclipse.jetty.io.ChannelEndPoint.fill(java.nio.ByteBuffer): count=21, size=196608
The count is the number of direct buffers allocated, and the size is their total size in bytes. According to the monitoring result, all of these direct buffers are allocated by the JDK's default SocketChannel implementation when the caller passes a non-direct buffer to an I/O operation. After reading the source code of sun.nio.ch.Util, I found that it keeps these temporary direct buffers in a thread-local cache. sun.nio.ch.Util is designed carefully: it limits the number of cached buffers (8 per thread) and frees a direct buffer when it is removed from the cache. So the only plausible explanation is that too many new threads are being created, and the buffers cached by the old threads are never reused.
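To make the caching behavior concrete, here is a minimal sketch of how such a per-thread cache behaves. It is a simplified illustration written for this post, not the real sun.nio.ch.Util source; the class name, field names, and array-based structure are my own assumptions, only the 8-per-thread limit and the "free on eviction" behavior come from the description above.

import java.nio.ByteBuffer;

// A simplified, per-thread cache of temporary direct buffers, written only to
// illustrate the behavior described above; it is not the real sun.nio.ch.Util.
final class TemporaryBufferCacheSketch {

    private static final int MAX_CACHED_BUFFERS = 8; // mirrors the 8-per-thread limit

    // Each thread owns its own small set of cached direct buffers.
    private static final ThreadLocal<ByteBuffer[]> CACHE =
            ThreadLocal.withInitial(() -> new ByteBuffer[MAX_CACHED_BUFFERS]);

    // Return a cached direct buffer large enough for 'size', or allocate a new one.
    static ByteBuffer getTemporaryDirectBuffer(int size) {
        ByteBuffer[] buffers = CACHE.get();
        for (int i = 0; i < buffers.length; i++) {
            ByteBuffer buf = buffers[i];
            if (buf != null && buf.capacity() >= size) {
                buffers[i] = null;       // taken out of the cache while in use
                buf.clear();
                buf.limit(size);
                return buf;
            }
        }
        return ByteBuffer.allocateDirect(size); // cache miss: a new direct allocation
    }

    // Put a buffer back into the current thread's cache after the I/O completes.
    static void releaseTemporaryDirectBuffer(ByteBuffer buf) {
        ByteBuffer[] buffers = CACHE.get();
        for (int i = 0; i < buffers.length; i++) {
            if (buffers[i] == null) {
                buffers[i] = buf;
                return;
            }
        }
        // Cache already holds 8 buffers: the real JDK frees the native memory here;
        // in this sketch the reference is simply dropped.
    }

    private TemporaryBufferCacheSketch() {
    }
}
The property that matters here is that the cache belongs to the thread: when a thread is discarded, its cached direct buffers become unreachable, and their native memory is only reclaimed after a GC eventually collects the wrapping objects.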

Root Cause

Our application uses Jetty to handle HTTP requests, and there is a custom class that wraps the Jetty Server. In this wrapper class, a ThreadPoolExecutor is used as follows:

ExecutorService executor = new ThreadPoolExecutor(
    minThreads,
    maxThreads,
    maxIdleTimeMs,
    TimeUnit.MILLISECONDS,
    new SynchronousQueue<Runnable>());
Server server = new Server(new ExecutorThreadPool(executor));
Unfortunately, minThreads is 2 and maxIdleTimeMs is 5000 ms, and the Jetty Server takes 12 threads from the pool for its acceptor and selector threads. This means the total number of threads is always larger than minThreads (the core pool size), so every worker thread is subject to the idle timeout. When the service is not very busy, a worker thread is discarded after 5 seconds of idleness, and a new worker thread is created when the next request comes in. As a result, the buffers cached in sun.nio.ch.Util by the discarded threads are never used again and are only reclaimed when a GC is triggered.
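Given this root cause, one possible mitigation is to size the core pool so that worker threads survive idle periods instead of being discarded and recreated. The sketch below is only illustrative; the thread counts, the timeout, and the class name are assumptions, not our actual production settings.

import java.util.concurrent.ExecutorService;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class ServerExecutorFactory {

    // Illustrative sizing: the core pool covers Jetty's acceptor/selector threads
    // plus the number of worker threads that are typically busy at the same time.
    static ExecutorService newServerExecutor() {
        int coreThreads = 32;         // assumption: comfortably above the 12 internal threads
        int maxThreads = 256;
        long idleTimeoutMs = 60_000L; // a longer idle timeout further reduces thread churn
        // Core threads are never reclaimed (allowCoreThreadTimeOut defaults to false),
        // so their sun.nio.ch.Util buffer caches stay warm and keep being reused.
        return new ThreadPoolExecutor(
                coreThreads,
                maxThreads,
                idleTimeoutMs,
                TimeUnit.MILLISECONDS,
                new SynchronousQueue<Runnable>());
    }

    private ServerExecutorFactory() {
    }
}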

Friday, January 2, 2015

Understanding CMSInitiatingOccupancyFraction and UseCMSInitiatingOccupancyOnly

While reading Useful JVM Flags – Part 7 (CMS Collector), I got the impression that CMSInitiatingOccupancyFraction is useless when UseCMSInitiatingOccupancyOnly is false (the default), except for the first CMS collection:
We can use the flag -XX:+UseCMSInitiatingOccupancyOnly to instruct the JVM not to base its decision when to start a CMS cycle on run time statistics. Instead, when this flag is enabled, the JVM uses the value of CMSInitiatingOccupancyFraction for every CMS cycle, not just for the first one.
After checking the source code, I found that this statement is inaccurate. A more accurate statement would be:
When UseCMSInitiatingOccupancyOnly is false (the default), a CMS collection may be triggered even when the actual occupancy is lower than the specified CMSInitiatingOccupancyFraction value. Regardless of the flag, when the actual occupancy is greater than the specified CMSInitiatingOccupancyFraction value, a CMS collection will be triggered.

Detailed Explanation

Code snippet from OpenJDK (openjdk/hotspot/src/share/vm/gc_implementation/concurrentMarkSweep/concurrentMarkSweepGeneration.cpp):

  // If the estimated time to complete a cms collection (cms_duration())
  // is less than the estimated time remaining until the cms generation
  // is full, start a collection.
  if (!UseCMSInitiatingOccupancyOnly) {
    if (stats().valid()) {
      if (stats().time_until_cms_start() == 0.0) {
        return true;
      }
    } else {
      // We want to conservatively collect somewhat early in order
      // to try and "bootstrap" our CMS/promotion statistics;
      // this branch will not fire after the first successful CMS
      // collection because the stats should then be valid.
      if (_cmsGen->occupancy() >= _bootstrap_occupancy) {
        if (Verbose && PrintGCDetails) {
          gclog_or_tty->print_cr(
            " CMSCollector: collect for bootstrapping statistics:"
            " occupancy = %f, boot occupancy = %f", _cmsGen->occupancy(),
            _bootstrap_occupancy);
        }
        return true;
      }
    }
  }

  // Otherwise, we start a collection cycle if either the perm gen or
  // old gen want a collection cycle started. Each may use
  // an appropriate criterion for making this decision.
  // XXX We need to make sure that the gen expansion
  // criterion dovetails well with this. XXX NEED TO FIX THIS
  if (_cmsGen->should_concurrent_collect()) {
    if (Verbose && PrintGCDetails) {
      gclog_or_tty->print_cr("CMS old gen initiated");
    }
    return true;
  }
In the above code, _cmsGen->should_concurrent_collect() is always called unless it has already been determined that a collection is needed. In the implementation of _cmsGen->should_concurrent_collect(), the CMSInitiatingOccupancyFraction value is checked at the very beginning.

bool ConcurrentMarkSweepGeneration::should_concurrent_collect() const {

  assert_lock_strong(freelistLock());
  if (occupancy() > initiating_occupancy()) {
    if (PrintGCDetails && Verbose) {
      gclog_or_tty->print(" %s: collect because of occupancy %f / %f  ",
        short_name(), occupancy(), initiating_occupancy());
    }
    return true;
  }
  if (UseCMSInitiatingOccupancyOnly) {
    return false;
  }
  if (expansion_cause() == CMSExpansionCause::_satisfy_allocation) {
    if (PrintGCDetails && Verbose) {
      gclog_or_tty->print(" %s: collect because expanded for allocation ",
        short_name());
    }
    return true;
  }
  if (_cmsSpace->should_concurrent_collect()) {
    if (PrintGCDetails && Verbose) {
      gclog_or_tty->print(" %s: collect because cmsSpace says so ",
        short_name());
    }
    return true;
  }
  return false;
}
From the first snippet above, it's easy to see that _bootstrap_occupancy (derived from CMSBootstrapOccupancy) is used to trigger the first collection, before the CMS statistics become valid, when UseCMSInitiatingOccupancyOnly is false.

Summary

UseCMSInitiatingOccupancyOnly needs to be set to true only if you want to avoid collections that start before occupancy reaches the specified value. That is usually not what you want when CMSInitiatingOccupancyFraction is set to a small value; for example, if your application allocates direct buffers frequently, you may want garbage to be collected even when old generation utilization is quite low.
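To make that concrete, a configuration along the following lines fits the direct buffer scenario; the fraction value here is purely illustrative, not a recommendation:

-XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=30

With this configuration, a CMS cycle will be started once old generation occupancy exceeds 30%, and the JVM is still free to start one even earlier based on its runtime statistics. Adding -XX:+UseCMSInitiatingOccupancyOnly would suppress those earlier cycles, which is exactly what you do not want in this case.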