Friday, January 2, 2015

Understanding CMSInitiatingOccupancyFraction and UseCMSInitiatingOccupancyOnly

While reading Useful JVM Flags – Part 7 (CMS Collector), I got the impression that CMSInitiatingOccupancyFraction has no effect after the first CMS collection when UseCMSInitiatingOccupancyOnly is false (the default):
"We can use the flag -XX+UseCMSInitiatingOccupancyOnly to instruct the JVM not to base its decision when to start a CMS cycle on run time statistics. Instead, when this flag is enabled, the JVM uses the value of CMSInitiatingOccupancyFraction for every CMS cycle, not just for the first one."
After checking the source code, I found that this statement is inaccurate. A more accurate statement would be:
When UseCMSInitiatingOccupancyOnly is false (the default), a CMS collection may be triggered even when the actual occupancy is below the specified CMSInitiatingOccupancyFraction value. Whatever the setting of that flag, a CMS collection is triggered once the actual occupancy exceeds the specified CMSInitiatingOccupancyFraction value.

Detailed Explanation

Code snippet from OpenJDK (openjdk/hotspot/src/share/vm/gc_implementation/concurrentMarkSweep/concurrentMarkSweepGeneration.cpp):

  // If the estimated time to complete a cms collection (cms_duration())
  // is less than the estimated time remaining until the cms generation
  // is full, start a collection.
  if (!UseCMSInitiatingOccupancyOnly) {
    if (stats().valid()) {
      if (stats().time_until_cms_start() == 0.0) {
        return true;
      }
    } else {
      // We want to conservatively collect somewhat early in order
      // to try and "bootstrap" our CMS/promotion statistics;
      // this branch will not fire after the first successful CMS
      // collection because the stats should then be valid.
      if (_cmsGen->occupancy() >= _bootstrap_occupancy) {
        if (Verbose && PrintGCDetails) {
          gclog_or_tty->print_cr(
            " CMSCollector: collect for bootstrapping statistics:"
            " occupancy = %f, boot occupancy = %f", _cmsGen->occupancy(),
            _bootstrap_occupancy);
        }
        return true;
      }
    }
  }

  // Otherwise, we start a collection cycle if either the perm gen or
  // old gen want a collection cycle started. Each may use
  // an appropriate criterion for making this decision.
  // XXX We need to make sure that the gen expansion
  // criterion dovetails well with this. XXX NEED TO FIX THIS
  if (_cmsGen->should_concurrent_collect()) {
    if (Verbose && PrintGCDetails) {
      gclog_or_tty->print_cr("CMS old gen initiated");
    }
    return true;
  }

In the code above, _cmsGen->should_concurrent_collect() is always called unless an earlier check has already determined that a collection is needed. The implementation of _cmsGen->should_concurrent_collect() checks the occupancy against the CMSInitiatingOccupancyFraction value first, before it looks at UseCMSInitiatingOccupancyOnly:

bool ConcurrentMarkSweepGeneration::should_concurrent_collect() const {

  assert_lock_strong(freelistLock());
  if (occupancy() > initiating_occupancy()) {
    if (PrintGCDetails && Verbose) {
      gclog_or_tty->print(" %s: collect because of occupancy %f / %f  ",
        short_name(), occupancy(), initiating_occupancy());
    }
    return true;
  }
  if (UseCMSInitiatingOccupancyOnly) {
    return false;
  }
  if (expansion_cause() == CMSExpansionCause::_satisfy_allocation) {
    if (PrintGCDetails && Verbose) {
      gclog_or_tty->print(" %s: collect because expanded for allocation ",
        short_name());
    }
    return true;
  }
  if (_cmsSpace->should_concurrent_collect()) {
    if (PrintGCDetails && Verbose) {
      gclog_or_tty->print(" %s: collect because cmsSpace says so ",
        short_name());
    }
    return true;
  }
  return false;
}

From the first snippet, it is also easy to see that CMSBootstrapOccupancy (the source of _bootstrap_occupancy) is used to trigger the first collection if UseCMSInitiatingOccupancyOnly is false.
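
To make the interaction of the two flags easier to follow, here is a minimal standalone sketch of the decision order. It models only the checks quoted in the two excerpts above and is not HotSpot code: the CmsState struct, its field names, and the functions gen_should_collect, should_start_cycle, and demo are hypothetical names made up for illustration, and the numeric values are arbitrary.

```cpp
#include <cassert>

// Hypothetical, simplified inputs to the decision. The field names are
// illustrative only and are not HotSpot identifiers.
struct CmsState {
  bool   use_initiating_occupancy_only; // models UseCMSInitiatingOccupancyOnly
  double initiating_occupancy;          // models CMSInitiatingOccupancyFraction / 100
  double bootstrap_occupancy;           // models CMSBootstrapOccupancy / 100
  double occupancy;                     // current old gen occupancy, 0.0 to 1.0
  bool   stats_valid;                   // true once a CMS cycle has completed
  bool   stats_say_start_now;           // models time_until_cms_start() == 0.0
  bool   expanded_for_allocation;       // models the _satisfy_allocation check
  bool   space_wants_collection;        // models _cmsSpace->should_concurrent_collect()
};

// Same order of checks as the second excerpt: the occupancy threshold is
// tested before UseCMSInitiatingOccupancyOnly is consulted.
inline bool gen_should_collect(const CmsState& s) {
  if (s.occupancy > s.initiating_occupancy) return true;
  if (s.use_initiating_occupancy_only) return false;
  return s.expanded_for_allocation || s.space_wants_collection;
}

// Same order of checks as the first excerpt: the statistics and bootstrap
// heuristics run only when the flag is off, then the generation is asked.
inline bool should_start_cycle(const CmsState& s) {
  if (!s.use_initiating_occupancy_only) {
    if (s.stats_valid) {
      if (s.stats_say_start_now) return true;
    } else if (s.occupancy >= s.bootstrap_occupancy) {
      return true;
    }
  }
  return gen_should_collect(s);
}

inline void demo() {
  // 40% occupied, threshold 75%, statistics say a cycle should start now.
  CmsState s{false, 0.75, 0.50, 0.40, true, true, false, false};
  assert(should_start_cycle(s));   // flag off: early cycle below the threshold

  s.use_initiating_occupancy_only = true;
  assert(!should_start_cycle(s));  // flag on: the statistics are ignored

  s.occupancy = 0.80;
  assert(should_start_cycle(s));   // above the threshold: cycle starts

  s.use_initiating_occupancy_only = false;
  assert(should_start_cycle(s));   // the threshold applies with the flag off too
}
```

Calling demo() walks through the three cases discussed in this post: an early cycle below the threshold when the flag is off, no early cycle when the flag is on, and a cycle above the threshold with either setting.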

Summary

UseCMSInitiatingOccupancyOnly needs to be set to true only if you want to prevent early collections that start before occupancy reaches the specified value. That is unlikely to be what you want when CMSInitiatingOccupancyFraction is set to a small value. For example, if your application allocates direct buffers frequently, you may want garbage to be collected even when old generation utilization is quite low.
