[concurrency-interest] Improved FJ thread throttling

David Holmes davidcholmes at aapt.net.au
Tue Jul 8 17:36:03 EDT 2014

I'd prefer a system property rather than having to make VM changes (-XX) that then have to be communicated back to the Java code via a system property anyway.

  -----Original Message-----
  From: concurrency-interest-bounces at cs.oswego.edu On Behalf Of Joe Bowbeer
  Sent: Wednesday, 9 July 2014 5:14 AM
  To: Doug Lea
  Cc: concurrency-interest
  Subject: Re: [concurrency-interest] Improved FJ thread throttling

  Adding a -X command line option seems like a good step, and consistent with some other resource limits that can be tweaked on the command line (in a non-standard way).

  On Jul 8, 2014 10:41 AM, "Doug Lea" <dl at cs.oswego.edu> wrote:

    On 07/08/2014 01:16 PM, √iktor Ҡlang wrote:

      Hi Doug,

      How about allowing a System Property to set the number of spares, with the
      default being 256 so one does not have to implement a ThreadFactory to cap it to
      something different? (Given that there are System Properties for most of the
      other settings for the common pool)

    Sorry, I knew that question was coming so should have addressed it
    in first post! We need to be confident that the JVM can respond
    to errors/problems, so the limit is mostly a JVM property.
    If you want to set it higher, then JVMs might not comply.
    On the other side, if you want to set it significantly lower,
    then the chance of a false-alarm exception because counters don't match
    underlying resources becomes very high. So altogether the
    plausible range is around 32-256 spares. It's a little scary
    to just attach these considerations to a System property,
    but I'm not otherwise against it.
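
A minimal sketch of how a spare-thread limit might be read from a system property while staying inside the 32-256 range mentioned above. The property name `example.forkjoin.maxSpares` and the class `SpareLimit` are hypothetical, chosen only for illustration; nothing in this thread says such a property exists.

```java
// Sketch only: the property name below is hypothetical, not a setting named in this thread.
final class SpareLimit {
    static final String PROP = "example.forkjoin.maxSpares"; // hypothetical name
    static final int MIN = 32;       // lower end of the plausible range from the post
    static final int MAX = 256;      // upper end, also the proposed default
    static final int DEFAULT = 256;

    // Parses a raw property value and clamps it to [MIN, MAX].
    // Missing or malformed values fall back to DEFAULT.
    static int parse(String raw) {
        if (raw == null) return DEFAULT;
        try {
            int v = Integer.parseInt(raw.trim());
            return Math.max(MIN, Math.min(MAX, v));
        } catch (NumberFormatException e) {
            return DEFAULT;
        }
    }

    // Reads the (hypothetical) property from the running JVM.
    static int fromSystemProperty() {
        return parse(System.getProperty(PROP));
    }

    public static void main(String[] args) {
        System.out.println("spare-thread limit: " + fromSystemProperty());
    }
}
```

Clamping rather than rejecting out-of-range values is one possible design choice; it keeps a mistyped setting from producing either of the two failure modes described above.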


      On Tue, Jul 8, 2014 at 7:00 PM, Doug Lea <dl at cs.oswego.edu> wrote:

          ForkJoin extensions and adaptations for JDK8 (streams etc) included
          overly coarse-grained thread throttling. This was on the to-do list
          for a while. A new update addresses this.

          Context: By design, ForkJoinPool relies on only a single parameter,
          target parallelism. It takes all responsibility for ensuring that the
          "right" number of threads are running at any given time for
          data-parallel and async applications.  It is impossible to even define
          what the "right number" is, so it is impossible for us or anyone else
          to get this exactly right.  (This is different than for example
          setting up N services using a newFixedThreadPool(N), where the only
          right answer is N.)  One approach to dealing with this would be to
          introduce a zillion controls that would be even harder to use and
          prone to even more policy inconsistency and context-dependence
          problems than seen with ThreadPoolExecutor. This would be a throwback
          to the days when every efficient parallel program had to be custom
          built. Some people think that people should still write parallel
          programs this way (please feel free to do so.)  FJ instead implements
          portable algorithms and internal policies that are rarely optimal for
          any given platform and application but often close to optimal.  As FJ
          is used for increasingly diverse purposes, getting thread throttling
          approximately "right" in all cases gets more challenging. But we like

          Aside: The situation is very similar to that for ConcurrentHashMap,
          which also accepts only one optional parameter (capacity). If you
          have special requirements, you may be able to create a custom map that
          outperforms CHM. But over time, CHM evolves to benefit from diverse
          usage experiences, so customization becomes less likely to be

          More background: Any given parallel computation (including one just
          using FJ for asyncs) might, for good performance, need fewer than the
          target threads, the same number, or, if some dependent computations
          block waiting for others, possibly more threads to compensate for the
          blocked ones. (See below about blocking for other reasons.)  The
          "more" case is now less common than before, but you can't ignore it
          without risk of locking up computations.  And in some cases of mixed
          parallel/clustered systems, this could lead to distributed deadlock.
          (Note: The number of spare threads needed has little to do with the
          target parallelism level, but instead the form of the parallel
          computation dag.)  So, even though creating more than a dozen spare
          threads is rare, FJ itself imposed only a ceiling (32K threads) that
          is so high that programs typically die for other reasons before
          reaching it; and documents only an intentionally vague implementation
          note that "This implementation rejects submitted tasks (that is, by
          throwing RejectedExecutionException) only when the pool is shut down
          or internal resources have been exhausted."
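
To make the "dependent computations" point concrete, here is a small sketch of a fork/join computation whose tasks join on subtasks, plus a check of the one rejection case the quoted implementation note names (a pool that has been shut down). The class and method names (`JoinDemo`, `Sum`, `sum`, `rejectsAfterShutdown`) and the threshold of 1,000 are illustrative assumptions, not taken from the post.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;
import java.util.concurrent.RejectedExecutionException;

final class JoinDemo {
    // Sums the integers in [lo, hi) by recursive splitting.
    static final class Sum extends RecursiveTask<Long> {
        final long lo, hi;
        Sum(long lo, long hi) { this.lo = lo; this.hi = hi; }

        @Override protected Long compute() {
            if (hi - lo <= 1_000) {            // small enough: sum sequentially
                long s = 0;
                for (long i = lo; i < hi; i++) s += i;
                return s;
            }
            long mid = (lo + hi) >>> 1;
            Sum left = new Sum(lo, mid);
            left.fork();                        // run the left half asynchronously
            long right = new Sum(mid, hi).compute();
            return left.join() + right;         // this task now depends on 'left'
        }
    }

    // The only tuning parameter passed to the pool is the target parallelism.
    static long sum(int parallelism, long n) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            return pool.invoke(new Sum(0, n));
        } finally {
            pool.shutdown();
        }
    }

    // Submitting to a pool that has been shut down is rejected.
    static boolean rejectsAfterShutdown() {
        ForkJoinPool pool = new ForkJoinPool(2);
        pool.shutdown();
        try {
            pool.submit(new Sum(0, 10));
            return false;
        } catch (RejectedExecutionException e) {
            return true;
        }
    }
}
```

The shape of the join dependencies (here a balanced binary tree) rather than the parallelism argument determines how many workers can end up waiting at once.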

          One disadvantage of this policy is that the ceiling is so high that
          programming mistakes (for example those with infinitely nested joins)
          or intentional abuses are usually not caught in a very nice way. For
          example, on Unix-based systems, people might encounter "No more
          processes" just trying to kill the program.  Especially when
          implementing the JDK8 Common Pool, we should have dropped this limit
          from being a thousand times larger than expected under normal use down
          to a value that far exceeds that needed in any practical program, but
          still gives JVMs a chance to recover. So the update includes an
          absolute ceiling of 256 more threads than the target parallelism (or
          the original total of 32K, whichever is lower.)  This will not impact
          any current practical programs except those that by chance never ran
          long enough to hit higher limits.  The value 256 is somewhat
          arbitrary. It's the highest value from which any multicore JVM is
          expected to always have enough resources to recover.  By choosing
          a conservatively high value, there is good justification for including
          this in a JDK8 update and additionally being more aggressive about
          killing off spare threads.  To further limit behavior, we still also
          allow users to supply ThreadFactories that throw exceptions after
          hitting some maximum, but we still don't particularly recommend this. Most
          implementations of external limits are arbitrarily imprecise in part
          because they cannot tell when threads are really gone: decrementing a
          count does not necessarily mean that the thread has stopped or its
          resources have been recovered.  (The internal bounds have some of the
          same problems, but handle them conservatively.)
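
A sketch of the kind of user-supplied limit described above, under a few assumptions: the class name `CappedFactory` and the helper `runWithCap` are hypothetical, and the factory rejects requests by returning null (which the `ForkJoinWorkerThreadFactory` contract allows) rather than by throwing, as the post describes. It also illustrates the imprecision the post warns about: the counter is never decremented, because the factory cannot tell when a worker is really gone.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinPool.ForkJoinWorkerThreadFactory;
import java.util.concurrent.ForkJoinWorkerThread;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical factory that stops handing out workers after 'max' creations.
final class CappedFactory implements ForkJoinWorkerThreadFactory {
    private final int max;
    private final AtomicInteger created = new AtomicInteger();

    CappedFactory(int max) { this.max = max; }

    @Override public ForkJoinWorkerThread newThread(ForkJoinPool pool) {
        int c;
        do {
            c = created.get();
            if (c >= max) return null;          // reject: cap reached
        } while (!created.compareAndSet(c, c + 1));
        // Delegate actual construction to the default factory.
        return ForkJoinPool.defaultForkJoinWorkerThreadFactory.newThread(pool);
    }

    // Total workers ever created; deliberately not reduced when workers exit.
    int created() { return created.get(); }

    // Runs one trivial task on a pool using this factory and reports how
    // many workers were created along the way.
    static int runWithCap(int cap) {
        CappedFactory factory = new CappedFactory(cap);
        ForkJoinPool pool = new ForkJoinPool(4, factory, null, false);
        try {
            Callable<Integer> task = () -> 42;
            int result = pool.submit(task).join();
            if (result != 42) throw new IllegalStateException("unexpected result");
            return factory.created();
        } finally {
            pool.shutdown();
        }
    }
}
```

Because the count only grows, a long-running pool whose idle workers time out would eventually be refused replacements; a production version would need some way to learn of thread exits, which is exactly the hard part noted above.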

          But the main story with this update is improved internal tracking that
          is usually much closer to running the right number of threads than
          before.  (This version also includes more and better internal
          documentation and refactorings that take advantage of JVM improvements
          that have occurred since JDK6, for example being much better than
          before at compiling 64bit logical operations without needing to code
          by splitting into 32bit parts.)

          The only version available is in our jsr166 main (JDK8/JDK9 only)
          repository, with the aim of having some of you try it out before
          considering integration into OpenJDK.  To use it, you can either use
          the jar at http://gee.cs.oswego.edu/dl/concurrent/dist/jsr166.jar and
          run with -Xbootclasspath/p:jsr166.jar; or copy into an OpenJDK and
          build files:


          Also, a few notes about blocking threads in any context (in FJ, other
          Executors, even the JVM itself).  Whenever there is some bound on
          thread construction, and threads start blocking, eventually programs
          will freeze or throw exceptions.  We can't/won't forbid all blocking
          because it is often harmlessly transient.  On the other hand, most
          programs deal with saturation effects of long-term blocking about as
          well as they deal with other resource failures (out of memory etc),
          which is not very well. But coping mechanisms do always exist.  FJ
          provides a thread-vs-memory tradeoff hook via ManagedBlocker: If a
          task blocks but you want to ensure liveness for processing other tasks
          use a ManagedBlocker. If you are content to let other work pile up
          unless/until blocked threads resume, don't use it.  This is not always
          an easy decision to make, but cannot be automated because the number
          of ways/reasons that tasks may block is unbounded.  Note:
          ThreadPoolExecutor cannot use this approach, so instead supports
          RejectedExecutionHandlers for use in similar situations.
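
For readers who have not used the hook, the following sketch shows one way to wrap a blocking queue read in a ManagedBlocker, closely following the pattern in the `ForkJoinPool.ManagedBlocker` documentation. The class name `QueueTaker` and the helper `takeManaged` are illustrative assumptions.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ForkJoinPool;

// Wraps a possibly long blocking take() so the pool can compensate for it.
final class QueueTaker<E> implements ForkJoinPool.ManagedBlocker {
    private final BlockingQueue<E> queue;
    private volatile E item;

    QueueTaker(BlockingQueue<E> queue) { this.queue = queue; }

    // Called by the pool when blocking is actually necessary.
    @Override public boolean block() throws InterruptedException {
        if (item == null) item = queue.take();
        return true;                            // no further blocking needed
    }

    // Called first: tries to finish without blocking at all.
    @Override public boolean isReleasable() {
        return item != null || (item = queue.poll()) != null;
    }

    E get() { return item; }

    // Convenience wrapper; managedBlock may also be called from non-FJ threads.
    static <E> E takeManaged(BlockingQueue<E> queue) {
        QueueTaker<E> taker = new QueueTaker<>(queue);
        try {
            ForkJoinPool.managedBlock(taker);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return taker.get();
    }
}
```

Calling `queue.take()` directly inside a task would be the "let other work pile up" choice; routing it through `managedBlock` is the choice that trades extra threads for liveness.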


          Concurrency-interest mailing list
          Concurrency-interest at cs.oswego.edu

