[concurrency-interest] Improved FJ thread throttling

Doug Lea dl at cs.oswego.edu
Tue Jul 8 13:39:30 EDT 2014


On 07/08/2014 01:16 PM, √iktor Ҡlang wrote:
> Hi Doug,
>
> How about allowing a System Property to set the number of spares, with the
> default being 256 so one does not have to implement a ThreadFactory to cap it to
> something different? (Given that there are System Properties for most of the
> other settings for the common pool)

Sorry, I knew that question was coming, so I should have addressed it
in the first post! We need to be confident that the JVM can respond
to errors/problems, so the limit is mostly a JVM property.
If you want to set it higher, then JVMs might not comply.
On the other side, if you want to set it significantly lower,
then the chances of a false-alarm exception (because counters don't
match underlying resources) become very high. So altogether the
plausible range is around 32-256 spares. It's a little scary
to just attach these considerations to a System property,
but I'm not otherwise against it.
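
For reference, the existing common-pool properties (listed in the JDK8
ForkJoinPool javadoc) are read once, before the pool is first used, so
a spares property would presumably work the same way. A minimal sketch;
the "maximumSpares" name is hypothetical here:

    // Real JDK8 property; must be set before the common pool is
    // first touched (e.g. very early in main, or via a -D flag):
    System.setProperty(
        "java.util.concurrent.ForkJoinPool.common.parallelism", "8");
    // The property under discussion does not exist yet; made-up name:
    // System.setProperty(
    //     "java.util.concurrent.ForkJoinPool.common.maximumSpares", "64");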

-Doug

>
>
>
> On Tue, Jul 8, 2014 at 7:00 PM, Doug Lea <dl at cs.oswego.edu
> <mailto:dl at cs.oswego.edu>> wrote:
>
>     ForkJoin extensions and adaptations for JDK8 (streams etc) included
>     overly coarse-grained thread throttling. This was on the to-do list
>     for a while. A new update addresses this.
>
>     Context: By design, ForkJoinPool relies on only a single parameter,
>     target parallelism. It takes all responsibility for ensuring that the
>     "right" number of threads are running at any given time for
>     data-parallel and async applications.  It is impossible to even define
>     what the "right number" is, so it is impossible for us or anyone else
>     to get this exactly right.  (This differs from, for example,
>     setting up N services using newFixedThreadPool(N), where the only
>     right answer is N.)  One approach to dealing with this would be to
>     introduce a zillion controls that would be even harder to use and
>     prone to even more policy inconsistency and context-dependence
>     problems than seen with ThreadPoolExecutor. This would be a throwback
>     to the days when every efficient parallel program had to be custom
>     built. Some people think that people should still write parallel
>     programs this way (please feel free to do so.)  FJ instead implements
>     portable algorithms and internal policies that are rarely optimal for
>     any given platform and application but often close to optimal.  As FJ
>     is used for increasingly diverse purposes, getting thread throttling
>     approximately "right" in all cases gets more challenging. But we like
>     challenges.
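>
>     To make the single-knob design concrete, a minimal sketch using the
>     real constructors (the value 8 is just illustrative):
>
>         // The only tuning parameter is the target parallelism:
>         ForkJoinPool pool = new ForkJoinPool(8);
>         // The JDK8 common pool defaults its target to one less than
>         // Runtime.getRuntime().availableProcessors():
>         ForkJoinPool shared = ForkJoinPool.commonPool();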
>
>     Aside: The situation is very similar to that for ConcurrentHashMap,
>     which also accepts only one optional parameter (capacity). If you
>     have special requirements, you may be able to create a custom map that
>     outperforms CHM. But over time, CHM evolves to benefit from diverse
>     usage experiences, so customization becomes less likely to be
>     worthwhile.
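>
>     As a one-line illustration (a sketch; the expected size is made up):
>
>         // The single optional tuning parameter, an expected size:
>         ConcurrentHashMap<String, Long> counts = new ConcurrentHashMap<>(1024);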
>
>     More background: Any given parallel computation (including one just
>     using FJ for asyncs) might, for good performance, need fewer than the
>     target threads, the same number, or, if some dependent computations
>     block waiting for others, possibly more threads to compensate for the
>     blocked ones. (See below about blocking for other reasons.)  The
>     "more" case is now less common than before, but you can't ignore it
>     without risk of locking up computations.  And in some cases of mixed
>     parallel/clustered systems, this could lead to distributed deadlock.
>     (Note: The number of spare threads needed has little to do with the
>     target parallelism level; it instead depends on the form of the parallel
>     computation dag.)  So, even though creating more than a dozen spare
>     threads is rare, FJ itself imposed only a ceiling (32K threads) that
>     is so high that programs typically die for other reasons before
>     reaching it; and documents only an intentionally vague implementation
>     note that "This implementation rejects submitted tasks (that is, by
>     throwing RejectedExecutionException) only when the pool is shut down
>     or internal resources have been exhausted."
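>
>     As a sketch of the "more" case (illustrative only; the class name
>     is made up):
>
>         import java.util.concurrent.RecursiveTask;
>
>         // Each level forks a child and then joins it. A worker blocked
>         // in join() that cannot instead help run the child may be
>         // compensated for by a spare thread, so the nesting depth of
>         // the dag, not the parallelism level, drives the spare count.
>         class Nested extends RecursiveTask<Integer> {
>             final int depth;
>             Nested(int depth) { this.depth = depth; }
>             protected Integer compute() {
>                 if (depth == 0)
>                     return 1;
>                 Nested child = new Nested(depth - 1);
>                 child.fork();
>                 return child.join() + 1;  // may block; may trigger a spare
>             }
>         }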
>
>     One disadvantage of this policy is that the ceiling is so high that
>     programming mistakes (for example those with infinitely nested joins)
>     or intentional abuses are usually not caught in a very nice way. For
>     example, on Unix-based systems, people might encounter "No more
>     processes" just trying to kill the program.  Especially when
>     implementing the JDK8 Common Pool, we should have dropped this limit
>     from being a thousand times larger than expected under normal use down
>     to a value that far exceeds that needed in any practical program, but
>     still gives JVMs a chance to recover. So the update includes an
>     absolute ceiling of 256 more threads than the target parallelism (or
>     the original total of 32K, whichever is lower.)  This will not impact
>     any current practical programs except those that by chance never ran
>     long enough to hit higher limits.  The value 256 is somewhat
>     arbitrary. It's the highest value at which any multicore JVM is still
>     expected to have enough resources to recover.  By choosing
>     a conservatively high value, there is good justification for including
>     this in a JDK8 update and additionally being more aggressive about
>     killing off spare threads.  To further limit behavior, we still also
>     allow users to supply ThreadFactories that throw exceptions after
>     hitting some maximum, but still don't particularly recommend their
>     use (a sketch appears below). Most
>     implementations of external limits are arbitrarily imprecise in part
>     because they cannot tell when threads are really gone: decrementing a
>     count does not necessarily mean that the thread has stopped or its
>     resources have been recovered.  (The internal bounds have some of the
>     same problems, but handle them conservatively.)
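>
>     The sketch mentioned above: a capped factory with exactly the
>     imprecision just described (the count is decremented before the
>     thread's resources are truly recovered); the class name is made up:
>
>         import java.util.concurrent.ForkJoinPool;
>         import java.util.concurrent.ForkJoinWorkerThread;
>         import java.util.concurrent.RejectedExecutionException;
>         import java.util.concurrent.atomic.AtomicInteger;
>
>         class CappedFactory
>                 implements ForkJoinPool.ForkJoinWorkerThreadFactory {
>             final AtomicInteger live = new AtomicInteger();
>             final int max;
>             CappedFactory(int max) { this.max = max; }
>             public ForkJoinWorkerThread newThread(ForkJoinPool pool) {
>                 if (live.incrementAndGet() > max) {
>                     live.decrementAndGet();
>                     throw new RejectedExecutionException("thread cap hit");
>                 }
>                 return new ForkJoinWorkerThread(pool) {
>                     protected void onTermination(Throwable ex) {
>                         live.decrementAndGet(); // thread may linger after this
>                         super.onTermination(ex);
>                     }
>                 };
>             }
>         }
>
>         // e.g. new ForkJoinPool(8, new CappedFactory(8 + 32), null, false);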
>
>     But the main story with this update is improved internal tracking that
>     is usually much closer to running the right number of threads than
>     before.  (This version also includes more and better internal
>     documentation and refactorings that take advantage of JVM improvements
>     that have occurred since JDK6, for example being much better than
>     before at compiling 64-bit logical operations without needing to code
>     by splitting into 32-bit parts.)
>
>     The only version available is in our jsr166 main (JDK8/JDK9 only)
>     repository, with the aim of having some of you try it out before
>     considering integration into OpenJDK.  To use it, you can either use
>     the jar at http://gee.cs.oswego.edu/dl/concurrent/dist/jsr166.jar and
>     run with -Xbootclasspath/p:jsr166.jar; or copy into an OpenJDK and
>     build files:
>     http://gee.cs.oswego.edu/cgi-bin/viewcvs.cgi/jsr166/src/main/java/util/concurrent/ForkJoinPool.java?view=log
>     http://gee.cs.oswego.edu/cgi-bin/viewcvs.cgi/jsr166/src/main/java/util/concurrent/ForkJoinTask.java?view=log
>     http://gee.cs.oswego.edu/cgi-bin/viewcvs.cgi/jsr166/src/main/java/util/concurrent/ForkJoinWorkerThread.java?view=log
>
>     ...
>
>     Also, a few notes about blocking threads in any context (in FJ, other
>     Executors, even the JVM itself).  Whenever there is some bound on
>     thread construction, and threads start blocking, eventually programs
>     will freeze or throw exceptions.  We can't/won't forbid all blocking
>     because it is often harmlessly transient.  On the other hand, most
>     programs deal with saturation effects of long-term blocking about as
>     well as they deal with other resource failures (out of memory etc),
>     which is not very well. But coping mechanisms do always exist.  FJ
>     provides a thread-vs-memory tradeoff hook via ManagedBlocker: If a
>     task blocks but you want to ensure liveness for processing other tasks,
>     use a ManagedBlocker. If you are content to let other work pile up
>     unless/until blocked threads resume, don't use it.  This is not always
>     an easy decision to make, but cannot be automated because the number
>     of ways/reasons that tasks may block is unbounded.  Note:
>     ThreadPoolExecutor cannot use this approach, so instead supports
>     RejectedExecutionHandlers for use in similar situations.
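>
>     For instance, a sketch close to the example in the ManagedBlocker
>     javadoc:
>
>         import java.util.concurrent.BlockingQueue;
>         import java.util.concurrent.ForkJoinPool;
>
>         // Tells the pool that the current worker may block, so it can
>         // activate or create a spare to preserve parallelism meanwhile.
>         class QueueTaker<E> implements ForkJoinPool.ManagedBlocker {
>             final BlockingQueue<E> queue;
>             volatile E item = null;
>             QueueTaker(BlockingQueue<E> queue) { this.queue = queue; }
>             public boolean block() throws InterruptedException {
>                 if (item == null)
>                     item = queue.take();  // the potentially blocking call
>                 return true;
>             }
>             public boolean isReleasable() {
>                 return item != null || (item = queue.poll()) != null;
>             }
>             public E getItem() { return item; }
>         }
>
>         // From within a ForkJoinTask:
>         //   QueueTaker<String> taker = new QueueTaker<>(queue);
>         //   ForkJoinPool.managedBlock(taker);  // may enlist a spare
>         //   String s = taker.getItem();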
>
>
>     -Doug
>
>     _______________________________________________
>     Concurrency-interest mailing list
>     Concurrency-interest at cs.oswego.edu
>     http://cs.oswego.edu/mailman/listinfo/concurrency-interest
>
>
>
>
> --
> Cheers,
>



