[concurrency-interest] jsr166e.ForkJoinPool performance issue

Doug Lea dl at cs.oswego.edu
Mon Apr 8 13:22:35 EDT 2013

On 04/08/13 13:08, Ron Pressler wrote:
> Now, we're trying to use FJ for scheduling a message-passing component, and I've
> noticed a performance issue that the code below demonstrates. This is a
> degenerate case where only one FJTask is active at a time -- a single task is
> submitted, which forks a single task and terminates. The new task subsequently
> forks another one and so on.
> When the parallelism level is set to 1, a run completes in about 35ms on my
> machine. When it is set to 4 (I'm running on a 4-core i7, 8 virtual cores), that
> duration rises above 120ms. The NetBeans profiler shows the hot-spot to be
> ForkJoinPool.WorkQueue.push(), and, in particular, its call to
> ForkJoinPool.signalWork. The NetBeans profiler provides far-from-definite proof,
> but further testing has shown that this may, in fact, be true. In our code, when
> some computation takes place in the tasks, NetBeans attributes almost 40% of CPU
> time to WorkQueue.push().
> It seems that this should actually be a simple case, with only one non-idle
> Worker continually executing each new task as it forks. However, it seems like
> in this case (and unlike with our other FJ uses), the pool is sensitive to the
> parallelism level, which could be a problem.

The underlying issue is that the pushing task does not know that
the single worker is already available, so it activates another.
(It can take a few dozen nanoseconds for workers to rescan before
idling.) So the problem is less the parallelism level than intrinsic
raciness. This turns out to be a common issue when processing small
Streams in the upcoming JDK 8 support, so I've been working to improve it.
Stay tuned...
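
For reference, here is a minimal sketch of the kind of degenerate chain
described in the quoted message. It is not the original test code, which
is not reproduced here. It assumes java.util.concurrent.ForkJoinPool
rather than the jsr166e version, and the names ForkChain, Link, and
runChain are hypothetical. Each task forks exactly one successor and
returns, so at most one task is runnable at any moment. Timings depend on
the machine and JDK version, so no expected output is given.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

final class ForkChain {
    // One link in the chain: forks a single successor, then terminates.
    static final class Link extends RecursiveAction {
        private final int remaining;
        private final CountDownLatch done;

        Link(int remaining, CountDownLatch done) {
            this.remaining = remaining;
            this.done = done;
        }

        @Override
        protected void compute() {
            if (remaining == 0) {
                done.countDown();                      // end of the chain
            } else {
                new Link(remaining - 1, done).fork();  // push one new task
            }
        }
    }

    // Runs a chain of the given length and returns the elapsed nanoseconds.
    static long runChain(int parallelism, int length) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            CountDownLatch done = new CountDownLatch(1);
            long start = System.nanoTime();
            pool.execute(new Link(length, done));
            done.await();
            return System.nanoTime() - start;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        for (int p : new int[] {1, 4}) {
            runChain(p, 100_000);                      // warm-up run
            long nanos = runChain(p, 100_000);
            System.out.printf("parallelism=%d: %.1f ms%n", p, nanos / 1e6);
        }
    }
}
```

Comparing the two printed timings gives a rough view of how much each
fork costs when it may also signal an additional worker. A proper
measurement would need a benchmark harness with more careful warm-up.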

