[concurrency-interest] Performance regression in newest (6/20/2013) jsr166 update
Ron Pressler
ron.pressler at gmail.com
Sat Jun 22 20:42:32 EDT 2013
Hi.
A couple of months ago I described the behavior detailed in the email
exchange below. The newest update slows this benchmark down by about 25%:
import java.util.concurrent.TimeUnit;
import jsr166e.ForkJoinPool;
import jsr166e.ForkJoinTask;
import jsr166e.RecursiveAction;

public class FJBenchmark {
    static final int PARALLELISM = 4;
    static final int COUNT = 1000000;

    // Last constructor argument (true) selects asyncMode.
    static ForkJoinPool fjPool = new ForkJoinPool(PARALLELISM,
            ForkJoinPool.defaultForkJoinWorkerThreadFactory, null, true);

    public static void main(String[] args) throws Exception {
        for (int i = 0; i < 10; i++)
            run(COUNT);
    }

    static void run(int count) throws Exception {
        // Empty task forked at the end of the chain; used only to detect completion.
        RecursiveAction lastTask = new RecursiveAction() {
            protected void compute() {
            }
        };
        final long start = System.nanoTime();
        fjPool.submit(new MyTask(count, lastTask));
        lastTask.get();
        System.out.println("count: " + count + " time: " +
                TimeUnit.MILLISECONDS.convert(System.nanoTime() - start,
                        TimeUnit.NANOSECONDS));
    }

    // Each task forks exactly one successor, so only one task is active at a time.
    static class MyTask extends RecursiveAction {
        final int count;
        final ForkJoinTask<?> lastTask;

        public MyTask(int count, ForkJoinTask<?> lastTask) {
            this.count = count;
            this.lastTask = lastTask;
        }

        protected void compute() {
            if (count > 0)
                new MyTask(count - 1, lastTask).fork();
            else
                lastTask.fork();
        }
    }
}
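
To compare parallelism levels directly, here is a minimal sketch of the same
one-task-at-a-time chain run against pools of different sizes. It rests on
several assumptions and is not the benchmark above: it uses
java.util.concurrent.ForkJoinPool from the JDK rather than jsr166e, it waits
with join() instead of get(), and the names ParallelismSweep, ChainTask and
timeChainMillis are made up for illustration. Timings will vary by machine and
JDK version.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;
import java.util.concurrent.RecursiveAction;
import java.util.concurrent.TimeUnit;

public class ParallelismSweep {
    // Hypothetical stand-in for MyTask: forks one successor, then finishes.
    static class ChainTask extends RecursiveAction {
        final int count;
        final ForkJoinTask<?> lastTask;

        ChainTask(int count, ForkJoinTask<?> lastTask) {
            this.count = count;
            this.lastTask = lastTask;
        }

        @Override
        protected void compute() {
            if (count > 0)
                new ChainTask(count - 1, lastTask).fork();
            else
                lastTask.fork();
        }
    }

    // Runs one chain on a fresh async-mode pool and returns elapsed milliseconds.
    static long timeChainMillis(int parallelism, int count) {
        ForkJoinPool pool = new ForkJoinPool(parallelism,
                ForkJoinPool.defaultForkJoinWorkerThreadFactory, null, true);
        try {
            RecursiveAction lastTask = new RecursiveAction() {
                @Override
                protected void compute() {
                }
            };
            long start = System.nanoTime();
            pool.submit(new ChainTask(count, lastTask));
            lastTask.join(); // blocks until the end of the chain has run
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        for (int parallelism : new int[] {1, 4}) {
            for (int i = 0; i < 10; i++) {
                System.out.println("parallelism: " + parallelism + " time: "
                        + timeChainMillis(parallelism, 1000000));
            }
        }
    }
}
```

Each measurement creates a new pool, so early iterations include worker
startup and JIT warm-up; the later iterations at each level are the ones
worth comparing.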
---------- Forwarded message ----------
From: Doug Lea <dl at cs.oswego.edu>
Date: Mon, Apr 8, 2013 at 8:22 PM
Subject: Re: [concurrency-interest] jsr166e.ForkJoinPool performance issue
To: concurrency-interest at cs.oswego.edu
On 04/08/13 13:08, Ron Pressler wrote:
> Now, we're trying to use FJ for scheduling a message-passing component,
> and I've noticed a performance issue that the code below demonstrates.
> This is a degenerate case where only one FJTask is active at a time -- a
> single task is submitted, which forks a single task and terminates. The
> new task subsequently forks another one and so on.
>
> When the parallelism level is set to 1, a run completes in about 35ms on my
> machine. When it is set to 4 (I'm running on a 4-core i7, 8 virtual
> cores), that duration rises above 120ms. The NetBeans profiler shows the
> hot-spot to be ForkJoinPool.WorkQueue.push(), and, in particular, its call
> to ForkJoinPool.signalWork. The NetBeans profiler provides
> far-from-definite proof, but further testing has shown that this may, in
> fact, be true. In our code, when some computation takes place in the
> tasks, NetBeans attributes almost 40% of CPU time to WorkQueue.push().
>
> It seems that this should actually be a simple case, with only one non-idle
> Worker continually executing each new task as it forks. However, it seems
> like in this case (and unlike with our other FJ uses), the pool is
> sensitive to the parallelism level, which could be a problem.
>
>
The underlying issue is that the pushing task does not know that
the single worker is already available, so activates another.
(It can take a few dozen nanoseconds for workers to rescan before
idling.) So it is not so much parallelism-level as intrinsic raciness.
This turns out to be a common issue when processing small Streams
in upcoming jdk8 support, so I've been working to improve it.
Stay tuned...
-Doug