[concurrency-interest] ForkJoinPool not designed for nested Java 8 streams.parallel().forEach( ... )

Paul Sandoz paul.sandoz at oracle.com
Tue May 6 06:47:33 EDT 2014

Hi Chris,

I think the use of Thread.sleep is a little misleading, since it is a blocking operation. Try placing all those sleep statements in managed blocks and you will get different results, e.g. with:

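    // Functional interface so a blocking call can be passed as a lambda.
    interface Blocker {
        void block() throws InterruptedException;
    }

    // Adapts a Blocker to ForkJoinPool.ManagedBlocker.
    ForkJoinPool.ManagedBlocker blocker(Blocker b) {
        return new ForkJoinPool.ManagedBlocker() {
            boolean finished = false;
            public boolean block() throws InterruptedException {
                b.block();              // perform the blocking call
                return finished = true; // no further blocking needed
            }
            public boolean isReleasable() {
                return finished;
            }
        };
    }

    ForkJoinPool.managedBlock(blocker(() -> Thread.sleep(N)));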
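For anyone who wants to try this in isolation, here is a minimal self-contained sketch of the same idea. The class name ManagedSleepDemo, the helper managedSleep and the task count and sleep time in main are illustrative assumptions, not taken from Chris's test. It only shows how a sleep can be wrapped in a managed block inside a parallel forEach; the timing it prints will vary by machine.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;

public class ManagedSleepDemo {

    /** Hypothetical helper: sleeps inside a managed block so the pool may compensate. */
    static void managedSleep(long millis) {
        try {
            ForkJoinPool.managedBlock(new ForkJoinPool.ManagedBlocker() {
                private boolean finished = false;

                @Override
                public boolean block() throws InterruptedException {
                    Thread.sleep(millis);
                    finished = true;
                    return true; // no further blocking needed
                }

                @Override
                public boolean isReleasable() {
                    return finished;
                }
            });
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
        }
    }

    /** Runs the given number of sleeping tasks on a parallel stream; returns how many completed. */
    static int run(int tasks, long millis) {
        AtomicInteger completed = new AtomicInteger();
        IntStream.range(0, tasks).parallel().forEach(i -> {
            managedSleep(millis);
            completed.incrementAndGet();
        });
        return completed.get();
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        int done = run(24, 100); // illustrative values only
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(done + " tasks in " + elapsedMs + " ms");
    }
}
```

Replacing managedSleep(millis) with a plain Thread.sleep(millis) in run gives a baseline to compare against.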

Assuming *non-blocking* operations, an enclosed (nested) parallel stream will compete with the enclosing parallel stream for the same common pool resources, something I have advised against doing. I don't know if that would be significantly different from equivalent work performed by two independent parallel streams executed concurrently; we need to measure, as Russel says.
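A rough sketch of what "the same common pool resources" means in practice, assuming trivial non-blocking work. The class name NestedParallelDemo and the sizes used in main are hypothetical. It records the names of the threads that execute the outer and inner lambdas, so one can check by eye that both levels run on the same set of threads (typically the common pool workers plus the calling thread).

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;

public class NestedParallelDemo {

    /** Runs outer x inner trivial tasks with nested parallel streams; returns the inner task count. */
    static int run(int outer, int inner, Set<String> threadNames) {
        AtomicInteger innerTasks = new AtomicInteger();
        IntStream.range(0, outer).parallel().forEach(i -> {
            threadNames.add(Thread.currentThread().getName());
            // The nested stream is not given a pool of its own; it shares
            // whatever pool the enclosing stream is running in.
            IntStream.range(0, inner).parallel().forEach(j -> {
                threadNames.add(Thread.currentThread().getName());
                innerTasks.incrementAndGet();
            });
        });
        return innerTasks.get();
    }

    public static void main(String[] args) {
        Set<String> names = ConcurrentHashMap.newKeySet();
        int count = run(24, 10, names); // illustrative sizes only
        System.out.println(count + " inner tasks ran on threads: " + names);
    }
}
```

This says nothing about performance by itself; it only makes the sharing visible, which is a starting point for the measurement mentioned above.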

I think we could do a better job documenting the hazards of nested parallelism. Also pondering if it is appropriate and possible to disable the parallelism on a nested parallel stream.


On May 6, 2014, at 10:09 AM, Christian Fries <email at christian-fries.de> wrote:

> Dear All.
> Thank you for your replies.
> At Stackoverflow people immediately reacted to the use of the semaphore as an obvious problem in my code. They are correct! However, I have the impression that the problem in the FJP is there and not related to this. So I created an example without that semaphore, you find it at:
> http://svn.finmath.net/finmath%20experiments/trunk/src/net/finmath/experiments/concurrency/NestedParallelForEachTest.java
> and I appended it below. Given that, I would rephrase the problem as an unexpected performance issue.
> Let me describe the setup:
> We have a nested stream.parallel().forEach(). The inner loop is independent (stateless, no interference, etc., except for the use of a common pool) and consumes 1 second in total in the worst case, namely if processed sequentially. Half of the tasks of the outer loop consume 10 seconds prior to that loop; the other half consume 10 seconds after it. We have a boolean which allows switching the inner loop from parallel() to sequential(). Hence every thread consumes 11 seconds (worst case) in total. Now: submitting 24 outer-loop tasks to a pool of 8, we would expect 24/8 * 11 = 33 seconds at best (on an 8-core or better machine).
> The result is:
> - With inner loop sequential: 33 seconds.
> - With inner loop parallel: >80 seconds (I had 93 seconds).
> Can you confirm this behavior on your machine? Mine is a Mid 2012 MBP 2.6 i7.
> Darwin Vesper.local 13.1.0 Darwin Kernel Version 13.1.0: Wed Apr  2 23:52:02 PDT 2014; root:xnu-2422.92.1~2/RELEASE_X86_64 x86_64
> java version "1.8.0_05"
> Java(TM) SE Runtime Environment (build 1.8.0_05-b13)
> Java HotSpot(TM) 64-Bit Server VM (build 25.5-b02, mixed mode)
> @Alexey: I believe the problem is induced by awaitJoin being called on the wrong queue due to that test I mentioned. This introduces a coupling where inner tasks wait on outer tasks. I have a workaround where you can nest parallel loops on the same common pool and the problem goes away (wrap the inner loop in its own thread).
> Best
> Christian
