[concurrency-interest] ForkJoinPool with custom number of WorkQueues

Carl Mastrangelo notcarl at google.com
Thu Nov 15 20:08:53 EST 2018


Response inline:

On Sun, Nov 11, 2018 at 6:25 AM Doug Lea via Concurrency-interest <
concurrency-interest at cs.oswego.edu> wrote:

>
> I agree that something should be done to improve performance in cases
> where the optimal steady state is to engage only a subset of available
> worker threads. I'll explore other options (which has been on todo-list
> for too long). None of those we've tried so far are very good:
>
>
> On 11/9/18 2:52 AM, Francesco Nigro via Concurrency-interest wrote:
> > In the past implementations of the FJ pool there was a SPIN static
> > variable used for this purpose that where quite good (with
> > parallelism<available cores) to not park immediately the executor
> > threads where is nothing to do.
>
> ... with the disadvantage of unacceptably high CPU wastage on machines
> running near CPU saturation, as we discovered only after releasing, and
> so later removed. There are now only a few j.u.c classes that include
> paths allowing a non-trivial amount of spinning, and some of these might
> be subject to reconsideration in light of Loom, in which spinning is
> rarely a good idea.
>
> >
> > I have the same problem of Carl and I have found a similar "solution":
> > reduce the parallelism hoping to get the queues more busy, but it would
> > in incour in an higher contention on offer side.
>
> When steady state has little variance in task execution rates, reducing
> parallelism *is* the right solution. But has the disadvantage of poorer
> throughput when a lot of tasks suddenly appear.
>
> >
> > It wouldn't make sense to inject some WaitStrategy to allow the user to
> > choose what to do before parking (if parking)?
>
> I'm not a big fan of forcing users to make such decisions, since they
> can/will encounter the same issues as algorithmic solutions do, as in
> Carl's situation below.
>
> >
> > Il giorno ven 9 nov 2018, 00:18 Carl Mastrangelo via
> > Concurrency-interest <concurrency-interest at cs.oswego.edu
> > <mailto:concurrency-interest at cs.oswego.edu>> ha scritto:
> >
> >     Hi,
> >
> >     I am using ForkJoinPool as a means to avoid lock contention for
> >     submitting tasks to an executor.  FJP is much faster than before,
> >     but has some unexpected slowdown when there is not enough work.   In
> >     particular, A significant amount of time is spent waiting parking
> >     and unparking threads when there no work to do.  When there is no
> >     active work, it seems each worker scans the other work queues
> >     looking for work before going to sleep.
> >
> >     In my program I have a parallelism of 64, because when the work load
> >     is high, each of the threads can be active.  However, when work load
> >     is low, the workers spend too much time looking for more work.
>
> Besides keeping too many threads alive, the main performance issue here
> is reduced locality because related tasks are taken by different
> threads. So ...
>
> >
> >     One way to fix this (I think) is to lower the number of worker
> >     queues, but keep the same number of workers.   In my case, having 32
> >     or 16 queues rather than having exactly 64 might help, but I have no
> >     way of testing it out.   Has this ever been brought up, or is it
> >     possible for me to easily patch FJP to see if it would help?
>
> ... I don't think this will be a dramatic improvement, but if you are
> interested in experimenting, you could try building a jsr166.jar after
> changing ForkJoinPool line 703 from:
>     static final int SQMASK       = 0x007e;   // max 64 (even) slots
>     static final int SQMASK       = 0x001e;   // max 16 (even) slots
>

Thanks for the tip.   I tried this out, (and 0x003e), and as you said it
hardly helped out.  In my profile, most of the time was moved from
park/unpark over to scan, making it mostly a wash.

Here is what my profiler says:

32 parallelism, 16 workqueues
8.43 s 23.33% ForkJoinPool2.scan(ForkJoinPool2$WorkQueue, int)
(ForkJoinPool2.java)
360 ms 0.996% ForkJoinPool2.externalPush(ForkJoinTask2) (ForkJoinPool2.java)
300 ms 0.830% ForkJoinPool2.signalWork(ForkJoinPool2$WorkQueue[],
ForkJoinPool2$WorkQueue) (ForkJoinPool2.java)
270 ms 0.747% ForkJoinPool2.awaitWork(ForkJoinPool2$WorkQueue, int)
(ForkJoinPool2.java)
160 ms 0.443% ForkJoinPool2$WorkQueue.runTask(ForkJoinTask2)
(ForkJoinPool2.java)
140 ms 0.387% ForkJoinPool2.runWorker(ForkJoinPool2$WorkQueue)
(ForkJoinPool2.java)
90 ms 0.249% ForkJoinPool2.execute(java.lang.Runnable) (ForkJoinPool2.java)
50 ms 0.138% ForkJoinTask2.setCompletion(int) (ForkJoinTask2.java)
20 ms 0.055% ForkJoinTask2.doExec (ForkJoinTask2.java)
10 ms 0.028% ForkJoinPool2.tryRelease(long, ForkJoinPool2$WorkQueue, long)
(ForkJoinPool2.java)
10 ms 0.028% ForkJoinTask2$RunnableExecuteAction.exec (ForkJoinTask2.java)


32 parallelism, 32 workqueues
6.82 s 18.80% ForkJoinPool2.scan(ForkJoinPool2$WorkQueue, int)
(ForkJoinPool2.java)
480 ms 1.323% ForkJoinPool2.externalPush(ForkJoinTask2) (ForkJoinPool2.java)
400 ms 1.103% ForkJoinPool2.awaitWork(ForkJoinPool2$WorkQueue, int)
(ForkJoinPool2.java)
320 ms 0.882% ForkJoinPool2.signalWork(ForkJoinPool2$WorkQueue[],
ForkJoinPool2$WorkQueue) (ForkJoinPool2.java)
160 ms 0.441% ForkJoinTask2.doExec (ForkJoinTask2.java)
110 ms 0.303% ForkJoinPool2$WorkQueue.runTask(ForkJoinTask2)
(ForkJoinPool2.java)
50 ms 0.138% ForkJoinTask2.setCompletion(int) (ForkJoinTask2.java)
20 ms 0.055% ForkJoinPool2.tryRelease(long, ForkJoinPool2$WorkQueue, long)
(ForkJoinPool2.java)
10 ms 0.028% ForkJoinPool2.runWorker(ForkJoinPool2$WorkQueue)
(ForkJoinPool2.java)
10 ms 0.028% ForkJoinPool2$WorkQueue.execLocalTasks (ForkJoinPool2.java)
10 ms 0.028% ForkJoinPool2.execute(java.lang.Runnable) (ForkJoinPool2.java)




>
> -Doug
>
>
> _______________________________________________
> Concurrency-interest mailing list
> Concurrency-interest at cs.oswego.edu
> http://cs.oswego.edu/mailman/listinfo/concurrency-interest
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://cs.oswego.edu/pipermail/concurrency-interest/attachments/20181115/ca516297/attachment.html>


More information about the Concurrency-interest mailing list