[concurrency-interest] Use-cases and issues with the ForkJoinPool

Romain Colle rco at quartetfs.com
Mon Jan 16 10:48:54 EST 2012

Hi Doug,

Thanks a lot for the answers!
Please find some comments below.

The ultimate reason is that if someone says ta.join(); tb.join(); we
> are not in general allowed to reorder them. [...]

Already we see too many cases of people naively doing:
>  ta.fork(); tb.fork(); ta.join(); tb.join();
> and then seeing it "work" but with a huge blowup of compensation
> threads and crummy performance when it is recursive.

The need for us here is not to re-order joins.
As you said, the join order should be preserved as it may actually mean
On the other hand, when I write "ta.fork(); tb.fork()", I expect them
to potentially run in parallel or in any order, and my code should be
prepared for that.

Would there be something illegal about the following
handling of this case?
 1) ta is forked and put in the local queue
 2) tb is forked and put in the local queue
 3) ta.join() is called. It is found to still be in the local queue,
    so we just CAS its slot to null and execute it ourselves.
 4) tb.join() is called. We are in the usual case where we can just
    deque and execute it.

This does not require a claim bit and there are no risks of any
task being executed more than once.
There is an extra cost of getting ta from the local queue
 (especially if the queue is quite large), but the "well-structured"
FJ usage won't be affected by it.
Moreover, the larger the queue, the more likely the task is to be
in it.

I guess the more general question I had was: why does a joining thread
goes into idle wait (pool.tryAwaitJoin()) if it has a non-empty queue?
I'd like to at least see it "steal" from anywhere in its own queue,
and even go for helpJoinTask().

If none of this is possible, I still like the idea of the claim bit,
even though it adds some CAS overhead (is it really that much in some
cases btw?).
We could imagine a flag for this in the pool, like the locallyFifo one
if the cost turns out to be too much.

To clarify: The user is a non-FJ thread and the queries
> say q1, q2, are submitted using submit (not invoke), and then joined?
> If so, you are right that this can be a problem (in any execution
> framework) because the pool doesn't know that q1 must be performed
> before q2. (Calling p.invoke(q1); p.invoke(q2) would ensure this.)
> Under upcoming more relaxed submission queues, out of order processing
> will become even more common.

This is not really this. Let's use an almost real use-case.

A non-FJ thread submits a query Q to retrieve the values of the measures
M1 and M2, and that query is wrapped into a task T and submitted.
T is split into 2 sub-tasks: T1 that computes the value of M1 and T2 that
computes the value of M2.
M1 can be computed by itself, so T1 just computes its value when it gets
a chance to do so.
M2, however, turns to to be the sum of M1 and M3. We therefore create a
task T3 for computing M3 and run it, but we also need to join T1 and wait
for its result to get the value of M1.
Hence the out-of-order join.

What we do now to avoid this issue is that we first compute the query plan
in a single-threaded pass.
In this case, we can see that we can first compute M1 and M3 in parallel,
then compute M2 once this is done and finally return the result.

We initially thought we could avoid that single-threaded planning step and
simply rely on the work-stealing feature to have joining threads participate
in the global computation.
And we'd still like to be able to do it, especially if this only requires a
single CAS per task :-)

Romain Colle
Senior R&D Engineer
2 rue Jean Lantier, 75001 Paris, France
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://cs.oswego.edu/pipermail/concurrency-interest/attachments/20120116/d5c2120c/attachment.html>

More information about the Concurrency-interest mailing list