# [concurrency-interest] FJ stack bounds

Doug Lea dl at cs.oswego.edu
Tue Dec 2 13:30:35 EST 2014

```On 12/01/2014 06:03 PM, thurstonn wrote:
> Hello,
>
> I was wondering if there was any way to compute a maximum bound for the
> stack size/height (in frames, not bytes) for a given FJ problem.
>
> So given n = # of FJT created.
> Let's keep the math simple and say n = 256
>
> RecursiveTasks which are then forked and joined (idiomatic FJ)
>
> At first glance, it would seem that this would be classic log(n) - but is
> it?  In the simple example that would mean a max stack height of 8.
>
> Given that worker threads engage in work-stealing (and combined with the
> fact that I can't tell from looking at the code what invariants there are
> (if any) in determining which actual FJT will be #exec()'d), is it possible
> that (given some unlucky and unlikely "timing"), that a single worker thread
> could effect a stack height much > 8? even, perhaps 256? (I realize this is
> unlikely, I'm just asking if it's possible)

The worst case under such scenarios is approximately
(lg(N) * (lg(N) - 1)) / 2 (base 2 logs). In your example, about
28 = 8 + 7 + 6 + ... 1).  A helping worker can steal a task
with depth greater than current task to join, then expand to
max depth before helping again, and so on up through max depth.
The "approximately" is due to ignoring possible transient

>
> As best I could tell, there seems to be some logic (#doJoin/#tryUnpush) that
> "defaults" to executing the joined task only if it is at the top of the
> current thread's task stack (workerqueue), else . . . try to "help" by
> executing some other (arbitrarily chosen?) task.
> Does that mean that:
> left = . . .fork()
> right = . . . fork()  //Top of this thread's worker queue
>
> return left.join() + right.join()
> is different than
> return right.join() + left.join()
> in terms of the possible stack heights that can result?

Yes, but the main impact is on #threads not #tasks per queue.

FJ efficiently implements parallel dag computation. When
joins are performed in non-LIFO order of forks, performance
can suffer: The caller cannot necessarily help perform the
topmost task needed to uncover second, so may need to block
until some other thread does, which may require generating
also encounter misordered joins.

So, always join in LIFO order. If you forget which order is
for multiple tasks, that remember this for you.

-Doug
```