[concurrency-interest] Some interesting (confusing?) benchmark results

Joe Bowbeer joe.bowbeer at gmail.com
Mon Feb 11 23:08:31 EST 2013

It looks to me like your code is completely subdividing the task, creating
a huge computation tree consisting of tiny tasks.

Instead, I recommend the refinement that is illustrated in the sumOfSquares
example in the javadoc:


The getSurplusQueuedTaskCount is key:

while (h - l > 1 && getSurplusQueuedTaskCount() <= 3) {

Also, I suggest comparing notes with this article by Brian Goetz:

  Accelerate sorting and searching with the ParallelArray classes in Java 7

You might also consider comparing your technique with that employed in



On Mon, Feb 11, 2013 at 12:52 PM, T wrote:

> Hello,
> I made some initial attempts at using ForkJoin framework for some rather
> obvious recursively parallel use-cases, namely summing the elements of an
> int[] and also sorting an int[] (randomly filled).
> The problem-sizes I used ranged from 1M - 10M.
> I really enjoy using the framework and find it relatively easy to reason
> about my programs (if not necessarily the internals of the framework).
> But I was disappointed with the results: in both cases they were slower
> than the corresponding Java serial implementation.
> I completely understand the principle of YMMV, and I wasn't expecting 2x
> speedup, but especially in the case of the sort, but I was expecting that
> ForkJoin would do at least a little better than the single-threaded
> version.  I guess what I'm asking is: are these results surprising?  Does
> it mean I'm doing something wrong?
> I'll give a brief description of my implementation:
> single-threaded:  Arrays.sort(int[])
> ForkJoin: I'm using a merge-sort, with THRESHOLD = 250 (arrived at by
> trial-and-error)
> int[] compute()
> {
> if int[].length < THRESHOLD
>     return insertionSort(int[])
> left = new MergeSort(int[] //split in half)
> right = new MergeSort(int[] //other half)
> return merge(right.compute(), left.join())
> }
> The insertionSort() and merge() methods are just standard implementations;
> there is a bit of apples to oranges comparison since Arrays.sort() uses an
> optimized quicksort, but we're still talking O(nlog(n))
> Just ran it on my laptop:
> Windows 7 64-bit
> 1.7.0_03; Java HotSpot(TM) 64-Bit Server VM 22.1-b02
> Core 2 Duo==> 2 cores 2GHz
> 2M int[]:
> single-threaded: ~330ms
> ForkJoin (2 workers): ~390ms
> Would appreciate any feedback and hopefully the #s are somewhat helpful
> (and please no scoffawing at my antiquated machine)
> -T
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://cs.oswego.edu/pipermail/concurrency-interest/attachments/20130211/93d9daad/attachment.html>

More information about the Concurrency-interest mailing list