[concurrency-interest] Some interesting (confusing?) benchmark results

thurston at nomagicsoftware.com thurston at nomagicsoftware.com
Mon Feb 11 16:49:18 EST 2013


Well, it's the algorithm that's dispositive in this case.
Insertion sort is empirically better than mergesort up to arrays of
around 25 elements.  That's where I started; I played around with
different thresholds and 250 seemed to be the best.
Anything larger than 1000 and you start to see significant degradation.
I tried your formula (approximately) with N = 2,000,000 ==> 250K.

That brought my laptop to its knees (as expected): over 33,000 ms, as
opposed to 390 ms with a threshold of 250.
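
(For concreteness, a sketch of how I read the heuristic in code --
using availableProcessors() for the core count is my assumption.  Worth
noting that the heuristic presumes an efficient sequential algorithm at
the leaves; with insertion sort (O(n^2)) there, a six-figure threshold
is ruinous regardless of task overhead:)

int n = 2_000_000;                                       // problem size
int cores = Runtime.getRuntime().availableProcessors();  // 2 on this laptop
int loadBalanceFactor = 8;                               // suggested starting value
int threshold = n / (loadBalanceFactor * cores);         // 2,000,000/(8*2) = 125,000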

Following my own admonition (the algorithm matters), it occurs to me
that I should use my own serial mergesort as the comparison for the
ForkJoin version (as opposed to Arrays.sort).  I'll let y'all know the
results of that.
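
(Sketch of what that serial baseline looks like -- a standard top-down
mergesort with a scratch buffer; the names here are mine:)

// call as: mergeSort(a, new int[a.length], 0, a.length)
static void mergeSort(int[] a, int[] tmp, int lo, int hi)
{
    if (hi - lo < 2)
        return;                                     // 0 or 1 elements: sorted
    int mid = (lo + hi) >>> 1;
    mergeSort(a, tmp, lo, mid);                     // sort left half
    mergeSort(a, tmp, mid, hi);                     // sort right half
    for (int i = lo, j = mid, k = lo; k < hi; k++)  // merge through tmp
        tmp[k] = (j >= hi || (i < mid && a[i] <= a[j])) ? a[i++] : a[j++];
    System.arraycopy(tmp, lo, a, lo, hi - lo);      // copy merged run back
}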

-T



On 2013-02-11 13:19, David Holmes wrote:
> Your sequential threshold is far too small for only two worker threads.
> You would get near-optimum speedup only splitting the problem into two.
>
> The heuristic for sequential threshold is:
>
> Threshold = N / (LoadBalanceFactor * NumCore)
>
> where N is the "size" of the problem.
>
> The LoadBalanceFactor in an ideal world would be 1, but real tasks
> encounter delays and execute differently, so a value > 1 is needed.
> Heuristically, a value of 8 is a good starting value.
>
> David Holmes
>
>> -----Original Message-----
>> From: concurrency-interest-bounces at cs.oswego.edu
>> [mailto:concurrency-interest-bounces at cs.oswego.edu] On Behalf Of
>> thurston at nomagicsoftware.com
>> Sent: Tuesday, 12 February 2013 6:52 AM
>> To: concurrency-interest at cs.oswego.edu
>> Subject: [concurrency-interest] Some interesting (confusing?) 
>> benchmark
>> results
>>
>>
>> Hello,
>>
>> I made some initial attempts at using the ForkJoin framework for some
>> rather obvious recursively parallel use-cases, namely summing the
>> elements of an int[] and also sorting an int[] (randomly filled).
>>
>> The problem sizes I used ranged from 1M to 10M elements.
>>
>> I really enjoy using the framework and find it relatively easy to
>> reason about my programs (if not necessarily the internals of the
>> framework).
>> But I was disappointed with the results: in both cases the ForkJoin
>> version was slower than the corresponding serial Java implementation.
>>
>> I completely understand the principle of YMMV, and I wasn't expecting
>> a 2x speedup, but, especially in the case of the sort, I was expecting
>> that ForkJoin would do at least a little better than the
>> single-threaded version.  I guess what I'm asking is: are these
>> results surprising?  Does it mean I'm doing something wrong?
>>
>> I'll give a brief description of my implementation:
>> single-threaded:  Arrays.sort(int[])
>>
>> ForkJoin: I'm using a mergesort, with THRESHOLD = 250 (arrived at by
>> trial-and-error):
>>
>> protected int[] compute()
>> {
>>     if (array.length < THRESHOLD)
>>         return insertionSort(array);             // small base case
>>     MergeSort left  = new MergeSort(/* first half of array */);
>>     MergeSort right = new MergeSort(/* second half of array */);
>>     left.fork();                                 // run left half asynchronously
>>     return merge(right.compute(), left.join());  // do right here, then merge
>> }
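>>
>> (And roughly how I kick it off -- a sketch, assuming my MergeSort
>> extends RecursiveTask<int[]> and takes its slice of the array in the
>> constructor:)
>>
>> ForkJoinPool pool = new ForkJoinPool(2);           // 2 worker threads
>> int[] sorted = pool.invoke(new MergeSort(input));  // blocks until done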
>>
>> The insertionSort() and merge() methods are just standard
>> implementations; there is a bit of an apples-to-oranges comparison,
>> since Arrays.sort() uses an optimized quicksort, but we're still
>> talking O(n log n).
>>
>> Just ran it on my laptop:
>> Windows 7 64-bit
>> 1.7.0_03; Java HotSpot(TM) 64-Bit Server VM 22.1-b02
>> Core 2 Duo ==> 2 cores @ 2 GHz
>>
>> 2M int[]:
>> single-threaded: ~330ms
>> ForkJoin (2 workers): ~390ms
>>
>> Would appreciate any feedback; hopefully the #s are somewhat helpful
>> (and please, no scoffing at my antiquated machine).
>>
>> -T
>>
>>
>> _______________________________________________
>> Concurrency-interest mailing list
>> Concurrency-interest at cs.oswego.edu
>> http://cs.oswego.edu/mailman/listinfo/concurrency-interest
>>


