[concurrency-interest] Some interesting (confusing?) benchmark results
thurston at nomagicsoftware.com
Mon Feb 11 16:49:18 EST 2013
Well, it's the algorithm that's dispositive in this case.
Insertion-sort is empirically faster than mergesort for arrays of up to
around 25 elements. That's where I started, and I played around with
different thresholds; 250 seemed to be the best.
Anything larger than 1000 and you start to see significant degradation.
I tried your formula (approximately) with N = 2 000 000 ==> 250K
That brought my laptop to its knees (as expected): over 33K ms as
opposed to 390ms with 250.
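For concreteness, here is the arithmetic of the heuristic quoted below applied to this benchmark (a sketch; the variable names are mine, and note the formula with the suggested starting values yields 125K, close to the approximate 250K figure used above):

```java
public class ThresholdCalc {
    public static void main(String[] args) {
        // Heuristic from the quoted reply:
        // Threshold = N / (LoadBalanceFactor * NumCore)
        long n = 2_000_000L;       // problem size from this benchmark
        int loadBalanceFactor = 8; // suggested starting value
        int numCores = 2;          // Core 2 Duo
        long threshold = n / (loadBalanceFactor * numCores);
        System.out.println(threshold); // 125000
    }
}
```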
Following my own admonition (the algorithm matters), it occurs to me
that I should just use my own serial mergesort as the comparison to the
ForkJoin version (as opposed to Arrays.sort). I'll let y'all know the
results of that.
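For reference, a minimal self-contained sketch of the approach described in the quoted message below (RecursiveTask-based mergesort with an insertion-sort leaf). The class name, helper signatures, and the array-copying split are my own illustration, not taken from the original code:

```java
import java.util.Arrays;
import java.util.Random;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class MergeSortTask extends RecursiveTask<int[]> {
    static final int THRESHOLD = 250; // empirically chosen leaf size
    private final int[] a;

    MergeSortTask(int[] a) { this.a = a; }

    @Override
    protected int[] compute() {
        if (a.length < THRESHOLD) {
            return insertionSort(a);
        }
        // Split into halves (copying for simplicity; an index-range
        // variant would avoid the extra allocation).
        int mid = a.length / 2;
        MergeSortTask left = new MergeSortTask(Arrays.copyOfRange(a, 0, mid));
        MergeSortTask right = new MergeSortTask(Arrays.copyOfRange(a, mid, a.length));
        left.fork();                          // sort left half asynchronously
        int[] rightSorted = right.compute();  // sort right half in this thread
        return merge(left.join(), rightSorted);
    }

    static int[] insertionSort(int[] a) {
        for (int i = 1; i < a.length; i++) {
            int key = a[i], j = i - 1;
            while (j >= 0 && a[j] > key) { a[j + 1] = a[j]; j--; }
            a[j + 1] = key;
        }
        return a;
    }

    static int[] merge(int[] l, int[] r) {
        int[] out = new int[l.length + r.length];
        int i = 0, j = 0, k = 0;
        while (i < l.length && j < r.length)
            out[k++] = (l[i] <= r[j]) ? l[i++] : r[j++];
        while (i < l.length) out[k++] = l[i++];
        while (j < r.length) out[k++] = r[j++];
        return out;
    }

    public static void main(String[] args) {
        int[] data = new int[100_000];
        Random rnd = new Random(42);
        for (int i = 0; i < data.length; i++) data[i] = rnd.nextInt();
        int[] expected = data.clone();
        Arrays.sort(expected);
        int[] sorted = new ForkJoinPool().invoke(new MergeSortTask(data));
        System.out.println(Arrays.equals(sorted, expected)); // true
    }
}
```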
On 2013-02-11 13:19, David Holmes wrote:
> Your sequential threshold is far too small for only two worker
> threads. You
> would get near optimum speedup only splitting the problem into two.
> The heuristic for sequential threshold is:
> Threshold = N / (LoadBalanceFactor * NumCore)
> where N is the "size" of the problem.
> The LoadBalanceFactor in an ideal world would be 1, but real tasks
> delay and execute differently, so a value > 1 is needed.
> Heuristically, a value of 8 is a good starting value.
> David Holmes
>> -----Original Message-----
>> From: concurrency-interest-bounces at cs.oswego.edu
>> [mailto:concurrency-interest-bounces at cs.oswego.edu]On Behalf Of
>> thurston at nomagicsoftware.com
>> Sent: Tuesday, 12 February 2013 6:52 AM
>> To: concurrency-interest at cs.oswego.edu
>> Subject: [concurrency-interest] Some interesting (confusing?) benchmark results
>> I made some initial attempts at using the ForkJoin framework for some
>> rather obvious recursively parallel use-cases, namely summing the
>> elements of an int[] and also sorting an int[] (randomly filled).
>> The problem-sizes I used ranged from 1M - 10M.
>> I really enjoy using the framework and find it relatively easy to
>> reason about my programs (if not necessarily the internals of the
>> framework itself). But I was disappointed with the results: in both
>> cases the ForkJoin version was slower than the corresponding Java
>> serial implementation.
>> I completely understand the principle of YMMV, and I wasn't expecting
>> a 2x speedup, but especially in the case of the sort I was expecting
>> that ForkJoin would do at least a little better than the
>> single-threaded version. I guess what I'm asking is: are these
>> results surprising? Does it mean I'm doing something wrong?
>> I'll give a brief description of my implementation:
>> single-threaded: Arrays.sort(int[])
>> ForkJoin: I'm using a merge-sort, with THRESHOLD = 250 (arrived at
>> by experimentation):
>> int[] compute()
>>     if array.length < THRESHOLD
>>         return insertionSort(array)
>>     left = new MergeSort(array // split in half)
>>     right = new MergeSort(array // other half)
>>     left.fork()
>>     return merge(right.compute(), left.join())
>> The insertionSort() and merge() methods are just standard
>> implementations; there is a bit of an apples-to-oranges comparison
>> since Arrays.sort() uses an optimized quicksort, but we're still
>> talking O(n log n) either way.
>> Just ran it on my laptop:
>> Windows 7 64-bit
>> 1.7.0_03; Java HotSpot(TM) 64-Bit Server VM 22.1-b02
>> Core 2 Duo ==> 2 cores @ 2GHz
>> 2M ints:
>> single-threaded: ~330ms
>> ForkJoin (2 workers): ~390ms
>> Would appreciate any feedback and hopefully the #s are somewhat
>> meaningful (and please no scoffing at my antiquated machine)
>> Concurrency-interest mailing list
>> Concurrency-interest at cs.oswego.edu
More information about the Concurrency-interest mailing list