[concurrency-interest] Strange behaviour when sorting an uniform array

Michael Spiegel michael.m.spiegel at gmail.com
Tue May 25 10:05:17 EDT 2010


I think the benchmark would benefit from a warmup phase.  It would
ensure the methods in question are JIT compiled. Stackoverflow has a
reasonable set of links on how to write a Java microbenchmark:
http://stackoverflow.com/questions/504103/how-do-i-write-a-correct-micro-benchmark-in-java.

--Michael

On Tue, May 25, 2010 at 7:00 AM, Lukas Krecan <lukas.krecan at gmail.com> wrote:
> Sure, the synchronized statement is just a leftover from previous
> experiments.
>
> I have tried the same test with standard java.util.Arrays.sort and I am
> getting
>
> 1) Uniform array (full of 1L) ~ 200ms
> 2) rand.nextInt(20) ~ 1400ms
> 3) rand.nextLong() ~ 7700ms
>
> The reason might be that a modified quicksort algorithm is used there. Would
> it be possible to use the same algorithm for ParallelArray? I am afraid that
> most programmers have already forgotten the details of sorting algorithms
> and would be as surprised as I was.
>      Cheers
>           Lukas
>
> On 05/25/2010 11:44 AM, Kasper Nielsen wrote:
>>
>> Ahh, got it. You still want to remove that synchronized statement for
>> serious usage though.
>>
>> What you are seing is the worst case behavior for quicksort which
>> ParallelArray is using under the hood. All equal elements will kill most
>> quicksort implementations. Instead of the average running time of n*logn you
>> are seing worst case behaviour of n*n.
>>
>> Even large datasets with just a few different values will slow most
>> implementations down significantly.
>> I tried replacing
>>  return 1L;//rand.nextLong();
>> with
>>  return rand.nextInt(20)
>> and PA was still around 70 times slower compared to using rand.nextLong().
>>
>> We might want to switch to the dual pivot implementation introduced in
>> OpenJDK 7. Don't know about licensing issues though.
>>
>> - Kasper
>>
>>
>> On 25/5/2010 15:10, Lukas Krecan wrote:
>>>
>>> Hi,
>>>    I am quite sure that it has nothing to do the synchronization. The
>>> problem remains even if synchronized keyword is removed. Besides, the
>>> method was used only for generating values, which works well. Problem is
>>> with sorting. It works well on random values, but runs for ages on
>>> uniform array.
>>>
>>>
>>> On Tue, May 25, 2010 at 10:00 AM, Kasper Nielsen <kasper at kav.dk
>>> <mailto:kasper at kav.dk>> wrote:
>>>
>>>    Hi Lukas,
>>>
>>>    try removing the synchronized keyword in your op.
>>>
>>>    Taken from the extra166y.Ops javadoc:
>>>
>>>    In addition to stated signatures, implementations of these
>>>    interfaces must work safely in parallel. In general, this means
>>>    methods should operate only on their arguments, and should not rely
>>>    on ThreadLocals, unsafely published globals, or other unsafe
>>>    constructions. Additionally, they should not block waiting for
>>>    synchronization.
>>>
>>>    If you want to return random numbers you should use the
>>>    ThreadLocalRandom.
>>>
>>>
>>>    - Kasper
>>>
>>>
>>>    On 25/5/2010 07:9, Lukas Krecan wrote:
>>>
>>>        Hi,
>>>            I have been playing with extra166y library and found that
>>>        when I try
>>>        to sort an uniform array (full of ones) it takes ages. With older
>>>        version of the library it even threw StackOverflowError. After
>>>        update it
>>>        runs for minutes (I always kill the process after while so I do
>>>        not know
>>>        if it ends successfully).
>>>
>>>        Probably it's nothing new for you, maybe it's expected
>>>        behaviour. But
>>>        for me it's confusing and I just wanted to let you know about
>>>        the issue
>>>        in case you were not aware of it.
>>>               Best regards
>>>                 Lukas Krecan
>>>
>>>        -------------------------------------
>>>        package net.krecan.forkjoin;
>>>
>>>        import java.util.Random;
>>>
>>>        import jsr166y.ForkJoinPool;
>>>        import extra166y.Ops;
>>>        import extra166y.ParallelLongArray;
>>>
>>>
>>>        public class SortTest {
>>>             private static final int THREADS = 2;
>>>             private static final int SIZE = 40000000;
>>>
>>>             public static void testSort()
>>>             {
>>>                 ForkJoinPool fjPool = new ForkJoinPool(THREADS);
>>>                 ParallelLongArray pa =
>>> ParallelLongArray.createEmpty(SIZE,
>>>        fjPool);
>>>                 createData(pa);
>>>                 System.out.println("Sorting "+pa.summary());
>>>                 long start = System.currentTimeMillis();
>>>                 pa.sort();
>>>                 System.out.println("Done in
>>>        "+(System.currentTimeMillis()-start)+"ms.");
>>>                 System.out.println(pa.summary());
>>>
>>>
>>>             }
>>>             private static void createData(ParallelLongArray pa) {
>>>                 long start = System.currentTimeMillis();
>>>                 System.out.println("Creating data using "+THREADS+"
>>>        threads.");
>>>                 pa.setLimit(SIZE);
>>>                 final Random rand = new Random();
>>>                 pa.replaceWithGeneratedValue(new Ops.LongGenerator(){
>>>                     public synchronized long op() {
>>>                         return 1L;//rand.nextLong();
>>>                     }
>>>                 });
>>>                 System.out.println("Done in
>>>        "+(System.currentTimeMillis()-start)+"ms.");
>>>             }
>>>
>>>             public static void main(String[] args) {
>>>                 testSort();
>>>             }
>>>
>>>        }
>>>
>>>
>>>
>>>        _______________________________________________
>>>        Concurrency-interest mailing list
>>>        Concurrency-interest at cs.oswego.edu
>>> <mailto:Concurrency-interest at cs.oswego.edu>
>>>        http://cs.oswego.edu/mailman/listinfo/concurrency-interest
>>>
>>>
>>>    _______________________________________________
>>>    Concurrency-interest mailing list
>>>    Concurrency-interest at cs.oswego.edu
>>> <mailto:Concurrency-interest at cs.oswego.edu>
>>>    http://cs.oswego.edu/mailman/listinfo/concurrency-interest
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> Concurrency-interest mailing list
>>> Concurrency-interest at cs.oswego.edu
>>> http://cs.oswego.edu/mailman/listinfo/concurrency-interest
>>
>> _______________________________________________
>> Concurrency-interest mailing list
>> Concurrency-interest at cs.oswego.edu
>> http://cs.oswego.edu/mailman/listinfo/concurrency-interest
>
> _______________________________________________
> Concurrency-interest mailing list
> Concurrency-interest at cs.oswego.edu
> http://cs.oswego.edu/mailman/listinfo/concurrency-interest
>



More information about the Concurrency-interest mailing list