[concurrency-interest] Multi-core testing, help with findings

David Holmes dcholmes at optusnet.com.au
Mon Dec 11 19:31:18 EST 2006


I've assumed the platform is Windows, but if it is Linux then that opens up
other possibilities. The problem can be explained if the busy-wait thread
doesn't get descheduled, which is easy to test by changing the wait so that it
is no longer a busy-wait. Why it doesn't get descheduled is then the
interesting part. I suspect an OS scheduling quirk on multi-core, but I need
more information.
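
One way to run that test is sketched below. It is a minimal variant of the posted program, not the original code: the class name LatchTimingSketch and the method timeRunMillis are hypothetical, and a CountDownLatch is swapped in for the CyclicBarrier plus "finished" flag so that the main thread blocks instead of spinning. A fresh latch is created for every run, which also sidesteps the fact that the posted code never sets "finished" back to false after the first run.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical variant of the posted ThreadTest: the main thread blocks on a
// latch instead of busy-waiting on a volatile flag.
public class LatchTimingSketch {

    // Same work as the original Worker: increment a volatile counter, then
    // signal completion (here via the latch rather than a barrier).
    static final class Worker implements Runnable {
        private volatile int counter;
        private final int iterations;
        private final CountDownLatch done;

        Worker(int iterations, CountDownLatch done) {
            this.iterations = iterations;
            this.done = done;
        }

        public void run() {
            for (counter = 0; counter < iterations; counter++) {
                // empty body: the volatile increment is the work being timed
            }
            done.countDown();
        }
    }

    // Starts `threads` workers and blocks, without spinning, until all of
    // them have finished. Returns the elapsed wall-clock time in milliseconds.
    public static long timeRunMillis(int threads, int iterations)
            throws InterruptedException {
        CountDownLatch done = new CountDownLatch(threads); // fresh state per run
        long start = System.nanoTime();
        for (int i = 0; i < threads; i++) {
            new Thread(new Worker(iterations, done)).start();
        }
        done.await(); // main thread is parked here, not consuming a core
        return (System.nanoTime() - start) / 1000000L;
    }

    public static void main(String[] args) throws InterruptedException {
        int runs = 20;
        long total = 0;
        for (int i = 0; i < runs; i++) {
            long elapsed = timeRunMillis(1000, 1000000);
            total += elapsed;
            System.out.println("Run #" + i + " : elapsed time = " + elapsed + "ms");
        }
        System.out.println("Average time = " + (total / runs) + "ms");
    }
}
```

If the dual-core timings drop sharply with this variant, that points at the spinning main thread; if they stay roughly the same, the cause is more likely elsewhere (thread creation cost, for instance).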

Cheers,
David Holmes

> -----Original Message-----
> From: Boehm, Hans [mailto:hans.boehm at hp.com]
> Sent: Tuesday, 12 December 2006 10:14 AM
> To: dholmes at ieee.org; David Harrigan; concurrency-interest at cs.oswego.edu
> Subject: RE: [concurrency-interest] Multi-core testing, help with
> findings
>
>
> Somehow that doesn't look like the whole explanation to me.  If I read
> the code correctly, finished is only being touched once by another
> thread for each major iteration.  Thus it should only leave the L1 cache
> of the main thread once every 7 seconds.  It's unclear to me why the
> main thread should be touching the memory system significantly at all.
> It's also unclear to me why it should be scheduled all the time, instead
> of just being 1 of 1001 threads.
>
> Depending on the platform, might the thread creation cost just be a lot
> higher?  Or might you get several instances of the counter variable in
> the same cache line?  Neither of those sounds all that likely, either
> ...
>
> Hans
>
> > -----Original Message-----
> > From: concurrency-interest-bounces at cs.oswego.edu
> > [mailto:concurrency-interest-bounces at cs.oswego.edu] On Behalf
> > Of David Holmes
> > Sent: Monday, December 11, 2006 2:24 PM
> > To: David Harrigan; concurrency-interest at cs.oswego.edu
> > Subject: Re: [concurrency-interest] Multi-core testing, help
> > with findings
> >
> > David,
> >
> > You have a busy wait-loop which will try to consume
> > 1-CPU/CORE and continually bang on the "finished" variable,
> > doing nothing but interfere with the execution of the real
> > work due to memory/cache traffic. On a single processor
> > system your busy thread will get switched out after each
> > timeslice and get far less CPU time to interfere.
> >
> > So I think what you are seeing here is a scheduling artifact
> > of the OS.
> >
> > Cheers,
> > David Holmes
> >
> > > -----Original Message-----
> > > From: concurrency-interest-bounces at cs.oswego.edu
> > > > [mailto:concurrency-interest-bounces at cs.oswego.edu] On Behalf Of
> > > > David Harrigan
> > > Sent: Monday, 11 December 2006 10:32 PM
> > > To: concurrency-interest at cs.oswego.edu
> > > Subject: Re: [concurrency-interest] Multi-core testing, help with
> > > findings
> > >
> > >
> > >
> > > Hi,
> > >
> > > Oops, that should be after 20 runs on the Pentium-M...not 5!!
> > >
> > > Also, I'm using JDK 6 final - the one that was released today.
> > >
> > > -=david=-
> > >
> > >
> > > David Harrigan wrote:
> > > >
> > > > Hi All,
> > > >
> > > > > I've recently acquired a nice new shiny Core 2 Duo (2 x 2.0GHz) laptop
> > > > > and I thought I would try out a test of threading on it. So, I wrote a
> > > > > simple class (see below). However, my findings are curious, and I would
> > > > > like someone, if possible, to explain why it is slower on my multi-core
> > > > > system than on my older system, which was a Pentium-M @ 2.33GHz. Both
> > > > > machines, apart from the processor, are near enough identical: same
> > > > > disk speed, same type of memory (667MHz DDR2 2GB), etc.
> > > > >
> > > > > After 20 runs of my program on the Core 2 Duo, the average time was: 6975ms
> > > > > After 5 runs of my program on the Pentium-M, the average time was: 2735ms
> > > > >
> > > > > I suspect it's because, with two processors, they are both contending
> > > > > for main memory. Notice that I have the counter as volatile, which
> > > > > forces the variable to flush out to memory each time, since this is
> > > > > what I'm interested in testing: real-world stuff where things are
> > > > > synch'ed. (When it wasn't volatile, the change was dramatic: because
> > > > > the Core 2 Duo has 4MB of cache it was extremely fast, whereas the
> > > > > Pentium-M with only 1MB of cache was a lot slower.)
> > > >
> > > >
> > > >
> > > >
> > > > import java.util.concurrent.BrokenBarrierException;
> > > > import java.util.concurrent.CyclicBarrier;
> > > >
> > > > public class ThreadTest {
> > > >
> > > >     private static final int howMany = 1000;
> > > >     private static volatile boolean finished;
> > > > >     final CyclicBarrier barrier = new CyclicBarrier(howMany, new Runnable() {
> > > >         public void run() {
> > > >             finished = true;
> > > >         }
> > > >     });
> > > >
> > > >     public static void main(String[] args) {
> > > >         ThreadTest t = new ThreadTest();
> > > >         long total = 0;
> > > >         for(int i = 0 ; i < 20 ; i ++) {
> > > >             long elapsedTime = t.doIt();
> > > >             total += elapsedTime;
> > > > >             System.out.println("Run #" + i + " : elapsed time = " +
> > > > >                     elapsedTime + "ms");
> > > > >         }
> > > > >         System.out.println("Average time = " + (total / 20) + "ms");
> > > >     }
> > > >
> > > >     private long doIt() {
> > > >         long startTime = System.currentTimeMillis();
> > > >         for(int i = 0; i < howMany; i++) {
> > > >             new Thread(new Worker()).start();
> > > >         }
> > > >         while(!finished);
> > > >         long endTime = System.currentTimeMillis();
> > > >         return (endTime - startTime);
> > > >
> > > >     }
> > > >
> > > >     class Worker implements Runnable {
> > > >         volatile int counter;
> > > >         public void run() {
> > > >             for(counter = 0 ; counter < 1000000 ; counter++);
> > > >             try {
> > > >                 barrier.await();
> > > >             } catch(InterruptedException e) {
> > > >                 return;
> > > >             } catch(BrokenBarrierException e) {
> > > >                 return;
> > > >             }
> > > >         }
> > > >     }
> > > > }
> > > >
> > > >
> > > > -=david=-
> > > >
> > >
> > > --
> > > View this message in context:
> > > http://www.nabble.com/Multi-core-testing%2C-help-with-findings-tf2
> > 793302.html#a7793847
> > Sent from the JSR166 Concurrency mailing list archive at Nabble.com.
> >
> > _______________________________________________
> > Concurrency-interest mailing list
> > Concurrency-interest at altair.cs.oswego.edu
> > http://altair.cs.oswego.edu/mailman/listinfo/concurrency-interest
> >
> >
>


