[concurrency-interest] Timer notification drifts

Navin Jha navin.jha at FXALL.com
Tue Jan 18 16:46:10 EST 2011


Why is that? It seems to work fine.

-----Original Message-----
From: David Holmes [mailto:davidcholmes at aapt.net.au]
Sent: Tuesday, January 18, 2011 4:44 PM
To: Navin Jha
Cc: concurrency-interest at cs.oswego.edu
Subject: RE: [concurrency-interest] Timer notification drifts

> The machine is using jiffies.

Strewth! That's not what you want! Get it changed to hpet.

David

> -----Original Message-----
> From: David Holmes [mailto:davidcholmes at aapt.net.au]
> Sent: Saturday, January 15, 2011 4:50 AM
> To: Navin Jha
> Cc: concurrency-interest at cs.oswego.edu
> Subject: RE: [concurrency-interest] Timer notification drifts
>
> Navin Jha writes:
> > We upgraded one of our boxes to 5.4 linux as you suggested and
> > that did it!
> >
> > We didn't even have to change any settings.
>
> Out of interest what clocksource is the updated system using?
>
> David
>
>
> > Thank you so much!
> >
> > Regards,
> > Navin
> >
> >
> > -----Original Message-----
> > From: David Holmes [mailto:davidcholmes at aapt.net.au]
> > Sent: Tuesday, January 11, 2011 6:04 PM
> > To: Navin Jha
> > Cc: concurrency-interest at cs.oswego.edu
> > Subject: RE: [concurrency-interest] Timer notification drifts
> >
> > > We don't see these files, is there another way to check this?
> >
> > As Mark says your kernel may be too old to expose these.
> >
> > > Also, this is the log that might tell you more:
> > > [11:16:23][root at numpintapp15:~]$ grep -i tsc /var/log/messages
> > > Jan 10 16:51:18 numpintapp15 checking TSC synchronization across
> > > 12 CPUs: passed.
> >
> > I'm very skeptical that an old kernel will be able to synchronize
> > the TSC across 12 cores. Linux abandoned use of the TSC as the
> > primary timesource for MP systems.
> >
> > http://lwn.net/Articles/209101/
> >
> > But maybe things have progressed since then. I know Solaris went to
> > great lengths to do the TSC synchronization, and while it seems
> > reasonable today there were a few bumps along the road.
> >
> > > We've tried to specify the clocksource in the
> > > /boot/grub/grub.conf file.  We've been trying various
> > > clocksources. Would you let us know how to set the appropriate
> > > clocksource? Below is the grub.conf file.
> >
> > I'm not a linux expert but it appears that clock=hpet is the
> > appropriate boot option and should work. These log entries indicate
> > tsc is used:
> >
> > Jan 10 18:44:16 numpintapp15 Bootdata ok (command line is ro root=LABEL=/
> > hda=ide-scsi hpet=disable pnpacpi=off clock=tsc)
> > Jan 10 18:44:16 numpintapp15 Kernel command line: ro root=LABEL=/
> > hda=ide-scsi hpet=disable pnpacpi=off clock=tsc console=tty0
> >
> > David
> > ------
> >
> > > [11:17:43][root at numpintapp15:~]$ cat /boot/grub/grub.conf
> > > # grub.conf generated by anaconda
> > > #
> > > # Note that you do not have to rerun grub after making changes to this file
> > > # NOTICE:  You have a /boot partition.  This means that
> > > #          all kernel and initrd paths are relative to /boot/, eg.
> > > #          root (hd0,0)
> > > #          kernel /vmlinuz-version ro root=/dev/cciss/c0d0p6
> > > #          initrd /initrd-version.img
> > > #boot=/dev/cciss/c0d0
> > > default=1
> > > timeout=5
> > > splashimage=(hd0,0)/grub/splash.xpm.gz
> > > hiddenmenu
> > > title Red Hat Enterprise Linux AS (2.6.9-78.0.8.EL)
> > >         root (hd0,0)
> > >         kernel /vmlinuz-2.6.9-78.0.8.EL ro root=LABEL=/ hda=ide-scsi clock=hpet clocksource=hpet
> > >         initrd /initrd-2.6.9-78.0.8.EL.img
> > > title Red Hat Enterprise Linux AS (2.6.9-78.ELlargesmp)
> > >         root (hd0,0)
> > >         kernel /vmlinuz-2.6.9-78.ELlargesmp ro root=LABEL=/ hda=ide-scsi hpet=disable pnpacpi=off clock=tsc
> > >         initrd /initrd-2.6.9-78.ELlargesmp.img
> > > title Red Hat Enterprise Linux AS-up (2.6.9-78.EL)
> > >         root (hd0,0)
> > >         kernel /vmlinuz-2.6.9-78.EL ro root=LABEL=/ hda=ide-scsi clock=hpet clocksource=hpet
> > >         initrd /initrd-2.6.9-78.EL.img
> > >
> > >
> > >
> > >
> > > -----Original Message-----
> > > From: David Holmes [mailto:davidcholmes at aapt.net.au]
> > > Sent: Monday, January 10, 2011 6:12 PM
> > > To: Navin Jha
> > > Cc: concurrency-interest at cs.oswego.edu
> > > Subject: RE: [concurrency-interest] Timer notification drifts
> > >
> > > I believe you need root access to check the clocksource (unless
> > > default perms are modified):
> > >
> > > cat /sys/devices/system/clocksource/clocksource0/available_clocksource
> > >
> > > to see what's available, and
> > >
> > > cat /sys/devices/system/clocksource/clocksource0/current_clocksource
> > >
> > > to see what is in use.
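
A minimal Java sketch of the same check, for cases where it is handier to
query the clocksource from inside the JVM than from a shell. The class name
ClocksourceCheck and its read helper are illustrative assumptions; only the
two sysfs paths come from the message above. On kernels that do not expose
these files the sketch simply reports that they are not available.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical helper (not from the thread): prints the kernel clocksource.
class ClocksourceCheck {

    // sysfs directory quoted in the message above.
    static final Path SYSFS_DIR =
            Paths.get("/sys/devices/system/clocksource/clocksource0");

    // Returns the trimmed file contents, or empty if the file is missing
    // or unreadable (for example, insufficient permissions or an old kernel).
    static Optional<String> read(Path file) {
        try {
            byte[] bytes = Files.readAllBytes(file);
            return Optional.of(new String(bytes, StandardCharsets.UTF_8).trim());
        } catch (IOException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println("available: "
                + read(SYSFS_DIR.resolve("available_clocksource")).orElse("(not exposed)"));
        System.out.println("current:   "
                + read(SYSFS_DIR.resolve("current_clocksource")).orElse("(not exposed)"));
    }
}
```

Returning an empty Optional instead of throwing keeps the missing-file case
an ordinary result, which seems the likelier situation on older kernels.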
> > >
> > > I was suggesting using nanoTime to measure the elapsed time, rather
> > > than date/currentTimeMillis, as nanoTime should not be affected by
> > > time-of-day adjustments. Which reminds me: check if ntp is running
> > > and if so turn it off, if not turn it on, and see if that makes a
> > > difference.
> > >
> > > David
> > >
> > > -----Original Message-----
> > > From: Navin Jha [mailto:navin.jha at FXALL.com]
> > > Sent: Tuesday, 11 January 2011 9:04 AM
> > > To: dholmes at ieee.org
> > > Cc: concurrency-interest at cs.oswego.edu
> > > Subject: RE: [concurrency-interest] Timer notification drifts
> > >
> > >
> > > David,
> > >
> > > I tried LockSupport.park(..) and ScheduledThreadPoolExecutor with the
> > > same results, and they use nano time. Are you suggesting using nano
> > > time in any other way? I will try them again after the switch to HPET
> > > to see if there is any difference. I have communicated your concern
> > > about the use of TSC on multi-processor systems to our unix support
> > > group.
> > >
> > > One more question: how do I make sure that the clock source is set up
> > > correctly to HPET? Since it was done by the support group, I want to
> > > make sure that they did it correctly.
> > >
> > > -Navin
> > >
> > > From: David Holmes [mailto:davidcholmes at aapt.net.au]
> > > Sent: Monday, January 10, 2011 5:41 PM
> > > To: Navin Jha
> > > Cc: concurrency-interest at cs.oswego.edu
> > > Subject: RE: [concurrency-interest] Timer notification drifts
> > >
> > > Try using System.nanoTime to track things.
> > >
> > > BTW unless you have CPUs with reliable TSC then you should never use
> > > TSC as the clocksource on multi-processor systems. There are some
> > > OS-specific utilities from the chip vendors to fix TSC drift but I
> > > don't know what is available for Linux. It's been a little while
> > > since I checked on the current state of this.
> > >
> > > David
> > > -----Original Message-----
> > > From: Navin Jha [mailto:navin.jha at FXALL.com]
> > > Sent: Tuesday, 11 January 2011 8:33 AM
> > > To: dholmes at ieee.org
> > > Cc: concurrency-interest at cs.oswego.edu
> > > Subject: RE: [concurrency-interest] Timer notification drifts
> > > David,
> > > TSC was used by default (according to unix support folks, since I
> > > don't have access to configs on those machines); they switched it to
> > > use HPET and I got the same result. Below is the sample code I used
> > > to test. On one machine I see a constant drift of about 15
> > > milliseconds regardless of how long the timer runs (I ran it for
> > > 1 hr, 2 hrs and 24 hrs). This machine has 2 cpus. On the machine with
> > > 8 cpus the drift was close to 11 seconds for 24 hrs.
> > >
> > > import java.text.SimpleDateFormat;
> > > import java.util.Calendar;
> > > import java.util.Timer;
> > > import java.util.TimerTask;
> > >
> > > public class TimerTest {
> > >
> > >          public static void main(String[] args) {
> > >
> > >                try {
> > >                   long schedule=500;
> > >                   if (args.length > 0)
> > >                         schedule=Long.parseLong(args[0]);
> > >                   System.out.println( " scheduling job\n");
> > >                   Timer eodTimer = new Timer(true);
> > >                   SimpleDateFormat simpleDateFormat =
> > >                         new SimpleDateFormat("HH:mm:ss:S");
> > >                   System.out.println(" Current time = " +
> > >                         simpleDateFormat.format(Calendar.getInstance().getTime()));
> > >                   eodTimer.scheduleAtFixedRate(
> > >                         new TestTask(System.currentTimeMillis(), schedule, simpleDateFormat),
> > >                         schedule, schedule);
> > >                   Thread.sleep(Long.MAX_VALUE);
> > >             } catch (Exception e) {
> > >               System.out.println(e);
> > >             }
> > >
> > >       }
> > >
> > >          private static class TestTask extends TimerTask {
> > >
> > >           long startTime;
> > >           long sch;
> > >           SimpleDateFormat simpleDateFormat;
> > >           TestTask (long startTime,long schedule, SimpleDateFormat
> > > simpleDateFormat)
> > >           {
> > >             this.startTime=startTime;
> > >             sch=schedule;
> > >             this.simpleDateFormat = simpleDateFormat;
> > >           }
> > >             public void run() {
> > >                   long now = System.currentTimeMillis();
> > >                   System.out.println(" End time     = " +
> > >                         simpleDateFormat.format(Calendar.getInstance().getTime()) + "\n");
> > >                   System.out.println(" Actual diff  = " + (now - startTime) +
> > >                         " milliseconds, expected diff = " + sch + " milliseconds");
> > >                   System.exit(1);
> > >
> > >             }
> > >          }
> > > }
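
A variant of the test above that follows the System.nanoTime suggestion made
elsewhere in this thread, sketched under a few assumptions: the class name
NanoTimerTest, the method measureDriftMillis and the choice of
ScheduledExecutorService are illustrative, not the original poster's code.
It runs a fixed-rate task a given number of times and reports how far the
elapsed time, measured with nanoTime, exceeds the expected total.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical test class: measures fixed-rate scheduling with System.nanoTime.
class NanoTimerTest {

    // Waits for `ticks` fixed-rate executions and returns
    // (elapsed time by nanoTime) - (periodMillis * ticks), in milliseconds.
    static long measureDriftMillis(final long periodMillis, final int ticks)
            throws InterruptedException {
        final ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor();
        final CountDownLatch done = new CountDownLatch(ticks);
        final long start = System.nanoTime();
        scheduler.scheduleAtFixedRate(new Runnable() {
            public void run() {
                done.countDown();
            }
        }, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
        try {
            done.await();
            long elapsedNanos = System.nanoTime() - start;
            return TimeUnit.NANOSECONDS.toMillis(elapsedNanos) - periodMillis * ticks;
        } finally {
            scheduler.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        long period = args.length > 0 ? Long.parseLong(args[0]) : 500;
        int ticks = args.length > 1 ? Integer.parseInt(args[1]) : 10;
        long drift = measureDriftMillis(period, ticks);
        System.out.println("Drift after " + ticks + " ticks of " + period
                + " ms = " + drift + " ms");
    }
}
```

Because the executor also schedules against nanoTime, this figure mostly
reflects scheduling latency. Comparing it with the currentTimeMillis figure
from the test above may help show whether the timer or the time-of-day clock
is the one drifting.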
> > >
> > > From: David Holmes [mailto:davidcholmes at aapt.net.au]
> > > Sent: Monday, January 10, 2011 5:21 PM
> > > To: Navin Jha
> > > Cc: concurrency-interest at cs.oswego.edu
> > > Subject: RE: [concurrency-interest] Timer notification drifts
> > >
> > > So I take it that HPET is already being used.
> > >
> > > The other possibility here is that it is not the timer that is
> > > drifting but the time source that you are using to measure/track
> > > when the timer fires. How are you tracking that?
> > >
> > > David
> > > -----Original Message-----
> > > From: Navin Jha [mailto:navin.jha at FXALL.com]
> > > Sent: Tuesday, 11 January 2011 8:11 AM
> > > To: dholmes at ieee.org
> > > Cc: concurrency-interest at cs.oswego.edu
> > > Subject: RE: [concurrency-interest] Timer notification drifts
> > > No luck :-(
> > >
> > > Does having too many cpus affect this in any way?
> > >
> > > The machine on which the sample test code works has only 2 cpus,
> > > while the machine on which we see a huge drift has 8 cpus.
> > >
> > > From: David Holmes [mailto:davidcholmes at aapt.net.au]
> > > Sent: Monday, January 10, 2011 3:46 PM
> > > To: Navin Jha
> > > Cc: concurrency-interest at cs.oswego.edu
> > > Subject: RE: [concurrency-interest] Timer notification drifts
> > >
> > > Check what clocksource the problematic system is using. If it is TSC
> > > then switch to HPET.
> > >
> > > These things are difficult to diagnose.
> > >
> > > David Holmes
> > > -----Original Message-----
> > > From: concurrency-interest-bounces at cs.oswego.edu
> > > [mailto:concurrency-interest-bounces at cs.oswego.edu]On Behalf Of
> > Navin Jha
> > > Sent: Tuesday, 11 January 2011 4:30 AM
> > > To: Attila Szegedi
> > > Cc: concurrency-interest at cs.oswego.edu
> > > Subject: Re: [concurrency-interest] Timer notification drifts
> > > From what I have learned so far, this is a common problem, but the
> > > drift is too much for us since it affects our trading date rollover :-)
> > >
> > > David Holmes has a nice blog post on clocks
> > > (http://blogs.sun.com/dholmes/entry/inside_the_hotspot_vm_clocks), and
> > > in it he suggested to someone with a similar problem that he should
> > > post the problem here.
> > >
> > > From: Attila Szegedi [mailto:szegedia at gmail.com]
> > > Sent: Monday, January 10, 2011 1:24 PM
> > > To: Navin Jha
> > > Cc: concurrency-interest at cs.oswego.edu
> > > Subject: Re: [concurrency-interest] Timer notification drifts
> > >
> > > I see - you weren't specific about which API call you use, so I
> > > wanted to root out the rookie mistake :-) Hm. A drift should
> > > definitely not occur with scheduleAtFixedRate, as far as I can tell.
> > > At this point, this indeed starts to sound like a topic relevant for
> > > this group :-) Although I can't help you past this stage (not much
> > > Linux system expertise), I guess whoever wants to look into this will
> > > want the JRE, Linux, and CPU versions of your system.
> > >
> > > On Jan 10, 2011, at 10:09 AM, Navin Jha wrote:
> > >
> > > This is exactly what we use. Sample code we tried works fine on some
> > > linux machines with a constant lag value (say 15 milliseconds) but
> > > fails on other linux machines. We are trying to find out if there is
> > > something about those linux machines that causes this. The machines
> > > on which this is happening ironically have much better hardware
> > > (high-end multi-core linux servers).
> > >
> > > From: Attila Szegedi [mailto:szegedia at gmail.com]
> > > Sent: Monday, January 10, 2011 1:03 PM
> > > To: Navin Jha
> > > Cc: concurrency-interest at cs.oswego.edu
> > > Subject: Re: [concurrency-interest] Timer notification drifts
> > >
> > > scheduleAtFixedRate should help:
> > > <http://download.oracle.com/javase/1.4.2/docs/api/java/util/Timer.html#scheduleAtFixedRate(java.util.TimerTask,%20java.util.Date,%20long)>
> >
> > On Jan 10, 2011, at 9:32 AM, Navin Jha wrote:
> >
> > Hi,
> >
> > Not sure if this is the right place to post this problem. We use the
> > java.util.Timer class for a notification that needs to happen every 24
> > hours. We noticed that on some linux multi-core servers the notification
> > occurs almost 11 seconds late. If we run for successive smaller
> > durations, say 1 hour, 2 hours, 3 hours, we notice that the lag does
> > accumulate. So for 1 hour it is 600 milliseconds, for 2 hours it is
> > 1.2 seconds, etc.
> >
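
The figures above can be turned into a drift rate with a little arithmetic.
The sketch below is only an illustration: the class DriftRate and its methods
are hypothetical names, and it assumes the lag grows linearly with elapsed
time, as the 1-hour and 2-hour figures suggest.

```java
// Hypothetical helper: converts an observed lag into a drift rate and back.
class DriftRate {

    // Drift in parts per million: lag divided by elapsed time, times 1e6.
    static double ppm(double lagMillis, double elapsedMillis) {
        return lagMillis / elapsedMillis * 1_000_000.0;
    }

    // Lag expected after elapsedMillis at a constant drift rate in ppm.
    static double projectMillis(double ppm, double elapsedMillis) {
        return ppm * elapsedMillis / 1_000_000.0;
    }

    public static void main(String[] args) {
        double hourMillis = 60 * 60 * 1000.0;
        double dayMillis = 24 * hourMillis;

        // Reported short runs: 600 ms of lag after 1 hour.
        double hourlyRate = ppm(600, hourMillis);
        // Reported long run: almost 11 seconds of lag after 24 hours.
        double dailyRate = ppm(11_000, dayMillis);

        System.out.println("rate from 1 h run:  " + hourlyRate + " ppm");
        System.out.println("rate from 24 h run: " + dailyRate + " ppm");
        System.out.println("24 h lag projected from 1 h run: "
                + projectMillis(hourlyRate, dayMillis) + " ms");
    }
}
```

On these numbers the 1-hour run works out to roughly 167 ppm, which would
project to about 14.4 seconds over 24 hours, somewhat more than the reported
11 seconds (about 127 ppm).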
> > The only solution we can think of right now is to run the timer for a
> > smaller duration and restart it after that duration.
> >
> > Is there a solution/workaround for this problem?
> >
> > Regards,
> > Navin
> >
> >
> >
> >
>
>
>