[concurrency-interest] Timer notification drifts

David Holmes davidcholmes at aapt.net.au
Mon Jan 10 18:12:29 EST 2011


I believe you need root access to check the clocksource (unless default
perms are modified):

cat /sys/devices/system/clocksource/clocksource0/available_clocksource

to see what's available, and

cat /sys/devices/system/clocksource/clocksource0/current_clocksource

to see what is in use.
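
The same check can be done from inside the JVM. This is only a minimal sketch: it assumes the two sysfs paths above exist and are readable by the user running the JVM, and the class and method names are made up for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical helper (not from this thread): reads the Linux clocksource sysfs files.
class ClockSourceCheck {
    // Directory quoted in the message above.
    static final String BASE = "/sys/devices/system/clocksource/clocksource0/";

    // Returns the trimmed file contents, or empty if the file is missing or unreadable.
    static Optional<String> read(Path file) {
        try {
            byte[] bytes = Files.readAllBytes(file);
            return Optional.of(new String(bytes, StandardCharsets.UTF_8).trim());
        } catch (IOException | SecurityException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println("available: "
                + read(Paths.get(BASE, "available_clocksource")).orElse("(unreadable)"));
        System.out.println("current:   "
                + read(Paths.get(BASE, "current_clocksource")).orElse("(unreadable)"));
    }
}
```

Logging the current clocksource at application startup would also record what was in effect when a drift was observed.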

I was suggesting using nanoTime to measure the elapsed time, rather than
date/currentTimeMillis, as nanoTime should not be affected by time-of-day
adjustments. Which reminds me: check if ntp is running and if so turn it
off, if not turn it on, and see if that makes a difference.
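
To make the nanoTime suggestion concrete, here is a rough sketch that records both clocks at the start and reports how far they have diverged later. The class and method names are illustrative, and the two clock reads are not atomic, so a millisecond or so of jitter is expected.

```java
import java.util.concurrent.TimeUnit;

// Illustrative only: compares elapsed time as seen by System.nanoTime
// (not tied to time-of-day) and System.currentTimeMillis (wall clock).
class ElapsedClocks {
    private final long startNanos = System.nanoTime();
    private final long startMillis = System.currentTimeMillis();

    // Elapsed time according to nanoTime, converted to milliseconds.
    long monotonicElapsedMillis() {
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - startNanos);
    }

    // Elapsed time according to the wall clock; time-of-day adjustments show up here.
    long wallElapsedMillis() {
        return System.currentTimeMillis() - startMillis;
    }

    // Positive when the wall clock has advanced more than nanoTime has.
    long skewMillis() {
        return wallElapsedMillis() - monotonicElapsedMillis();
    }

    public static void main(String[] args) throws InterruptedException {
        ElapsedClocks clocks = new ElapsedClocks();
        Thread.sleep(500);
        System.out.println("nanoTime elapsed ms   = " + clocks.monotonicElapsedMillis());
        System.out.println("wall-clock elapsed ms = " + clocks.wallElapsedMillis());
        System.out.println("skew ms               = " + clocks.skewMillis());
    }
}
```

If the skew grows steadily over hours while the nanoTime figure matches the scheduled period, that points at the time-of-day clock rather than the timer.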

David

-----Original Message-----
From: Navin Jha [mailto:navin.jha at FXALL.com]
Sent: Tuesday, 11 January 2011 9:04 AM
To: dholmes at ieee.org
Cc: concurrency-interest at cs.oswego.edu
Subject: RE: [concurrency-interest] Timer notification drifts


David,

I tried LockSupport.park(..) and ScheduledThreadPoolExecutor with the same
results, and they use nanoTime. Are you suggesting using nanoTime in some
other way? I will try them again after the switch to HPET to see if there is
any difference. I have communicated your concern about the use of TSC on
multi-processor systems to our unix support group.

One more question: how do I make sure that the clock source is set up
correctly to HPET? Since it was done by the support group, I want to make
sure that they did it correctly.

-Navin

From: David Holmes [mailto:davidcholmes at aapt.net.au]
Sent: Monday, January 10, 2011 5:41 PM
To: Navin Jha
Cc: concurrency-interest at cs.oswego.edu
Subject: RE: [concurrency-interest] Timer notification drifts

Try using System.nanoTime to track things.

BTW unless you have CPUs with reliable TSC then you should never use TSC as
the clocksource on multi-processor systems. There are some OS specific
utilities from the chip vendors to fix TSC drift but I don't know what is
available for Linux. It's been a little while since I checked on the current
state of this.

David
-----Original Message-----
From: Navin Jha [mailto:navin.jha at FXALL.com]
Sent: Tuesday, 11 January 2011 8:33 AM
To: dholmes at ieee.org
Cc: concurrency-interest at cs.oswego.edu
Subject: RE: [concurrency-interest] Timer notification drifts
David,
TSC was used by default (according to the unix support folks, since I don't
have access to configs on those machines). They switched it to use HPET and I
got the same result. Below is the sample code I used to test. On one machine I
see a constant drift of about 15 milliseconds regardless of how long the
timer runs (I ran it for 1 hr, 2 hrs and 24 hrs). This machine has 2 cpus. On
the machine with 8 cpus the drift was close to 11 seconds for 24 hrs.

import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Timer;
import java.util.TimerTask;

public class TimerTest {

    public static void main(String[] args) {
        try {
            // Period in milliseconds; may be overridden by the first argument.
            long schedule = 500;
            if (args.length > 0)
                schedule = Long.parseLong(args[0]);
            System.out.println(" scheduling job\n");
            Timer eodTimer = new Timer(true);  // daemon timer thread
            SimpleDateFormat simpleDateFormat = new SimpleDateFormat("HH:mm:ss:S");
            System.out.println(" Current time = "
                    + simpleDateFormat.format(Calendar.getInstance().getTime()));
            eodTimer.scheduleAtFixedRate(
                    new TestTask(System.currentTimeMillis(), schedule, simpleDateFormat),
                    schedule, schedule);
            // Keep the main thread alive; the task exits the JVM on first run.
            Thread.sleep(Long.MAX_VALUE);
        } catch (Exception e) {
            System.out.println(e);
        }
    }

    private static class TestTask extends TimerTask {

        long startTime;
        long sch;
        SimpleDateFormat simpleDateFormat;

        TestTask(long startTime, long schedule, SimpleDateFormat simpleDateFormat) {
            this.startTime = startTime;
            sch = schedule;
            this.simpleDateFormat = simpleDateFormat;
        }

        public void run() {
            // Both timestamps come from the wall clock (currentTimeMillis).
            long now = System.currentTimeMillis();
            System.out.println(" End time     = "
                    + simpleDateFormat.format(Calendar.getInstance().getTime()) + "\n");
            System.out.println(" Actual diff  = " + (now - startTime)
                    + " milliseconds, expected diff = " + sch + " milliseconds");
            System.exit(1);
        }
    }
}

From: David Holmes [mailto:davidcholmes at aapt.net.au]
Sent: Monday, January 10, 2011 5:21 PM
To: Navin Jha
Cc: concurrency-interest at cs.oswego.edu
Subject: RE: [concurrency-interest] Timer notification drifts

So I take it that HPET is already being used.

The other possibility here is that it is not the timer that is drifting but
the time source that you are using to measure/track when the timer fires.
How are you tracking that?

David
-----Original Message-----
From: Navin Jha [mailto:navin.jha at FXALL.com]
Sent: Tuesday, 11 January 2011 8:11 AM
To: dholmes at ieee.org
Cc: concurrency-interest at cs.oswego.edu
Subject: RE: [concurrency-interest] Timer notification drifts
No luck :-(

Does having too many cpus affect this in any way?

The machine on which a sample test code works only has 2 cpus while the
machine on which we see a huge drift has 8 cpus.

From: David Holmes [mailto:davidcholmes at aapt.net.au]
Sent: Monday, January 10, 2011 3:46 PM
To: Navin Jha
Cc: concurrency-interest at cs.oswego.edu
Subject: RE: [concurrency-interest] Timer notification drifts

Check what clocksource the problematic system is using. If it is TSC then
switch to HPET.

These things are difficult to diagnose.

David Holmes
-----Original Message-----
From: concurrency-interest-bounces at cs.oswego.edu
[mailto:concurrency-interest-bounces at cs.oswego.edu]On Behalf Of Navin Jha
Sent: Tuesday, 11 January 2011 4:30 AM
To: Attila Szegedi
Cc: concurrency-interest at cs.oswego.edu
Subject: Re: [concurrency-interest] Timer notification drifts
From what I have learned so far, this is a common problem, but the
drift is too much for us since it affects our trading date rollover :-)

David Holmes has a nice blog on clocks
(http://blogs.sun.com/dholmes/entry/inside_the_hotspot_vm_clocks) and in his
blog he suggested to someone with a similar problem that he should post the
problem here.

From: Attila Szegedi [mailto:szegedia at gmail.com]
Sent: Monday, January 10, 2011 1:24 PM
To: Navin Jha
Cc: concurrency-interest at cs.oswego.edu
Subject: Re: [concurrency-interest] Timer notification drifts

I see - you weren't specific about which API call you use, so I wanted to
rule out the rookie mistake :-) Hm. A drift should definitely not occur with
scheduleAtFixedRate, as far as I can tell. At this point, this indeed starts
to sound like a topic relevant for this group :-) Although I can't help you
past this stage (not much Linux system expertise), I guess whoever wants to
look into this will want the JRE, Linux, and CPU versions of your system.

On Jan 10, 2011, at 10:09 AM, Navin Jha wrote:

This is exactly what we use. Sample code we tried works fine on some Linux
machines with a constant lag value (say 15 milliseconds) but fails on other
Linux machines. We are trying to find out if there is something about those
Linux machines that causes this. The machines on which this is happening
ironically have much better hardware (high-end multi-core Linux servers).

From: Attila Szegedi [mailto:szegedia at gmail.com]
Sent: Monday, January 10, 2011 1:03 PM
To: Navin Jha
Cc: concurrency-interest at cs.oswego.edu
Subject: Re: [concurrency-interest] Timer notification drifts

scheduleAtFixedRate should help:
<http://download.oracle.com/javase/1.4.2/docs/api/java/util/Timer.html#scheduleAtFixedRate(java.util.TimerTask,%20java.util.Date,%20long)>

On Jan 10, 2011, at 9:32 AM, Navin Jha wrote:

Hi,

Not sure if this is the right place to post this problem. We use the
java.util.Timer class for a notification that needs to happen every 24
hours. We noticed that on some Linux multi-core servers the notification
occurs almost 11 seconds late. If we run for successive shorter durations,
say 1 hour, 2 hours, 3 hours, we notice that the lag accumulates: for
1 hour it is 600 milliseconds, for 2 hours it is 1.2 seconds, etc.
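
For comparing these figures across runs of different length, it can help to express them as a rate. The sketch below is plain arithmetic on the numbers quoted in this message; the class and method names are illustrative only.

```java
// Illustrative arithmetic only: converts an observed lag into a rate in parts per million.
class DriftRate {
    // lagMillis: how late the notification was; elapsedMillis: how long the timer ran.
    static double ppm(double lagMillis, double elapsedMillis) {
        return lagMillis / elapsedMillis * 1_000_000.0;
    }

    public static void main(String[] args) {
        double oneHourMs = 60 * 60 * 1000.0;
        double oneDayMs = 24 * oneHourMs;
        // 600 ms over 1 hour and 11 s over 24 hours, as reported above.
        System.out.printf("600 ms per hour  = %.1f ppm%n", ppm(600, oneHourMs));
        System.out.printf("11 s per 24 hrs  = %.1f ppm%n", ppm(11_000, oneDayMs));
        // What 600 ms per hour would add up to over a full day, in seconds.
        System.out.printf("600 ms/h x 24 h  = %.1f s%n", 600 * 24 / 1000.0);
    }
}
```

The hourly figure extrapolates to 14.4 seconds per day, somewhat more than the roughly 11 seconds actually seen over 24 hours, so the rate does not look perfectly constant.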

The only solution we can think of right now is to run the timer for a shorter
duration and restart it after that duration.
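
One way the restart idea might look is a one-shot task that recomputes its delay from the wall clock before every run, so an error in one period is not carried into the next. This is a sketch under assumptions, not a tested fix for the drift itself: it assumes Java 8 or later (java.time), ignores time zones and DST, and all names are hypothetical.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: re-anchor a daily job to the wall clock on every cycle.
class RolloverDelay {
    // Time from 'now' until the next occurrence of 'target', strictly in the future.
    static Duration untilNext(LocalDateTime now, LocalTime target) {
        LocalDateTime next = now.toLocalDate().atTime(target);
        if (!next.isAfter(now)) {
            next = next.plusDays(1);
        }
        return Duration.between(now, next);
    }

    // Schedules 'job' once for the next 'target' time, then reschedules itself after it runs.
    static void scheduleNext(ScheduledExecutorService ses, LocalTime target, Runnable job) {
        long delayMs = untilNext(LocalDateTime.now(), target).toMillis();
        ses.schedule(() -> {
            job.run();
            scheduleNext(ses, target, job);  // fresh delay computed from the wall clock
        }, delayMs, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) {
        Duration d = untilNext(LocalDateTime.now(), LocalTime.MIDNIGHT);
        System.out.println("Delay until next midnight: " + d);
    }
}
```

A caller would pass in something like Executors.newSingleThreadScheduledExecutor(). Note that if a task fires slightly before the target according to the wall clock, this sketch would schedule a second run moments later; a real version would need a guard for that.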

Is there a solution/workaround for this problem?

Regards,
Navin



