[concurrency-interest] deadline and long overflow

Justin Sampson jsampson at guidewire.com
Wed Apr 20 15:26:51 EDT 2016


Peter, I had exactly the same funny feeling as you when I first saw
this idiom. A few years back, Martin changed some of my timing code
in Guava's Monitor class from something like this:

  long timeoutNanos = unit.toNanos(time);
  long startNanos = System.nanoTime();
  ...
  long remainingNanos =
      timeoutNanos - (System.nanoTime() - startNanos);

to something like this:

  long timeoutNanos = unit.toNanos(time);
  long deadline = System.nanoTime() + timeoutNanos;
  ...
  long remainingNanos = deadline - System.nanoTime();

At first I thought that this would introduce some kind of overflow
error with MAX_VALUE, but I wrote a unit test for it and it passed!
Only then did I realize that these two constructs are exactly
identical in runtime behavior due to the arithmetic wraparound of
long values. Any overflow bugs triggered by one would have been
triggered by the other just as well.

(For anyone who thinks MAX_VALUE isn't interesting because it's a
centuries-long timeout, consider someone passing MAX_VALUE into an
API with the intention of "actually, don't timeout this call." If
there were an arithmetic overflow bug that caused it to, say,
timeout immediately instead of never, that would be a very
noticeable and reproducible issue.)

Now, that's not quite the end of the story. It turns out that
there's no overflow bug for MAX_VALUE as long as nanoTime() actually
progresses forward, but there IS an overflow bug for MIN_VALUE, and
there IS an overflow bug for MAX_VALUE if nanoTime() ever goes
backward.

For the MIN_VALUE case, consider for simplicity that the first call
to nanoTime() returns 0 and the second call returns 1. Then with
either of the constructs above, remainingNanos ends up equal to
MIN_VALUE-1, which is actually MAX_VALUE. Without additional guard
clauses, that could result in blocking forever rather than not at
all, which is what MIN_VALUE should mean (as with any non-positive
timeout). The fix is simply to check for time <= 0 before doing any
of these calculations, but this bug has actually slipped into the
some parts of the JDK before -- and all my testing and agonizing
over Martin's change in Guava helped find and fix one or two such
cases in the JDK! :)

For the MAX_VALUE case, there's only an overflow bug if nanoTime()
goes backward, causing some negative value to be subtracted from
MAX_VALUE in the code above. That should never happen in HotSpot,
but there have been occasional bugs and experimental "optimizations"
in other JVMs that allow it to be seen. JDK code is NOT written to
be robust to that possibility, which would slightly complicate the
code everywhere that a timeout is checked.

There is ONE remaining overflow situation that we really can't do
anything about: When the elapsed time between two calls to
nanoTime() actually exceeds 2^63-1. I'm personally willing to ignore
that one!

Cheers,
Justin



More information about the Concurrency-interest mailing list