[concurrency-interest] "if (!cond) wait()" is considered harmful in Java

Szabolcs Ferenczi szabolcs.ferenczi at gmail.com
Sat May 12 17:08:32 EDT 2007


I am discussing the "if (!cond) wait)" problem in Java off list and I
came to some interesting conclusion that I am going to share with the
concurrency-interest list.

As all of us know "if (!cond) wait)" is considered harmful in Java.
The reason why is not really known. Well, the reason is the semantics
of the "wait()" operation. One problem with it is the spurious wakeup.

Normally, as it is defined for a correct monitor, the wait should
delay the process until the signal comes for that condition:

  wait() := delay until (signal)

Consequently, when the wait call terminates, the condition is
established for the progress, hence the signal.

  if (!cond) wait();
  assertTrue(cond);

Unfortunately, in Java the semantics of the "wait()" is different:

  wait() := delay until (signal OR spurious wakeup)

It follows that when the wait call terminates, either the condition is
established for the progress or there was a spurious wakeup. It must
be checked. An obvious check is to check the condition again: if the
condition still holds, it was a spurious wakeup, so the process must
be delayed for some more time.

  while (!cond) wait();
  assertTrue(cond);

The "while" is there because of the spurious wakeups and the "(!cond)"
checks for the other cause, i.e. if the valid signal arrived.

Practically, any Java algorithm should stay valid if you replace the
"wait()" with a timed out version of the wait with the minimal timeout
value, e.g. "wait(1)". Just check it:

  if (!cond) wait(1);
  assertTrue(cond);

No, it will not pass the test. But if while is used, it passes:

  while (!cond) wait(1);
  assertTrue(cond);

Yes, it is ok.

Now, the timed wait has one more factor for termination. Normally, it
may terminate for either of the two valid reasons:

  wait(t) := delay until (signal OR timeout signal)

In Java, it is more complicated again:

  wait(t) := delay until (signal OR timeout signal OR spurious wakeup)

Let us assume we are implementing the timeout version of a put
operation of a bounded buffer. With the correct wait semantics it
could look as simple as this:

1 boolean synchronized put(Object item, long timeout) {
2   if (full) { wait(timeout); }
3   if (full) { return false; }
4   insert(item);
5   return true;
6 }

Note that when the "wait(timeout)" returns, one of the conditions holds:

(1) either signal arrived due to a "get()" operation
(2) or timeout signal arrived

The selection of the two is handled by the second "if (full)" clause
at line 3. It selects between the two events that might have happened.

Now, with the Java semantics, the situation is more complex because
instead of for two reasons, the wait might terminate for three
reasons, i.e. the spurious wakeup is added to the reasons. First of
all we must replace the first if statements at line 2 with a while to
prepare for the spurious wakeups and extend the condition with the
timing wait tag.

2  while (full || <timed>) { wait(timeout); }

Then the effect of the spurious wakeup on the timeout value must be
compensated. We have to maintain the remaining time in the argument of
the repeated wait just because the spurious wakeup:

2a  long timeToGo = timeout;
2b  while (full || <timed>) { wait(timeToGo); timeToGo = <remaining time>;}

To maintain it, we have to calculate the remaining time and extend the
expression of the while statement:

2a  final long limit = currentTimeMillis() + timeout;
2b  long timeToGo = limit - currentTimeMillis();
2c  while (full || timeToGo > 0) {
2d    wait(timeToGo);
2e    timeToGo = limit - currentTimeMillis();
2f  }

Now we have three elements in the code for the three possible causes
for the wait to terminate:

1. while loop for the spurious return
2. check on full for the valid signal on the condition variable
3. timeToGo > 0 check for the timeout signal

The method looks like this now:

1  boolean synchronized put(Object item, long timeout) {
2a   final long limit = currentTimeMillis() + timeout;
2b   long timeToGo = limit - currentTimeMillis();
2c   while (full && timeToGo > 0) {
2d     wait(timeToGo);
2e     timeToGo = limit - currentTimeMillis();
2f   }
3    if (full) { return false; }
4    insert(item);
5    return true;
6  }

Six additional lines were needed to cope with the extra "feature" of
the Java wait, namely, that it can return spuriously. This means less
efficiency, more complexity and more chance for programming errors. Of
course there are several optimisation possibilities on this piece of
code, such as:

1. The check for the timeout termination (line 3) can be moved inside
the while loop
2. The calls to the currentTimeMillis() can be minimised

Anyhow, the example shows the difference in complexity that had to be
introduced just because the "wait()" is not implemented properly in
Java. This indicates that a little bug fix in the implementation of
the "wait()" operation could help considerably without having any
backwards compatibility problem.

Without the bug fix of the "wait()", however, the "if (!cond) wait()"
is considered harmful in Java and should be flagged with an error by
the compiler.

Best Regards,
Szabolcs


More information about the Concurrency-interest mailing list