[concurrency-interest] AtomicXXX.lazySet and happens-before reasoning

Boehm, Hans hans.boehm at hp.com
Sun Oct 9 22:46:44 EDT 2011


I agree with most of this.  However, the reason for the StoreLoad fence following a volatile store is typically to prevent reordering of the store with a subsequent volatile load of a DIFFERENT volatile variable.  For example, consider the standard Dekker's algorithm example, where everything is initially zero:

Thread 1:
x = 1;
r1 = y;

Thread 2:
y = 1;
r2 = x;

If x and y are volatile, r1 = r2 = 0 is not allowed.  This means that the two accesses in each thread may not be reordered.  My understanding is that if lazySet is used here, r1 = r2 = 0 is allowed, and hence the trailing StoreLoad fence is not required.
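
A minimal Java sketch of the example above, assuming x and y are AtomicInteger fields. The class name DekkerDemo and the method countBothZero are made up for illustration. With set() (a volatile store) the outcome r1 = r2 = 0 can never be counted; with lazySet() it is permitted, though this simple harness overlaps the two threads poorly and may well never observe it.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical harness; names are illustrative, not from the thread.
final class DekkerDemo {

    /** Runs the two-thread test `rounds` times; returns how often r1 == 0 && r2 == 0. */
    static int countBothZero(int rounds, boolean lazy) {
        int bothZero = 0;
        for (int i = 0; i < rounds; i++) {
            final AtomicInteger x = new AtomicInteger(0);
            final AtomicInteger y = new AtomicInteger(0);
            final int[] r = new int[2];          // r[0] is r1, r[1] is r2

            Thread t1 = new Thread(() -> {
                if (lazy) x.lazySet(1); else x.set(1);
                r[0] = y.get();                  // volatile load of the OTHER variable
            });
            Thread t2 = new Thread(() -> {
                if (lazy) y.lazySet(1); else y.set(1);
                r[1] = x.get();
            });
            t1.start();
            t2.start();
            try {
                t1.join();                       // join() makes the writes to r visible here
                t2.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            if (r[0] == 0 && r[1] == 0) {
                bothZero++;
            }
        }
        return bothZero;
    }

    public static void main(String[] args) {
        System.out.println("set():     both zero " + countBothZero(2000, false) + " times");
        System.out.println("lazySet(): both zero " + countBothZero(2000, true) + " times");
    }
}
```

The count for set() must be zero on any conforming JVM. A nonzero count for lazySet() would be legal, but its absence in a given run proves nothing.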

A lot of the details here are architecture specific.  In particular, the PowerPC story is quite different and more complicated, since PowerPC doesn't by default guarantee that stores become visible on all other processors at the same time.  Fences are also required to effectively obtain that guarantee.  The difference between lazySet and a volatile store is probably larger on x86 than on most other architectures.

Hans

From: Vitaly Davidovich [mailto:vitalyd at gmail.com]
Sent: Sunday, October 09, 2011 3:53 PM
To: Ruslan Cheremin
Cc: Boehm, Hans; concurrency-interest at cs.oswego.edu
Subject: Re: AtomicXXX.lazySet and happens-before reasoning

> That is not clear to me -- doesn't a strict volatile store prevent any reordering with subsequent memory actions? Hans, you've just presented an example of such an allowed reordering.

A volatile store does not prevent subsequent non-volatile stores or loads from reordering with it.  A StoreLoad is placed after the volatile store only if it is followed by a volatile load of the same memory location that was stored (this is to avoid the load being satisfied out of the processor's write buffer before the prior store is made globally visible).

Based on this discussion, it sounds like lazySet does not prevent subsequent instructions from reordering with it.  Specifically, if a lazySet is followed by a volatile load, no StoreLoad is issued in between, and the processor can fetch the data from the write buffer before other processors see the write.  This is the reason that lazySet will be much cheaper than a volatile write: it doesn't have to issue a fence instruction here and wait for the store buffer to drain to the cache.  In my own observation, lazySet translates to a regular MOV instruction on x86-64, and a volatile write issues a locked add instruction after the write (or mfence on older HotSpot versions), which obtains the required serialization.


On Sun, Oct 9, 2011 at 6:03 PM, Ruslan Cheremin <cheremin at gmail.com> wrote:

The farther, the better :)

Yes, from the atomic package javadoc it is clear that lazySet is preceded not only by a StoreStore barrier but also by a LoadStore barrier.

Interestingly, my acquaintance with lazySet started from an informal definition like "non-volatile write to volatile variable with few additional ordering constraints", but now it is "has the memory effects of writing (assigning) a volatile variable except..."


That is not clear to me -- doesn't a strict volatile store prevent any reordering with subsequent memory actions? Hans, you've just presented an example of such an allowed reordering.

It seems to be somehow related to the StoreLoad fence that an ordinary volatile store must have after it (although I still do not understand why), but that lazySet omits?


The specification states

    lazySet has the memory effects of writing (assigning) a volatile variable except that it permits reorderings with subsequent (but not previous) memory actions that do not themselves impose reordering constraints with ordinary non-volatile writes. Among other usage contexts, lazySet may apply when nulling out, for the sake of garbage collection, a reference that is never accessed again.

in the java.util.concurrent description, which implies that it may not be reordered with previous "memory actions", not just stores.  Doug can comment more authoritatively on the intent, but that specification seems fairly unambiguous in this particular respect.
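
As a hedged illustration of the "nulling out" usage mentioned in that specification text, here is a small sketch. The class OneShotSlot and its methods are invented for this example; it assumes a single consumer that calls consume() at most once and never reads the reference again.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical one-shot holder; the name and API are illustrative only.
final class OneShotSlot<T> {
    private final AtomicReference<T> ref;

    OneShotSlot(T value) {
        ref = new AtomicReference<>(value);
    }

    /** Assumed to be called at most once, by the single consumer. */
    T consume() {
        T value = ref.get();
        // Nothing reads this reference afterwards, so a full volatile store
        // (with its trailing StoreLoad fence) buys nothing; lazySet is enough
        // to let the referent be collected.
        ref.lazySet(null);
        return value;
    }

    /** Test helper: a later read in the same thread observes the cleared slot. */
    boolean isCleared() {
        return ref.get() == null;
    }

    public static void main(String[] args) {
        OneShotSlot<String> slot = new OneShotSlot<>("payload");
        System.out.println(slot.consume());
        System.out.println(slot.isCleared());
    }
}
```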

Hans



From: Vitaly Davidovich [mailto:vitalyd at gmail.com]
Sent: Saturday, October 08, 2011 10:11 AM
To: Boehm, Hans; concurrency-interest at cs.oswego.edu
Subject: Re: [concurrency-interest] AtomicXXX.lazySet and happens-before reasoning



+ rest of the group

On Sat, Oct 8, 2011 at 1:10 PM, Vitaly Davidovich <vitalyd at gmail.com> wrote:

Hi Hans,



I was under the impression that lazySet is purely a StoreStore barrier, and only specifies that the lazySet cannot be reordered with prior writes -- I never saw mention of requiring no reordering with prior loads.  Here's Doug's evaluation: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6275329, where only a store-store is mentioned.  If it's really a LoadStore | StoreStore, good to know ...



Thanks



On Sat, Oct 8, 2011 at 12:06 AM, Boehm, Hans <hans.boehm at hp.com> wrote:

LazySet() needs to prevent reordering of ordinary memory operations with a subsequent lazySet() operation.  In the JSR 133 Cookbook style, that can be implemented with a LoadStore | StoreStore fence preceding the lazySet() call.  So yes, that makes sense.



Real machines tend either to require neither of those fences (x86) or to combine them into a single instruction.
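
For readers on later JDKs, the Cookbook-style recipe above can be sketched with the VarHandle API. That API arrived in Java 9 and so postdates this thread; the class FenceSketch and its method names are hypothetical. VarHandle.releaseFence() is documented as keeping loads and stores before the fence from being reordered with stores after it, i.e. LoadStore | StoreStore, and later javadoc describes lazySet as having the memory effects of VarHandle.setRelease.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

// Hypothetical sketch; requires Java 9 or later.
final class FenceSketch {
    int payload;   // ordinary field
    int flag;      // accessed only through FLAG below

    private static final VarHandle FLAG;
    static {
        try {
            FLAG = MethodHandles.lookup().findVarHandle(FenceSketch.class, "flag", int.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    /** Cookbook style: explicit LoadStore | StoreStore fence, then the store. */
    void publishWithFence(int value) {
        payload = value;
        VarHandle.releaseFence();
        FLAG.setOpaque(this, 1);
    }

    /** The same intent expressed as a single release store. */
    void publishWithSetRelease(int value) {
        payload = value;
        FLAG.setRelease(this, 1);
    }

    /** Spins until the flag is seen, then reads the payload. */
    int awaitPayload() {
        while ((int) FLAG.getAcquire(this) == 0) {
            Thread.onSpinWait();
        }
        return payload;
    }

    /** Publishes 42 from the calling thread and returns what a reader thread saw. */
    static int roundTrip(boolean explicitFence) {
        FenceSketch s = new FenceSketch();
        int[] seen = new int[1];
        Thread reader = new Thread(() -> seen[0] = s.awaitPayload());
        reader.start();
        if (explicitFence) s.publishWithFence(42); else s.publishWithSetRelease(42);
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(true));
        System.out.println(roundTrip(false));
    }
}
```

Neither variant places a StoreLoad fence after the store, which is the part a full volatile write would add.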



Hans



From: Vitaly Davidovich [mailto:vitalyd at gmail.com]
Sent: Friday, October 07, 2011 5:10 PM
To: Boehm, Hans
Cc: concurrency-interest at cs.oswego.edu; Ruslan Cheremin

Subject: Re: [concurrency-interest] AtomicXXX.lazySet and happens-before reasoning



Does it even make sense to say that lazySet needs a LoadStore fence? The get() does, but that's because it has the same semantics as a volatile read.

On Oct 7, 2011 7:29 PM, "Boehm, Hans" <hans.boehm at hp.com> wrote:

> From: Ruslan Cheremin [mailto:cheremin at gmail.com]
> > It also needs a LoadStore fence, in both cases.
>
> But why does lazySet need a LoadStore fence? It seems that the lazySet javadoc
> does not put any ordering constraints on loads...
I do read it as imposing such a constraint, though we all agree that a more precise spec would help.  Certainly C++11's memory_order_release imposes such a constraint.

If not, it would mean that e.g.

Thread 1:
x = ...;
...
r1 = x;
done_with_x.lazySet(true);

Thread 2:
if (done_with_x.get()) {
  x = ...;
  ...
  r2 = x;
}

wouldn't work as expected.

In my opinion, that's an indefensible design point, especially since I don't believe it makes lazySet appreciably cheaper on any modern architectures.
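
The handoff above can be written as a runnable Java sketch, assuming done_with_x is an AtomicBoolean and x is ordinary (non-volatile) data. The class name DoneFlagDemo and the concrete values 1 and 2 are made up for illustration. If lazySet orders prior loads as well as prior stores, Thread 1's read of x cannot observe Thread 2's later write.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative only: class, method, and variable names are not from the thread.
final class DoneFlagDemo {

    /** Returns {r1, r2} from one run of the two-thread handoff. */
    static int[] runOnce() {
        final int[] x = new int[1];                    // ordinary (non-volatile) data
        final AtomicBoolean doneWithX = new AtomicBoolean(false);
        final int[] r = new int[2];

        Thread t1 = new Thread(() -> {
            x[0] = 1;
            r[0] = x[0];                 // Thread 1's last load of x
            doneWithX.lazySet(true);     // must stay after the load and the store above
        });
        Thread t2 = new Thread(() -> {
            while (!doneWithX.get()) {   // volatile read; spin until the flag is seen
                Thread.yield();
            }
            x[0] = 2;                    // Thread 2 now owns x
            r[1] = x[0];
        });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return r;
    }

    public static void main(String[] args) {
        int[] r = runOnce();
        System.out.println("r1 = " + r[0] + ", r2 = " + r[1]);
    }
}
```

Without the LoadStore constraint, the load into r1 could in principle be delayed past the lazySet and see Thread 2's value, which is the failure being argued against here.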


>
> > In particular, if v is volatile (and certainly if it's accessed using
> lazySet), and x and y are ordinary variables,
> > then the assignments to x and y in the following may be visibly
> reordered:
> > x = 1;
> > v = 2;
> > y = 3;
>
> You mean that vstore is not "transparent" upside down, but
> "transparent" downside up, so this
>
> y=3
> x=1



--
Vitaly