[concurrency-interest] AtomicXXX.lazySet and happens-before reasoning

Boehm, Hans hans.boehm at hp.com
Fri Oct 7 14:02:19 EDT 2011

> From: Doug Lea
> On 10/07/11 09:24, Ruslan Cheremin wrote:
> > After some thinking and reading I still do not understand some issues...
> >
> > As far, as I can see from description of lazySet, it is just ordinary
> > store with StoreStore barrier just before it. But if it is so, what is
> > the difference between it and ordinary volatile write? In JSR-133
> > implementation cookbook http://gee.cs.oswego.edu/dl/jmm/cookbook.html
> > you've shown volatile store implementation as store having exactly
> > StoreStore barrier before it.
It also needs a LoadStore fence, in both cases.  But even then, it's important to remember that although this may be a sufficient implementation, it is an incorrect description from the user's perspective.  In particular, if v is volatile (and certainly if it's accessed using lazySet), and x and y are ordinary variables, then the assignments to x and y in the following may be visibly reordered:

x = 1;
v = 2;
y = 3;

Volatiles are not fences.
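
A minimal Java sketch of the same three assignments may make the asymmetry concrete. The class and method names here are hypothetical, not from the thread. A reader that observes v == 2 is guaranteed to see x == 1, because the volatile write orders the earlier store before it. Nothing comparable holds for y: a reader that observes y == 3 cannot conclude anything about x or v.

```java
class VolatileNotFence {
    static int x;            // ordinary field
    static int y;            // ordinary field
    static volatile int v;   // volatile field

    static void writer() {
        x = 1;   // may not move below the volatile store
        v = 2;   // volatile store: publishes x, says nothing about y
        y = 3;   // may become visible before v, and so before x as well
    }

    // Starts the writer, waits until v == 2 is observed, then reads x.
    static int demo() {
        x = 0;
        y = 0;
        v = 0;
        Thread t = new Thread(VolatileNotFence::writer);
        t.start();
        while (v != 2) {
            Thread.yield();  // spin until the volatile store is visible
        }
        int seenX = x;       // happens-before via v guarantees this is 1
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seenX;
    }

    public static void main(String[] args) {
        System.out.println("x seen after v == 2: " + demo());
    }
}
```

The sketch only exercises the guaranteed direction. The x/y reordering is permitted by the memory model, but whether it is ever observed depends on the JIT and the hardware, so no deterministic test for it is attempted here.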

> Plus, for a volatile, a StoreLoad fence between the write and any read.
> Almost always, the only good choice for where to place it is
> immediately after the write. In addition to disabling
> more optimizations, StoreLoad fences translate to
> instructions that are not cheap on any platform, although they are
> currently a lot cheaper than they were about 5 years ago on most
> platforms. But in any case, if you have a situation that is
> guaranteed not to need one to preserve correctness, it is always
> faster not to require one.
And on something like PowerPC, the implementation rules are currently quite unclear.  The currently preferred implementation, at least for the C++ equivalents, associates much of the volatile overhead with loads, though it's not completely clear that's the right choice.  Just adding a StoreLoad fence to stores isn't sufficient, for fairly complex reasons related to the fact that this view of fences is too simplistic for PowerPC.  If Java implementations follow the recommendations in http://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html, then lazySet saves a relatively small percentage of the overhead, and the change amounts to weakening the fence before the store, not dropping one after it.  But that's because the StoreLoad fence is associated with the load, for which Java doesn't have a weaker form.  And having Java follow a different recipe from the C++ one is probably also a bad idea, since it essentially breaks mixed applications.
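
For reference, here is a small sketch of the usage pattern where lazySet is normally considered sufficient: a single writer publishes ordinary data and then sets a flag. The class and field names are hypothetical. The sketch assumes Java 9 or later, where the AtomicInteger javadoc specifies lazySet as having the memory effects of VarHandle.setRelease. How the store and the reader's load are translated into fences on a given platform is exactly the implementation question discussed above, and the code does not depend on it.

```java
import java.util.concurrent.atomic.AtomicInteger;

class LazySetPublish {
    static int payload;                                   // ordinary field
    static final AtomicInteger ready = new AtomicInteger();

    // Writer publishes payload with lazySet; reader waits on get().
    static int demo() {
        payload = 0;
        ready.set(0);
        Thread writer = new Thread(() -> {
            payload = 42;        // ordinary store, ordered before the flag
            ready.lazySet(1);    // ordered store; no ordering with later loads
        });
        writer.start();
        while (ready.get() == 0) {
            Thread.yield();      // volatile read of the flag
        }
        int seen = payload;      // visible once the flag is observed
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("payload seen: " + demo());
    }
}
```

Replacing lazySet(1) with set(1) gives the same result here. The difference only matters when the writer goes on to read another volatile and needs that read ordered after the store.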

