[concurrency-interest] AtomicReference.updateAndGet() mandatory updating

Hans Boehm boehm at acm.org
Fri May 26 14:19:48 EDT 2017

Someone from ARM should chime in here, but my understanding is that ARMv8
acquire/release loads and stores are designed to exactly model C++
memory_order_seq_cst (NOT memory_order_acquire/memory_order_release) loads
and stores. They do NOT imply fences. They are not intended to implement
fences. They should not be used to implement fences. The architecture still
has fences, so there is no need to.

For example,

r1 = x;
store release to v;
r2 = y;

does not order the accesses to x and y any more than a C++ sequentially
consistent store would order relaxed accesses to x and y.
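
To make the comparison concrete, here is a rough C++ sketch of the same
shape. The function and struct names are made up for illustration, and the
comments describe what the language permits, not what any particular
compiler or core will actually do.

```cpp
#include <atomic>

std::atomic<int> x{0}, y{0}, v{0};

struct Reads { int r1; int r2; };

// Same shape as the example above. The seq_cst store keeps the earlier load
// of x from moving below it (it is also a release store), but nothing keeps
// the later load of y from moving above it, and then above the load of x.
Reads reads_around_store() {
    int r1 = x.load(std::memory_order_relaxed);
    v.store(1, std::memory_order_seq_cst);
    int r2 = y.load(std::memory_order_relaxed);
    return {r1, r2};
}

// If ordering the two loads is the goal, ask for a fence explicitly.
Reads reads_around_fence() {
    int r1 = x.load(std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_seq_cst);
    int r2 = y.load(std::memory_order_relaxed);
    return {r1, r2};
}
```

Single-threaded, both functions return the same values; the difference only
shows up in what another thread is allowed to observe.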

Atomic RMW operations implemented with ARM acquire/release primitives have
roughly the memory ordering semantics of a RMW operation implemented with a
lock. They are NOT fences, should NOT be used to implement fences, etc. For
example,

r1 = x;
lock-based RMW operation;
r2 = y;

does NOT order the accesses to x and y, since both can move into the
critical section and pass each other. The same applies to ARMv8 RMW
operations.
My reading of the spec is that a sequentially consistent store followed by
a sequentially consistent load is still not sufficient to generate the
equivalent of a fence. (I would guess that on current hardware it probably
is, but I don't know.) If there are no observers of the release store, it
promises essentially no ordering. There is no good reason to do that anyway.

AFAICT, the discussion about atomic RMW as fence replacement is entirely
x86-specific. I'm not sure, but it seems to be caused by the fact that an
x86 MFENCE makes all sorts of other guarantees about write-coalescing
memory, etc., that we don't really care about. The RMW operations do not,
and are thus often faster. My guess is that the problem originates from the
fact that x86 doesn't have a suitably plain vanilla fence instruction.
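
For readers who have not seen the idiom, the two variants look roughly like
this when written in C++. This is only an illustration with made-up names:
the second function relies on x86 behaviour (a locked RMW acting as a full
barrier), which C++ itself does not promise, and a compiler is free to
implement either function however it likes.

```cpp
#include <atomic>

std::atomic<int> my_flag{0}, other_flag{0};
std::atomic<int> dummy{0};  // hypothetical location used only for its side effect

// Portable version: publish my_flag, then use an explicit full fence so the
// store is ordered before the following load (StoreLoad ordering).
int announce_then_check_fence() {
    my_flag.store(1, std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_seq_cst);
    return other_flag.load(std::memory_order_relaxed);
}

// x86-specific idiom: an RMW on an unrelated location stands in for the
// fence. The language does NOT guarantee that this orders the accesses to
// my_flag and other_flag; it only reflects what the x86 instruction does.
int announce_then_check_rmw() {
    my_flag.store(1, std::memory_order_relaxed);
    dummy.fetch_add(0, std::memory_order_seq_cst);
    return other_flag.load(std::memory_order_relaxed);
}
```

On ARMv8 the second form gives no such ordering, which is why I would keep
it out of anything meant to be portable.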

I'm not sure how this interacts with the original discussion. There's still
the interesting question of whether a volatile write that doesn't change
the value of an object is observable.

On Fri, May 26, 2017 at 9:43 AM, Andrew Haley <aph at redhat.com> wrote:

> On 26/05/17 17:09, Gil Tene wrote:
> > loads or stores that appear in program order before the store-release"
> >
> > So ***for ARMv8*** a store-release followed by a load-acquire (e.g. both
> > to a thread local) will impose a StoreLoad order.
> >
> > [This is not a general property of store-release and load-acquire]
> That's right.  By the way, the memory model for ARM has been rewritten,
> and the engineer who wrote it promises me absolutely and truly that the
> instructions are sequentially consistent, and were always intended to be.
> https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile
> --
> Andrew Haley
> Java Platform Lead Engineer
> Red Hat UK Ltd. <https://www.redhat.com>
> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671