[concurrency-interest] Semantics of compareAndSwapX

Hans Boehm boehm at acm.org
Thu Feb 27 11:53:47 EST 2014

As far as I know, the ARMv8 acquire/release operations were designed
specifically to act as Java volatile or C++ memory_order_seq_cst load/store
operations, without the kind of ordering overkill that we currently need on
x86, i.e. they were designed to get us to the "better world".
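
To illustrate the ordering that a volatile (or memory_order_seq_cst) mapping has to provide beyond plain release/acquire pairing, here is a minimal sketch of the classic store-then-load test. The class name StoreLoadDemo, the fields, and the helper methods are hypothetical and chosen only for this example. Under the Java memory model the outcome r1 == 0 && r2 == 0 is forbidden when x and y are volatile, so whatever instructions a volatile store and a volatile load compile to must keep the store ordered before the later load.

```java
import java.util.concurrent.CyclicBarrier;

public class StoreLoadDemo {
    // Both fields are volatile, so accesses to them are sequentially consistent.
    static volatile int x, y;
    // Plain fields; they are only read after join(), which orders the reads.
    static int r1, r2;

    /** Runs the test repeatedly and returns how often both threads read 0. */
    static int run(int iterations) throws Exception {
        int bothZero = 0;
        for (int i = 0; i < iterations; i++) {
            x = 0;
            y = 0;
            CyclicBarrier start = new CyclicBarrier(2);
            // Each thread does a volatile store followed by a volatile load
            // of the other variable.
            Thread a = new Thread(() -> { await(start); x = 1; r1 = y; });
            Thread b = new Thread(() -> { await(start); y = 1; r2 = x; });
            a.start();
            b.start();
            a.join();
            b.join();
            if (r1 == 0 && r2 == 0) {
                bothZero++; // forbidden by the Java memory model
            }
        }
        return bothZero;
    }

    // Lines both threads up so their accesses are likely to overlap.
    private static void await(CyclicBarrier barrier) {
        try {
            barrier.await();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("both-zero outcomes: " + run(2000));
    }
}
```

A test like this can only expose a broken mapping; a count of zero is not evidence that a mapping is correct.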

My main remaining concern is that we don't have a complete, much less
provably correct, mapping of either Java or C++ atomics to this ISA.  This
leaves a risk that some corner cases, e.g. C++ explicit fences, will be
difficult to implement correctly in this model.  I am also not at all sure
whether the traditional ARMv7 mappings mix and match with an
acquire/release based mapping.  These mappings should really be specified
in some kind of ABI addendum.  (But there are already similar issues with
ARMv7 by itself, and with some other architectures.)
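
For reference, any such mapping also has to preserve the other guarantee discussed in the quoted thread below: plain stores made before a volatile store must be visible to a thread that observes that volatile store. A minimal sketch of that publication pattern follows; the class PublishDemo and its fields are hypothetical names used only for this example.

```java
public class PublishDemo {
    static int data;               // plain (non-volatile) payload
    static volatile boolean ready; // volatile flag that publishes the payload

    /** Publishes data through the volatile flag and returns what the reader saw. */
    static int run() throws InterruptedException {
        data = 0;
        ready = false;
        int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            // Volatile load: spin until the writer's volatile store is observed.
            while (!ready) {
                // busy-wait; acceptable for a short demo
            }
            // The plain read is ordered after the volatile load above.
            seen[0] = data;
        });
        reader.start();
        data = 42;    // plain store, ordered before the volatile store below
        ready = true; // volatile store
        reader.join();
        return seen[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("reader saw data = " + run());
    }
}
```

Because the volatile store to ready happens-before the volatile load that sees it, the reader is guaranteed to see data == 42 here.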


On Thu, Feb 27, 2014 at 6:43 AM, David M. Lloyd <david.lloyd at redhat.com> wrote:

> I read Hans' post as "in a better world, volatile read/write would only
> guarantee ordering with respect to other volatile read/write, and memory
> visibility only of the volatile itself, and fences would be a separate
> concern" i.e. so as to reduce the number of fences which I think we've all
> seen can impact performance and parallelism pretty substantially. But of
> course we live in this world, thus making it (for the moment at least) only
> an academic argument.
> I apologize if I misinterpreted though, that's just my reading of it.
> On 02/26/2014 04:39 PM, David Holmes wrote:
>> Hans,
>>> But all of these x86 fence placements are gross overkill, in that they
>>> order ALL memory accesses, when they only need to order VOLATILE accesses.
>> A volatile store has to ensure ordering of all stores prior to the
>> volatile store, so that a read of a volatile flag ensures access to
>> non-volatile data.
>> David
>>     -----Original Message-----
>>     *From:* concurrency-interest-bounces at cs.oswego.edu
>>     [mailto:concurrency-interest-bounces at cs.oswego.edu]*On Behalf Of
>>     *Hans Boehm
>>     *Sent:* Thursday, 27 February 2014 4:11 AM
>>     *To:* Andrew Haley
>>     *Cc:* concurrency-interest at cs.oswego.edu; Stephan Diestelhorst
>>     *Subject:* Re: [concurrency-interest] Semantics of compareAndSwapX
>>     I think there's some confusion between the Java memory model
>>     requirements and common implementation techniques based on fences.
>>       The latter are sufficient to implement the former, but clearly not
>>     required.
>>     On x86, a volatile store is normally implemented by adding a
>>     trailing fence to a store.  That fence is required only to prevent
>>     reordering with a subsequent VOLATILE load; it can actually appear
>>     anywhere between the volatile store and the next volatile load.
>>       Putting it before volatile loads would also work, but is almost
>>     always suboptimal.  In a better world, ABIs would specify one or the
>>     other, and both Java and C should follow those ABIs to ensure
>>     interoperability.
>>     But all of these x86 fence placements are gross overkill, in that
>>     they order ALL memory accesses, when they only need to order
>>     VOLATILE accesses.
>>     On ARMv8, I would expect a volatile store to be compiled to a store
>>     release, and a volatile load to be compiled to a load acquire.
>>       Period.  Unlike on Itanium, a release store is ordered with
>>     respect to a later acquire load, so the fence between them should
>>     not be needed.  Thus there is no a priori reason to expect that a
>>     CAS would require a fence either.
>>     I would argue strongly that a CAS to a thread-private object should
>>     not be usable as a fence. One of the principles of the Java memory
>>     model was that synchronization on thread-private objects should be
>>     ignorable.
>>     I'm hedging a bit here, because the original Java memory model
>>     doesn't say anything about CAS, and I don't fully understand the
>>     details of the ARMv8 model, particularly the interaction between
>>     acquire/release loads and stores and traditional ARM fences.
>>     Hans
>>     On Wed, Feb 26, 2014 at 3:22 AM, Andrew Haley <aph at redhat.com> wrote:
>>         On 02/26/2014 03:18 AM, Hans Boehm wrote:
>>          > I think that's completely uncontroversial.  ARMv8 load acquire
>>          > and store release are believed to suffice for Java volatile
>>          > loads and stores respectively.
>>         No, that's not enough: we emit a StoreLoad barrier after each
>>         volatile store or before each volatile load.
>>          > Even the fence-less implementation used a release store
>>          > exclusive.  Unless I'm missing something, examples like this
>>          > should be handled correctly by all proposed implementations,
>>          > whether or not fences are added.
>>          >
>>          > As far as I can tell, the only use cases that require the
>>          > fences to be added are essentially abuses of CAS as a fence.
>>         Well, yes, which is my question: is abusing CAS as a fence
>>         supposed to work?
>>         Andrew.
>> _______________________________________________
>> Concurrency-interest mailing list
>> Concurrency-interest at cs.oswego.edu
>> http://cs.oswego.edu/mailman/listinfo/concurrency-interest
> --
> - DML