[concurrency-interest] AtomicReference.updateAndGet() mandatory updating

Alex Otenko oleksandr.otenko at gmail.com
Mon May 29 05:42:46 EDT 2017


Sorry, but I don’t see how you can separate the synchronization properties of CAS from its atomicity :-)

I don’t see how you could describe atomicity without specifying the place of the CAS with respect to the other stores. And once you have placed it somewhere among the other stores, it synchronizes-with those preceding it.

Now, atomicity of a succeeding CAS is not falsifiable. It could just as well be non-atomic and still succeed, provided the other stores were ordered in the same way. There is no meaning whatsoever in declaring a succeeding CAS atomic.

Successful CAS atomic        Successful CAS not atomic
store z 0                    store z 0
CAS z 0 1                    load z
                             store z 1
store z 2                    store z 2

Can you detect the effects of a successful CAS being not atomic? What does atomicity of a successful CAS promise? I see nothing.
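
To make that concrete, here is a minimal Java sketch (the class and names are mine, purely for illustration). With a single writer, no observation of z can distinguish the intrinsic CAS from the separate load and store, because z takes the values 0, 1, 2 in the same order either way:

import java.util.concurrent.atomic.AtomicInteger;

class SuccessfulCasObserver {
    static final AtomicInteger z = new AtomicInteger(0);   // plays the role of "store z 0"

    // history A: the "atomic" writer
    static void writerAtomic() {
        z.compareAndSet(0, 1);        // intrinsic CAS, succeeds
        z.set(2);
    }

    // history B: the "non-atomic" writer
    static void writerNonAtomic() {
        if (z.get() == 0) z.set(1);   // separate load and store
        z.set(2);
    }

    // a reader polling z sees a monotone subsequence of 0, 1, 2 under either writer
    static void reader() {
        int prev = -1;
        for (int i = 0; i < 1_000_000; i++) {
            int v = z.get();
            assert v >= prev;         // holds for both histories (run with -ea)
            prev = v;
        }
    }
}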


There is a difference between an atomic and a non-atomic failing CAS - that is where it makes sense to specify whether it is atomic or not.

Failing CAS atomic intrinsic   Failing CAS not atomic
                               Not detectable         Detectable             Not detectable
store z 0                      store z 0              store z 0              store z 0
store z 2                      store z 2              load z                 load z
CAS z 0 1                      load z                 store z 2              store z 2
                                                      // store z 1 skipped   // store z 2 triggers retry
                                                                             load z
                                                                             // store z 1 skipped

If the non-atomicity of a failing CAS can be detected, it becomes that much closer to weakCompareAndSet, which may fail spuriously, and that is a concern. On the other hand, the spec may just as well promise atomicity even of a failing CAS, because the implementation needs to distinguish genuine failures from spurious failures of the underlying ll/sc primitive, and the procedure for distinguishing them may necessarily establish the synchronizes-with edge with the store that failed the CAS.
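
For concreteness, here is a sketch of that procedure in Java (a minimal sketch, assuming a weakCompareAndSet that may fail spuriously, as the pre-JDK-9 spec allowed; the class and method names are mine). The volatile re-read on the failure path is exactly what would establish that edge:

import java.util.concurrent.atomic.AtomicInteger;

class StrongFromWeak {
    static boolean strongCompareAndSet(AtomicInteger z, int expected, int updated) {
        for (;;) {
            if (z.weakCompareAndSet(expected, updated)) {
                return true;            // the store happened
            }
            int current = z.get();      // volatile re-read: synchronizes-with the
                                        // store that made the sc fail
            if (current != expected) {
                return false;           // genuine failure: the value really changed
            }
            // current == expected: the failure was spurious, so retry
        }
    }
}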

I don’t see all the angles, so maybe someone wants to not promise atomicity of the failing strong CAS. But in that case there is no need to promise atomicity at all, because the promise of atomicity of a succeeding CAS gives you nothing. Unless you can show how a non-atomic successful CAS could be detected?


Alex

> On 29 May 2017, at 09:31, David Holmes <davidcholmes at aapt.net.au> wrote:
> 
> Sorry, but I don’t see what you describe as atomicity. The atomicity of a successful CAS is the only atomicity the API is concerned about. The memory synchronization properties of CAS are distinct from its atomicity property.
>  
> David
>  
> From: Concurrency-interest [mailto:concurrency-interest-bounces at cs.oswego.edu] On Behalf Of Alex Otenko
> Sent: Monday, May 29, 2017 6:15 PM
> To: dholmes at ieee.org
> Cc: concurrency-interest at cs.oswego.edu
> Subject: Re: [concurrency-interest] AtomicReference.updateAndGet() mandatory updating
>  
> Thanks.
>  
> No, I am not concerned about the atomicity of hardware instructions. I am concerned about atomicity as the property of the memory model.
>  
> Claiming atomicity of a successful CAS is pointless. If CAS is not atomic on failure, then there is no need to claim it is atomic at all.
>  
> Example where you can claim atomicity of a failing CAS:
>  
> // store_conditional is assumed to return nonzero on failure (ll/sc convention),
> // so this retries only while the value still matches and the sc failed spuriously:
> do {
>   tmp = load_linked(z);
> } while (tmp == expected && store_conditional(z, updated));
>  
> Here if store_conditional fails, it is followed by another volatile load, so the construct will synchronize-with the write that failed it, and it will appear atomic to the observer.
>  
>  
> Alex
>  
>  
>> On 29 May 2017, at 09:03, David Holmes <davidcholmes at aapt.net.au> wrote:
>>  
>> Sorry Alex but you are using “atomicity” in a way that doesn’t make sense to me. The only thing that is atomic is the successful CAS. I see what you are trying to say about a failing ll/sc CAS and the write that caused it to fail, but that is not “atomicity” to me – at least from the API perspective. You seem to be concerned about the atomicity of a sequence of hardware instructions. The API doesn’t tell you anything about how the implementation is done, only that the result of a successful operation is atomic with respect to any other update of the variable.
>>  
>> David
>>  
>> From: Alex Otenko [mailto:oleksandr.otenko at gmail.com] 
>> Sent: Monday, May 29, 2017 5:55 PM
>> To: dholmes at ieee.org
>> Cc: Hans Boehm <boehm at acm.org>; concurrency-interest at cs.oswego.edu
>> Subject: Re: [concurrency-interest] AtomicReference.updateAndGet() mandatory updating
>>  
>> This came out a bit garbled, so here is a clearer statement of why the spec and the “ubiquitous terminology” are perhaps not enough.
>>  
>> The claim of “atomicity” for a succeeding CAS is not interesting, because it is not falsifiable: if the CAS succeeded, that is in itself evidence that no volatile write appeared between the read and write parts of the CAS - not evidence of atomicity as a property of the construct. We cannot explain the atomicity of CAS by giving the specification of the effects of a successful CAS. But the Javadoc does just that, and *only* that.
>>  
>> ll/sc as a construct does not synchronize-with the write that fails the sc instruction. So if a CAS that uses ll/sc makes no effort to synchronize-with that write, we can detect that it is not atomic - that is, we can detect that it cannot be seen as an operation appearing entirely before or entirely after all stores to the same variable.
>>  
>> So I am asking whether the *failing* CAS promises atomicity.
>>  
>>  
>> Alex
>>  
>>  
>>> On 29 May 2017, at 00:26, Alex Otenko <oleksandr.otenko at gmail.com> wrote:
>>>  
>>> Yeah, I know what atomicity means in x86. But since the “write semantics” of the CAS are questioned, I have to also ask whether the other formulations are precise enough.
>>>  
>>> Atomicity means “indivisible”: the operation either appears before a store, or after it. If it appears after the store, then it synchronizes-with that store, and I am bound to observe the stores preceding it. But not so in the weaker semantics Hans talks about! If the failure occurs during the sc part, you have to assume the load is before that store (but then why does it fail?), or you have to assume it overlaps with a concurrent store. Either way, the operation is *not* atomic.
>>>  
>>> Unless there are extra volatile loads upon failure of (strong) compareAndSet.
>>>  
>>> It’s not just “no intervening store”, meaning “if the store happened, the condition expected == actual was not violated by any other store”.
>>>  
>>> The gist of atomicity:
>>>  
>>> int x=0;
>>> volatile int z=0;
>>>  
>>> Thread 1:
>>> if (! CAS(z, 0, 1)) {
>>>   return x;
>>> }
>>> return 1;
>>>  
>>> Thread 2:
>>> x=1;
>>> z=1;
>>>  
>>> If CAS is atomic, a failing CAS synchronizes-with the volatile write that failed it, and Thread 1 will always return 1.
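>>> 
>>> (The same litmus test as compilable Java - a minimal sketch; the class and method names are just for illustration:)
>>> 
>>> import java.util.concurrent.atomic.AtomicInteger;
>>> 
>>> class FailingCasLitmus {
>>>     static int x = 0;                                  // plain, non-volatile
>>>     static final AtomicInteger z = new AtomicInteger(0);
>>> 
>>>     static int thread1() {
>>>         if (!z.compareAndSet(0, 1)) {
>>>             return x;   // must this be 1 whenever the CAS fails?
>>>         }
>>>         return 1;
>>>     }
>>> 
>>>     static void thread2() {
>>>         x = 1;
>>>         z.set(1);       // the volatile write that can fail Thread 1’s CAS
>>>     }
>>> }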
>>>  
>>> Alex
>>>  
>>>  
>>>  
>>>> On 28 May 2017, at 23:52, David Holmes <davidcholmes at aapt.net.au> wrote:
>>>>  
>>>> Alex,
>>>>  
>>>> I don’t recall anyone ever questioning what “atomic” means in these atomic operations – it is ubiquitous terminology. If the store happens, it is because the current value was the expected value. That is indivisible, i.e. atomic. There can be no intervening store. This is either the semantics of the hardware instruction (e.g. cmpxchg) or else must be emulated using whatever is available, e.g. ll/sc instructions (where an intervening store, in the strong CAS, must cause a retry).
>>>>  
>>>> David
>>>>  
>>>> From: Concurrency-interest [mailto:concurrency-interest-bounces at cs.oswego.edu] On Behalf Of Alex Otenko
>>>> Sent: Monday, May 29, 2017 7:40 AM
>>>> To: Hans Boehm <boehm at acm.org>
>>>> Cc: concurrency-interest at cs.oswego.edu
>>>> Subject: Re: [concurrency-interest] AtomicReference.updateAndGet() mandatory updating
>>>>  
>>>> Yes, you could read it both ways. You see, lock-based implementations and x86 LOCK:CMPXCHG semantics inspire one to interpret the statement as saying there is at least some write-like semantics (hence “memory *effects*”) - not necessarily a write to z, but fences or whatever else imitates a volatile write to z from the JMM’s point of view.
>>>>  
>>>>  
>>>> The other source of confusion is the claim of atomicity. Is it “atomically (sets the value) (to the given updated value if the current value == the expected value)” or “atomically (sets the value to the given updated value if the current value == the expected value)”? Does atomicity imply it is a single item in the total order of all operations? Or of all stores? Or just of stores to that variable? If you know how it is implemented, it turns out to be far from atomic.
>>>>  
>>>> Does it at least *implement* atomic behaviour - does it *appear* atomic to an observer? For example, if a concurrent store appears between the load and “the store” (in quotes, because the store may not be executed - in which case it is no longer “between”), do we get a synchronizes-with edge with the store that preceded the load, or also with the store that intervened? If we don’t get a synchronizes-with edge to the store that intervened (and I suspect we don’t), then it is not atomic in any of those senses (but x86 and lock-based implementations create false analogies, so we get “atomic” in the method description).
>>>>  
>>>>  
>>>> It needs to be specced out, best of all formally in the JMM as the source of authority, rather than in higher-level API Javadocs spread all over the place.
>>>>  
>>>> Alex
>>>>  
>>>>  
>>>>> On 28 May 2017, at 18:30, Hans Boehm <boehm at acm.org> wrote:
>>>>>  
>>>>> Thanks. I think I understand now. If Thread 2 returns false, the Thread 2 CAS failed, and the initial CAS in Thread 1 succeeds. Either x immediately reads back as 1 in Thread 1, or we set b to true after Thread 2 returns b. Thus the second (successful) CAS in Thread 1 must follow the unsuccessful Thread 2 CAS in synchronization order. So any write to z by the failed CAS synchronizes with the second successful CAS in Thread 1, and we could thus conclude that x is 1 in the Thread 1 return.
>>>>>  
>>>>> This relies critically on the assumption that the Thread 2 failed CAS has the semantics of a volatile write to z.
>>>>>  
>>>>> I think the actual relevant spec text is:
>>>>>  
>>>>> 1) "compareAndSet and all other read-and-update operations such as getAndIncrement have the memory effects of both reading and writing volatile variables."
>>>>>  
>>>>> 2) "Atomically sets the value to the given updated value if the current value == the expected value."
>>>>>  
>>>>> I would not read this as guaranteeing that property. But I agree the spec doesn't make much sense; I read (2) as saying there is no write at all if the CAS fails, as I would expect. Thus it seems like a stretch to assume that the write from (1) is to z, though I have no idea what write it would refer to.
>>>>>  
>>>>> The prior implementation discussion now does make sense to me. I don't think this is an issue for lock-based implementations. But the only reasonable way to support it on ARMv8 seems to be with a conditionally executed fence in the failing case. That adds two instructions, as well as a large amount of time overhead for algorithms that don't retry on a strong CAS. My impression is that those algorithms are frequent enough to be a concern.
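>>>>> 
>>>>> (In Java terms, that conditional fence could be modelled roughly by the sketch below, using VarHandle.fullFence(); the class and method names are mine, and this is only a model of the lowering, not what any JVM actually emits:)
>>>>> 
>>>>> import java.lang.invoke.VarHandle;
>>>>> import java.util.concurrent.atomic.AtomicInteger;
>>>>> 
>>>>> class CasWithFailFence {
>>>>>     static boolean compareAndSetFenced(AtomicInteger z, int expected, int updated) {
>>>>>         boolean ok = z.compareAndSet(expected, updated);
>>>>>         if (!ok) {
>>>>>             VarHandle.fullFence();   // conditionally executed fence on the failing path
>>>>>         }
>>>>>         return ok;
>>>>>     }
>>>>> }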
>>>>>  
>>>>>  
>>>>> On Sat, May 27, 2017 at 4:49 PM, Alex Otenko <oleksandr.otenko at gmail.com> wrote:
>>>>>> That’s right.
>>>>>>  
>>>>>> Atomicity (for some definition of atomicity - i.e. atomic with respect to which operations) is not needed here. As long as the store in CAS always occurs, x=1 is not “reordered” (certainly not entirely - it can’t escape the “store” that is declared in the spec).
>>>>>>  
>>>>>> Alex
>>>>>>  
>>>>>>> On 28 May 2017, at 00:43, Hans Boehm <boehm at acm.org> wrote:
>>>>>>>  
>>>>>>> I gather the interesting scenario here is the one in which the Thread 2 CAS fails and Thread 2 returns false, while the initial Thread 1 CAS succeeds?
>>>>>>>  
>>>>>>> The correctness argument here relies on the fact that the load of x in Thread 1 must, in this scenario, see the store of x in Thread 2? This assumes the load of z in the failing CAS in Thread 2 can't be reordered with the ordinary (and racy!) store to x by the same thread. I agree that the j.u.c.atomic spec was not clear in this respect, but I don't think it was ever the intent to guarantee that. It's certainly false for either a lock-based or an ARMv8 implementation of CAS. Requiring it would raise serious questions about practical implementability on several architectures.
>>>>>>>  
>>>>>>> The C++ standard is quite clear that this is not required; atomicity means only that the load of a RMW operation sees the immediately prior write in the coherence order for that location. It doesn't guarantee anything about other accesses somehow appearing to be performed in the middle of the operation. It's completely analogous to the kind of atomicity you get in a lock-based implementation.
>>>>>>>  
>>>>>>>> On Sat, May 27, 2017 at 3:26 PM, Alex Otenko <oleksandr.otenko at gmail.com> wrote:
>>>>>>>> Not sure what you mean by “acting as a fence” being broken.
>>>>>>>>  
>>>>>>>> There’s probably even more code that relies on atomicity of CAS - that is, when the write happened on successful CAS, it happened atomically with the read; it constitutes a single operation in the total order of all volatile stores.
>>>>>>>>  
>>>>>>>>  
>>>>>>>> int x=0; // non-volatile
>>>>>>>> volatile int z=0;
>>>>>>>> volatile boolean b=false;
>>>>>>>>  
>>>>>>>> Thread1:
>>>>>>>> if (CAS(z, 0, 1)) {
>>>>>>>>   if (x == 0) {
>>>>>>>>     b=true;
>>>>>>>>     CAS(z, 1, 2);
>>>>>>>>   }
>>>>>>>> }
>>>>>>>> return x;
>>>>>>>>  
>>>>>>>> Thread2:
>>>>>>>> x=1;
>>>>>>>> if (!CAS(z, 0, 2)) {
>>>>>>>>   return b;
>>>>>>>> }
>>>>>>>> return true;
>>>>>>>>  
>>>>>>>> In essence, if the CAS failure is caused by a real mismatch of z (not a spurious failure), then we can guarantee there is a return of 1, or a further CAS, in the future of the first successful CAS (by program order), and we can get a witness b of whether that CAS is in the future of the failing CAS (by the total order of operations).
>>>>>>>>  
>>>>>>>> If failing CAS in Thread2 does not have store semantics, then nothing in Thread1 synchronizes-with it, and Thread1 is not guaranteed to return 1 even if Thread2 returns false.
>>>>>>>>  
>>>>>>>> If failing CAS in Thread2 does have store semantics, then if Thread2 returns false, Thread1 returns 1.
>>>>>>>>  
>>>>>>>>  
>>>>>>>> Not sure what you mean by “real programming concerns” - it sounds a bit like “no true Scotsman”. The concern I am trying to convey is that the Java 8 semantics offer a very strong CAS that can be used to enforce mutual exclusion using a single CAS call, and that this can be combined with inductive types to produce strong guarantees of correctness. Having set the field right, I can make sure most contenders execute less than a single CAS after a mutation. Sounds like a real enough concern to me :)
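>>>>>>>> 
>>>>>>>> (For instance, a minimal sketch of mutual exclusion with a single CAS per acquisition attempt; the class is illustrative, not code from a real design:)
>>>>>>>> 
>>>>>>>> import java.util.concurrent.atomic.AtomicReference;
>>>>>>>> 
>>>>>>>> class CasMutex {
>>>>>>>>     private final AtomicReference<Thread> owner = new AtomicReference<>();
>>>>>>>> 
>>>>>>>>     void lock() {
>>>>>>>>         while (!owner.compareAndSet(null, Thread.currentThread())) {
>>>>>>>>             Thread.onSpinWait();   // spin until the owner releases
>>>>>>>>         }
>>>>>>>>     }
>>>>>>>> 
>>>>>>>>     void unlock() {
>>>>>>>>         owner.compareAndSet(Thread.currentThread(), null);   // release only if held
>>>>>>>>     }
>>>>>>>> }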
>>>>>>>>  
>>>>>>>>  
>>>>>>>> Anyhow, I also appreciate that most designs do not look that deep into the spec, and won’t notice the meaning getting closer to the actual hardware trends. If Java 8 CAS semantics gets deprecated, the algorithm will become obsolete, and will need modification with extra fences in the proprietary code that needs it, or whatever is not broken in the new JMM that will lay the memory semantics of CAS to rest.
>>>>>>>>  
>>>>>>>>  
>>>>>>>> Alex
>>>>>>>>  
>>>>>>>>> On 27 May 2017, at 18:34, Hans Boehm <boehm at acm.org> wrote:
>>>>>>>>>  
>>>>>>>>> This still makes no sense to me. Nobody is suggesting that we remove the volatile read guarantee on failure (unlike the weak... version). If the CAS fails, you are guaranteed to see memory effects that happen before the successful change to z. We're talking about the "volatile write semantics" for the write that didn't happen.
>>>>>>>>>  
>>>>>>>>> This would all be much easier if we had a litmus test (including code snippets for all involved threads) that could distinguish between the two behaviors. I conjecture that all such tests involve potentially infinite loops, and that none of them reflect real programming concerns.
>>>>>>>>>  
>>>>>>>>> I also conjecture that there exists real code that relies on CAS acting as a fence. We should be crystal clear that such code is broken.
>>>>>>>>>  
>>>>>>>>>> On Fri, May 26, 2017 at 11:42 PM, Alex Otenko <oleksandr.otenko at gmail.com> wrote:
>>>>>>>>>> Integers provide extra structure to plain boolean “failed/succeeded”. Linked data structures with extra dependencies of their contents can also offer extra structure.
>>>>>>>>>>  
>>>>>>>>>> if( ! z.CAS(i, j) ) {
>>>>>>>>>>   k = z.get();
>>>>>>>>>>   if(k < j) {
>>>>>>>>>>     // i < k < j
>>>>>>>>>>     // whoever mutated z from i to k, should also negotiate mutation of z from k to j
>>>>>>>>>>     // with someone else, and they should observe whatever stores precede z.CAS
>>>>>>>>>>     // because I won’t contend.
>>>>>>>>>>  
>>>>>>>>>>     // of course, I need to check they are still at it - but that, too, does not require
>>>>>>>>>>     // stores or CASes
>>>>>>>>>>     ...
>>>>>>>>>>     return;
>>>>>>>>>>   }
>>>>>>>>>> }
>>>>>>>>>>  
>>>>>>>>>> If whoever mutated z from i to k cannot observe stores that precede z.CAS, they won’t attempt to mutate z to j.
>>>>>>>>>>  
>>>>>>>>>>  
>>>>>>>>>> In return, can someone explain what the difference is between a weakCompareAndSet failing spuriously and a compareAndSet not guaranteeing volatile store semantics on failure? Why should we weaken the promise, if there is already a weak promise not to guarantee visibility on failure?
>>>>>>>>>>  
>>>>>>>>>>  
>>>>>>>>>> Alex
>>>>>>>>>>  
>>>>>>>>>>  
>>>>>>>>>>> On 26 May 2017, at 22:35, Hans Boehm <boehm at acm.org> wrote:
>>>>>>>>>>>  
>>>>>>>>>>> Could we please get an example (i.e. litmus test) of how the "memory effect of at least one volatile ... write" is visible, and where it's useful? Since some people seem really attached to it, it shouldn't be that hard to generate a litmus test.
>>>>>>>>>>>  
>>>>>>>>>>> So far we have a claim that it could affect progress guarantees, i.e. whether prior writes eventually become visible without further synchronization. I kind of, sort of, half-way believe that.
>>>>>>>>>>>  
>>>>>>>>>>> I haven't been able to make sense out of the subsequent illustration attempts. I really don't think it makes sense to require such weird behavior unless we can at least clearly define exactly what the weird behavior buys us. We really need a concise, or at least precise and understandable, rationale.
>>>>>>>>>>>  
>>>>>>>>>>> As has been pointed out before, a volatile write W by T1 to x of the same value that was there before is not easily observable. If I read that value in another thread T2, I can't tell which write I'm seeing, and hence a failure to see prior T1 writes is OK; I might not have seen the final write to x. Thus I would need to communicate the fact that T1 completed W without actually looking at x. That seems to involve another synchronization of T1 with T2, which by itself would ensure the visibility of prior writes to T2.
>>>>>>>>>>>  
>>>>>>>>>>> Thus, aside from possible really obscure progress/liveness issues, I really don't see the difference. I think this requirement, if it is indeed not vacuous and completely ignorable, would lengthen the ARMv8 code sequence for a CAS by at least 2 instructions, and introduce a very obscure divergence from C and C++.
>>>>>>>>>>>  
>>>>>>>>>>> I'm worried that we're adding something to make RMW operations behave more like fences. They don't, they can't, and they shouldn't.
>>>>>>>>>>>  
>>>>>>>>>>> On Fri, May 26, 2017 at 1:08 PM, Nathan and Ila Reynolds <nathanila at gmail.com> wrote:
>>>>>>>>>>>> > "The memory effects of a write occur regardless of outcome."
>>>>>>>>>>>> > "This method has memory effects of at least one volatile read and write."
>>>>>>>>>>>> 
>>>>>>>>>>>> I am not sure what "memory effects" means.  If this is defined somewhere in the specs, then ignore this, since I haven't read the JDK 9 specs.
>>>>>>>>>>>> 
>>>>>>>>>>>> Does "memory effects" mean the cache line will be switched into the modified state even if an actual write doesn't occur?  Or does it refer to the ordering of memory operations with respect to the method's operation?
>>>>>>>>>>>> 
>>>>>>>>>>>> -Nathan
>>>>>>>>>>>> 
>>>>>>>>>>>> On 5/26/2017 1:59 PM, Doug Lea wrote:
>>>>>>>>>>>>> On 05/26/2017 12:22 PM, Gil Tene wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Actually this is another case where the Java 9 spec needs to be adjusted…
>>>>>>>>>>>>> The pre-jdk9 method for weak CAS is now available in four
>>>>>>>>>>>>> flavors: weakCompareAndSetPlain, weakCompareAndSet,
>>>>>>>>>>>>> weakCompareAndSetAcquire, weakCompareAndSetRelease.
>>>>>>>>>>>>> They have different read/write access modes. The specs reflect this.
>>>>>>>>>>>>> The one keeping the name weakCompareAndSet is stronger, the others
>>>>>>>>>>>>> weaker than before (this is the only naming scheme that works).
>>>>>>>>>>>>> 
>>>>>>>>>>>>> About those specs... see JBS JDK-8181104
>>>>>>>>>>>>>    https://bugs.openjdk.java.net/browse/JDK-8181104
>>>>>>>>>>>>> The plan is for all CAS VarHandle methods to include the sentence
>>>>>>>>>>>>>    "The memory effects of a write occur regardless of outcome."
>>>>>>>>>>>>> And for j.u.c.atomic methods getAndUpdate, updateAndGet,
>>>>>>>>>>>>> getAndAccumulate, accumulateAndGet to include the sentence:
>>>>>>>>>>>>>    "This method has memory effects of at least one volatile read and write."
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Which should clear up confusion.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> -Doug
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> -- 
>>>>>>>>>>>> -Nathan
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>  