[concurrency-interest] Suggestion: .hardGet() for atomic variables

Vitaly Davidovich vitalyd at gmail.com
Sat Jan 21 01:24:07 EST 2012


Actually, I think if the "// do some stuff" inside the loop performs any
writes to shared memory, then even the failing CAS will need to provide a
StoreLoad so that another thread spinning on this loop will observe those
writes, since we're saying that a failing CAS has volatile-write semantics.
I don't think we're considering any writes inside the loop in this example
though, so maybe it's moot.
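
Roughly the kind of loop I have in mind (names are made up for
illustration; seq here is assumed to be an AtomicLong and sharedData a
plain long field):

import java.util.concurrent.atomic.AtomicLong;

class SpinWithWrite {
    static final AtomicLong seq = new AtomicLong();
    static long sharedData; // plain (non-volatile) shared field

    static void loopWithWrite() {
        long val;
        do {
            val = seq.get();      // volatile read of the sequence
            sharedData = val + 1; // write to shared memory inside the loop
            // If even a failing CAS is treated as a volatile write, the
            // StoreLoad it implies can't be dropped on the failure path
            // while this plain store sits in front of it.
        } while (!seq.compareAndSet(val, val));
    }
}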

Thanks

On Sat, Jan 21, 2012 at 1:15 AM, Vitaly Davidovich <vitalyd at gmail.com> wrote:

> Ruslan's example, if I understood correctly, looks like this:
>
> long val;
> do
> {
>    val = seq.get();
>    // do some stuff ...
> }  while (!seq.CAS(val, val));
>
> If we take into account that a CAS provides the same barrier whether it
> fails or succeeds and let's say the CAS fails, then it's basically:
> - volatile write semantics via the CAS (even though it failed)
> - volatile load of seq again via seq.get()
>
> In between those two you typically need a StoreLoad (assuming a failing
> CAS still provides that barrier, as documented).  I guess the compiler
> could omit the StoreLoad on the failing branch since it knows that it
> didn't actually store anything and so there's no StoreLoad hazard -- is
> that what you meant perhaps? If so, that seems ok, in theory at least.
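>
> To make the barrier placement on the failing path concrete, annotating
> that loop (the barrier names are the conceptual JSR-133-cookbook ones,
> not literally emitted instructions):
>
>     val = seq.get();           // volatile load: LoadLoad|LoadStore after
>     // do some stuff ...
>     seq.CAS(val, val); // fails -- still volatile-write semantics:
>                        //   StoreStore|LoadStore before, StoreLoad after
>     val = seq.get();           // next iteration's volatile load
>
> It's that StoreLoad between the (failed) CAS and the reload of seq that
> the failing branch could in principle omit.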
>
> On Sat, Jan 21, 2012 at 12:38 AM, Boehm, Hans <hans.boehm at hp.com> wrote:
>
>>  I’m not sure.  Is there any legitimate Java code that could tell the
>> difference?  An x86 load (MOV) can only be reordered with prior stores.
>> But I don’t immediately see how an observer thread could tell whether
>>
>> x = …
>>
>> r1 = z.getAndAdd(0);
>>
>> is reordered.  This would require more thought.  But I agree that it’s
>> not too likely that any compiler would try to specially optimize
>> getAndAdd(0).
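>>
>> A store-buffering-style litmus sketch of the question (x and z are
>> assumed names here; x is a plain int field, z an AtomicInteger):
>>
>> import java.util.concurrent.atomic.AtomicInteger;
>>
>> class GetAndAddLitmus {
>>     static int x;                                    // plain field
>>     static final AtomicInteger z = new AtomicInteger();
>>
>>     static int thread1() {
>>         x = 1;
>>         return z.getAndAdd(0);   // r1
>>     }
>>
>>     static int thread2() {
>>         z.set(1);                // volatile store (fenced on x86)
>>         return x;                // r2: plain load
>>     }
>> }
>>
>> At the hardware level, a locked RMW for getAndAdd(0) keeps thread1's
>> store to x from being delayed past its load of z, so r1 == 0 && r2 == 0
>> shouldn't happen; with a plain MOV load in its place, x86 could reorder
>> the prior store past the load and allow it.  Whether the JMM actually
>> forbids that outcome for code like this (i.e. whether this counts as
>> legitimate Java code that could tell the difference) is the part that
>> would require more thought.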
>>
>> Hans
>>
>> *From:* Vitaly Davidovich [mailto:vitalyd at gmail.com]
>> *Sent:* Friday, January 20, 2012 7:46 PM
>> *To:* Boehm, Hans
>> *Cc:* concurrency-interest at cs.oswego.edu; Ruslan Cheremin; Raph Frank;
>> dholmes at ieee.org
>>
>> *Subject:* Re: [concurrency-interest] Suggestion: .hardGet() for
>> atomic variables
>>
>> A MOV + compiler barrier wouldn't be the right transformation on x86,
>> because that wouldn't be the same thing as a volatile write (i.e. a
>> StoreLoad fence); it would be analogous to just a StoreStore.  I think a
>> cas(x, x) instead of getAndAdd(0) would work and makes sense, although I
>> wonder whether compilers care about such special cases - the CAS will
>> already possibly incur a perf penalty, so the extra add instruction is
>> probably insignificant to optimize.
>>
>> Vitaly
>>
>> Sent from my phone
>>
>> On Jan 20, 2012 8:37 PM, "Boehm, Hans" <hans.boehm at hp.com> wrote:
>>
>> > From: Raph Frank [mailto:raphfrk at gmail.com]
>> >
>> > Thanks for the info.
>> >
>> > On Fri, Jan 20, 2012 at 11:10 AM, Ruslan Cheremin <cheremin at gmail.com>
>> > wrote:
>> > > long seq = sequence.get();
>> > > ...some reading...
>> > > if( !sequence.CAS(seq, seq) ){
>> > >   //...approach failed -> retry
>> > > }
>> > >
>> > > So, from my point of view, if the CAS failed -- we shouldn't actually
>> > > care about its ordering semantics (although it was interesting to know
>> > > -- thanks, David -- that the ordering does not depend on success/fail).
>> > > If the CAS succeeded -- it provides the required guarantee anyway. Am I
>> > > wrong somewhere here?
>> >
>> > Ahh right, that is better than
>> >
>> > if (sequence.getAndAdd(0) != seq) {
>> >   <retry>
>> > }
>> >
>> > Anyway, thanks all for the info.
>>
>> Thanks for the corrections and clarifications.  It does look like we
>> should essentially view CAS as always performing a volatile store, possibly
>> of the original value.
>>
>> It seems to me that the trade-off between CAS and getAndAdd here is
>> highly implementation dependent.  Clearly if getAndAdd(0) is implemented in
>> terms of CAS(x,x), CAS is faster.  I suspect either could in theory be
>> optimized to a plain old MOV + compiler constraints on x86.
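>>
>> If getAndAdd is implemented in terms of CAS rather than a dedicated
>> fetch-and-add instruction, it has roughly this shape (a sketch, not the
>> actual library source):
>>
>>     public long getAndAdd(long delta) {
>>         while (true) {
>>             long current = get();             // volatile read
>>             long next = current + delta;
>>             if (compareAndSet(current, next)) // CAS(x, x) when delta == 0
>>                 return current;
>>         }
>>     }
>>
>> so getAndAdd(0) done that way is a volatile get() plus at least one
>> CAS(x, x), which is the sense in which the bare CAS is faster.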
>>
>> Hans
>>
>>
>>
>>
>> _______________________________________________
>> Concurrency-interest mailing list
>> Concurrency-interest at cs.oswego.edu
>> http://cs.oswego.edu/mailman/listinfo/concurrency-interest
>>
>
>
>
> --
> Vitaly
> 617-548-7007 (mobile)
>



-- 
Vitaly
617-548-7007 (mobile)

