[concurrency-interest] DCL using Fence Intrinsics

Vitaly Davidovich vitalyd at gmail.com
Fri Mar 13 12:38:14 EDT 2015


As I mentioned a few replies earlier, I'm not advocating this.  What I was
trying to establish, purely for educational purposes, is a real example of
a compiler and/or CPU transformation that would invalidate it.
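
For reference, here is a minimal sketch of the T2 shape that Alex's replies
below call for: one fence between successive loads of b, and another between
the last load of b and the load of a. It is an illustration only. It assumes
Java 9 or later and swaps in the static VarHandle.releaseFence() and
VarHandle.acquireFence() methods for unsafe.storeFence() and
unsafe.loadFence(). The class and method names (FencedHandoff, writer,
reader, runOnce) are made up for the example.

```java
import java.lang.invoke.VarHandle;

// Hypothetical demo class; none of these names come from the thread.
final class FencedHandoff {
    // Plain (non-volatile) fields, as in Code3.
    static int a = 0;
    static int b = 0;

    // T1: write a, fence, then write b.
    static void writer() {
        a = 1;
        VarHandle.releaseFence();     // assumption: stands in for unsafe.storeFence()
        b = 1;
    }

    // T2: spin until b is 1, then read a.
    static int reader() {
        while (b != 1) {
            VarHandle.acquireFence(); // fence between successive loads of b
        }
        VarHandle.acquireFence();     // fence between the last load of b and the load of a
        return a;
    }

    // Runs T1 and T2 once and returns the value of a that T2 observed.
    static int runOnce() {
        a = 0;
        b = 0;
        int[] seen = new int[1];
        Thread t2 = new Thread(() -> seen[0] = reader());
        Thread t1 = new Thread(FencedHandoff::writer);
        t2.start();
        t1.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println("T2 observed a = " + runOnce());
    }
}
```

A passing run proves nothing about ordering guarantees; the sketch only pins
down where the two fences sit relative to the loads.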

On Fri, Mar 13, 2015 at 12:28 PM, Oleksandr Otenko <
oleksandr.otenko at oracle.com> wrote:

>  No, you are looking at it the wrong way around.
>
> Compare the effort of proving the correctness of the resulting code to the
> performance improvement.
>
> Alex
>
>
>
> On 13/03/2015 15:06, Vitaly Davidovich wrote:
>
> The saving is easy to justify: a loadFence is at best a heavy compiler
> fence and at worst both a compiler and a CPU fence.  Loop optimizers
> (generally) try to move loop-invariant code out of a loop, not into it.
>
> sent from my phone
> On Mar 13, 2015 11:03 AM, "Oleksandr Otenko" <oleksandr.otenko at oracle.com>
> wrote:
>
>>  First you need to justify the saving of loadFence. You can't assume the
>> saving is significant (first it must be predictable) and at the same time
>> assume the load of a / breaking the loop is not predictable.
>>
>> Alex
>>
>> On 13/03/2015 14:55, Vitaly Davidovich wrote:
>>
>>  So I thought it might be shady, but I can't come up with a *legitimate*
>> case where it breaks.  One possibility is the following reordering:
>>
>>  else {
>>     do {
>>        U.loadFence();
>>        // sink the 'a' read into here: it's still 0, then 'b' reads 1
>>        // and we break
>>     } while (b != 1);
>>  }
>>
>>  I can't immediately see why such a transformation would take place,
>> because for the compiler to do that it would have to prove that the loop
>> always executes exactly once (otherwise it's moving a load ahead of a
>> loadFence).  It's also turning a loop-invariant read into a variant one.  I
>> guess it could clone the code into 2 separate versions, one for looping and
>> one for not, but that seems weird and useless.  I suppose the CPU could
>> speculate somehow here, but again, it's not immediately clear to me why it
>> would speculate the read of 'a' ahead of 'b' when 'b' is read possibly many
>> times and 'a' is read just once.
>>
>>  But you're right, this "trick" isn't reliable.
>>
>>
>>
>> On Fri, Mar 13, 2015 at 10:33 AM, Oleksandr Otenko <
>> oleksandr.otenko at oracle.com> wrote:
>>
>>>  No, you have just shown that you don't need a loadFence after the loop,
>>> which is wrong.
>>>
>>> You need a loadFence between the last load of b and the load of a, to
>>> preserve the order of loading a after loading b. Then you need a loadFence
>>> between loads of b, so you keep re-loading b on each iteration.
>>>
>>> Alex
>>>
>>>
>>> On 13/03/2015 14:23, Vitaly Davidovich wrote:
>>>
>>> btw, for #3, you'd probably want to rewrite T2 as:
>>>
>>>  if (b == 1) {
>>>     U.loadFence();
>>>  } else {
>>>     do {
>>>        U.loadFence();
>>>     } while (b != 1);
>>>  }
>>>
>>>  assert(a==1);
>>>
>>>  This would avoid an additional load fence upon exiting the while loop
>>> (if the while loop was actually entered).
>>>
>>>
>>> On Fri, Mar 13, 2015 at 10:10 AM, Vitaly Davidovich <vitalyd at gmail.com>
>>> wrote:
>>>
>>>> Yeah, I read #2 as the while loop being in T1, but if it's T2, then
>>>> yes, it's fine and will work.
>>>>
>>>>  Thanks for clarifying #3 -- I meant to keep existing code as is but
>>>> stuff a loadFence into the loop, but re-reading my reply, I do see how it
>>>> can be interpreted as moving the existing one.
>>>>
>>>> On Fri, Mar 13, 2015 at 9:50 AM, Oleksandr Otenko <
>>>> oleksandr.otenko at oracle.com> wrote:
>>>>
>>>>>  On 12/03/2015 23:01, Vitaly Davidovich wrote:
>>>>>
>>>>> 1 works, and I can't see why you even need the loadFence.
>>>>>
>>>>> 2 and 3 won't (always) work.  In 2, the compiler can move a=1 after the
>>>>> loop.  For 3, if you put a loadFence inside the while loop it will work.
>>>>>
>>>>>
>>>>>  If we assume the loop in 2 was meant to be in T2, then it will work.
>>>>>
>>>>> For 3, you need to have loadFence inside the loop *and* after the
>>>>> loop.
>>>>>
>>>>> Alex
>>>>>
>>>>>
>>>>>  sent from my phone
>>>>> On Mar 12, 2015 6:43 PM, "vikas" <vikas.vksingh at gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>>   I am trying to understand the fence intrinsics API.
>>>>>>   Preshing has shown how to write DCL in C++ on his blog:
>>>>>>
>>>>>> http://preshing.com/20130930/double-checked-locking-is-fixed-in-cpp11/
>>>>>>
>>>>>>   I was trying to have a similar thing in Java (*Code1*)
>>>>>>
>>>>>>      sun.misc.Unsafe U;
>>>>>>      Singleton instance = null;
>>>>>>
>>>>>>      Singleton getInstance() {
>>>>>>          Singleton tmp = instance;
>>>>>>          U.loadFence();   // fence after the first read of instance
>>>>>>          if (tmp == null) {
>>>>>>              synchronized (Singleton.class) {
>>>>>>                  tmp = instance;
>>>>>>                  if (tmp == null) {
>>>>>>                      tmp = new Singleton();
>>>>>>                      U.storeFence();   // fence before publishing
>>>>>>                      instance = tmp;
>>>>>>                  }
>>>>>>              }
>>>>>>          }
>>>>>>          return tmp;
>>>>>>      }
>>>>>>                                     *Code1*
>>>>>>
>>>>>>    *Will the above Code1 work?*
>>>>>>
>>>>>>
>>>>>>
>>>>>> ------------------------------------------------------------------------------
>>>>>>
>>>>>>     Along similar lines, I have another question. See *Code2* below,
>>>>>>     where *a* and *b* are normal variables with initial value 0.
>>>>>>
>>>>>>        T1                               T2
>>>>>>      a = 1;                           while(unsafe.getIntVolatile(b)!=1);
>>>>>>      unsafe.putIntOrdered(b,1);       assert(a==1); // *will always pass*
>>>>>>
>>>>>>                                      *Code2*
>>>>>>
>>>>>>     Code2 works because putXXXOrdered and getXXXVolatile form a
>>>>>>     happens-before edge, i.e. the assert in thread T2 will always pass.
>>>>>>
>>>>>>
>>>>>>
>>>>>> -------------------------------------------------------------------------------
>>>>>>     But can we say the same thing for the code below (*Code3*)?
>>>>>>
>>>>>>        T1                               T2
>>>>>>      a = 1;                           while(b!=1);
>>>>>>      unsafe.storeFence();             unsafe.loadFence();
>>>>>>      b = 1;                           assert(a==1);
>>>>>>                                      *Code3*
>>>>>>
>>>>>>    *What prevents the compiler from optimizing the while loop in
>>>>>>    Code3 into an infinite loop?*
>>>>>>    So does *Code3* work? If not, is there any way we can achieve
>>>>>>    the expected behavior using fences?
>>>>>>
>>>>>>    thanks
>>>>>>    vikas
>>>>>>
>>>>>> --
>>>>>> View this message in context:
>>>>>> http://jsr166-concurrency.10961.n7.nabble.com/DCL-using-Fence-Intrinsics-tp12420.html
>>>>>> Sent from the JSR166 Concurrency mailing list archive at Nabble.com.
>>>>>> _______________________________________________
>>>>>> Concurrency-interest mailing list
>>>>>> Concurrency-interest at cs.oswego.edu
>>>>>> http://cs.oswego.edu/mailman/listinfo/concurrency-interest
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>>
>
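
Code1 from the original question can also be written out as a compilable
sketch. The assumptions are the same as for any modern port of it: Java 9 or
later, with VarHandle.acquireFence() and VarHandle.releaseFence() standing in
for U.loadFence() and U.storeFence(). The field is made static so that it
matches the Singleton.class lock, and the class name FencedSingleton is
hypothetical. The thread only goes as far as saying that Code1 "works"; this
sketch shows the fence placement and claims nothing beyond that.

```java
import java.lang.invoke.VarHandle;

// Hypothetical class name; mirrors the structure of Code1.
final class FencedSingleton {
    // Deliberately a plain field, not volatile, as in Code1.
    private static FencedSingleton instance = null;

    private FencedSingleton() {}

    static FencedSingleton getInstance() {
        FencedSingleton tmp = instance;
        VarHandle.acquireFence();             // assumption: stands in for U.loadFence()
        if (tmp == null) {
            synchronized (FencedSingleton.class) {
                tmp = instance;
                if (tmp == null) {
                    tmp = new FencedSingleton();
                    VarHandle.releaseFence(); // assumption: stands in for U.storeFence()
                    instance = tmp;
                }
            }
        }
        return tmp;
    }

    public static void main(String[] args) {
        // Two calls should hand back the same object.
        System.out.println(getInstance() == getInstance());
    }
}
```

Declaring the field volatile remains the conventional way to write this
pattern in Java; the fenced form is shown only because it is what the
question asks about.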

