[concurrency-interest] DCL using Fence Intrinsics

Vitaly Davidovich vitalyd at gmail.com
Fri Mar 13 17:28:44 EDT 2015

As I mentioned, Alpha is the only processor I know of that allows dependent
loads to be reordered.  But getting back to Java, it's inconceivable to
support such a CPU, since every load of a reference to an instance of a class
with final fields would entail CPU fences on that architecture.  So
practically speaking, we can eliminate such a reordering possibility at the
hardware level (modulo CPU errata or whatever).  Now we have the compiler
left, and I can't immediately see a transformation that would invalidate that
code -- it loads the field once into a local, and then only deals with the
local (except for the final assignment).  On the read path, given it's loaded
into a local, the compiler cannot introduce re-reads of that field, so
there's no chance of returning null from the method.  And I can't figure out
what type of compiler transformation would allow observing uninitialized
object state.
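
To make the read-path argument concrete, here is a minimal sketch of the
pattern being discussed. It is an illustration under assumptions, not code
from this thread: the class name LocalReadSingleton and its value field are
made up, and VarHandle.releaseFence() (Java 9+) stands in for
Unsafe.storeFence(), which is documented with the same ordering guarantee.

```java
import java.lang.invoke.VarHandle;

// Hypothetical class used only for illustration.
final class LocalReadSingleton {
    // Deliberately a plain (non-volatile) field, as in the thread.
    private static LocalReadSingleton instance;

    // A final field, so the final-field discussion applies to this class.
    private final int value;

    private LocalReadSingleton() {
        this.value = 42;
    }

    static LocalReadSingleton getInstance() {
        // Single read of the field; everything below uses the local.
        LocalReadSingleton tmp = instance;
        if (tmp == null) {
            synchronized (LocalReadSingleton.class) {
                tmp = instance;
                if (tmp == null) {
                    tmp = new LocalReadSingleton();
                    // Assumption: stand-in for Unsafe.storeFence(). Keeps the
                    // constructor's stores ahead of the publishing store.
                    VarHandle.releaseFence();
                    instance = tmp;
                }
            }
        }
        // The local is returned, so the field is never re-read here and a
        // non-null check above cannot be followed by a null result.
        return tmp;
    }

    int value() {
        return value;
    }
}
```

Note that there is no fence on the fast path: once the instance is published,
getInstance() costs one plain load and one null check.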

The reason I mentioned the possibility of removing loadFence is that it's a
performance drain on the fast path (once the singleton is established, we
want the reads to be quick).  If you were really keen on supporting
Alpha-like weak memory models, you'd want to introduce a read barrier
specifically for data dependence, which would be a no-op for everyone else
(this is akin to the Linux kernel's smp_read_barrier_depends, which AFAIK
only does anything on Alpha).

So yes, theoretically you always want to put a loadFence there (assuming
that we have only the existing fence intrinsics), but I don't think it's
practically (now or in the future) necessary.
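
For comparison, a sketch of the "theoretically correct" variant with a fence
on the read path might look like the following. Code1 is not reproduced in
this message, so the placement of the fence is my guess at its shape; the
class name FencedReadSingleton is hypothetical, and VarHandle.acquireFence()
and VarHandle.releaseFence() (Java 9+) stand in for Unsafe.loadFence() and
Unsafe.storeFence().

```java
import java.lang.invoke.VarHandle;

// Hypothetical class used only for illustration.
final class FencedReadSingleton {
    private static FencedReadSingleton instance; // plain, non-volatile field

    private final int value;

    private FencedReadSingleton() {
        this.value = 7;
    }

    static FencedReadSingleton getInstance() {
        FencedReadSingleton tmp = instance;
        // Assumption: stand-in for Unsafe.loadFence(). Orders the load of
        // 'instance' before any later loads of the object's fields, which
        // only matters on hardware that reorders dependent loads.
        VarHandle.acquireFence();
        if (tmp == null) {
            synchronized (FencedReadSingleton.class) {
                tmp = instance;
                if (tmp == null) {
                    tmp = new FencedReadSingleton();
                    // Assumption: stand-in for Unsafe.storeFence().
                    VarHandle.releaseFence();
                    instance = tmp;
                }
            }
        }
        return tmp;
    }

    int value() {
        return value;
    }
}
```

The only difference from the fence-free read path is the acquireFence() call,
which runs on every invocation, including the fast path.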

On Fri, Mar 13, 2015 at 4:22 PM, Oleksandr Otenko <
oleksandr.otenko at oracle.com> wrote:

>  Wasn't there a recent thread with a reference to a platform which can
> load even dependent data out of order?
> http://en.wikipedia.org/wiki/Memory_ordering
>    - Dependent loads can be reordered (this is unique for Alpha). If the
>    processor fetches a pointer to some data after this reordering, it might
>    not fetch the data itself but use stale data which it has already cached
>    and not yet invalidated. Allowing this relaxation makes cache hardware
>    simpler and faster but leads to the requirement of memory barriers for
>    readers and writers.[5]
>    <http://en.wikipedia.org/wiki/Memory_ordering#cite_note-5>
> Alex
> On 13/03/2015 19:31, Vitaly Davidovich wrote:
> I mentioned that in my previous reply, but I'm not aware of any JVM
> running on platforms that allow such reordering.  I also highly doubt that
> the JVM would ever be ported to such a platform, as the bug tail would be
> very long, along with the JVM having to insert LoadLoad barriers in lots of
> places where refs to classes with at least one final field are read.  If
> you have a concrete/real/practical example of where this reordering can
> take place, I'd love to know about it.
> On Fri, Mar 13, 2015 at 2:38 PM, Oleksandr Otenko <
> oleksandr.otenko at oracle.com> wrote:
>> Vitaly is wrong. The loadFence in Code1 is needed. Without it, it is
>> possible to access the uninitialized fields of the singleton. (the loads
>> may occur before the load of instance)
>> Alex
>> On 13/03/2015 17:27, vikas wrote:
>>> Thanks Vitaly,
>>> and sorry for the improper formatting.
>>> On the second note, I was wondering why I wouldn't need loadFence in
>>> *Code1* (the DCL example).
>>> The JMM cookbook suggests inserting a LoadLoad barrier before final field
>>> access (on processors where data dependency is not respected); my DCL
>>> example added loadFence only because of this.
>>> Also, the C++ example does need both fences:
>>> http://preshing.com/20130930/double-checked-locking-is-fixed-in-cpp11/
>>> One reason I can think of for why it may work in Java without loadFence
>>> is that benign data-race-like constructs are kind of allowed in Java,
>>> whereas they are not allowed in C++.
>>> Also, does *Code4* below work for the DCL Singleton pattern?
>>> *Code4*
>>>     sun.misc.Unsafe U;
>>>     Singleton instance = null;
>>>     Singleton getInstance() {
>>>         Singleton tmp = instance;  // no fence while reading
>>>         if (tmp == null) {
>>>             synchronized (Singleton.class) {
>>>                 tmp = instance;
>>>                 if (tmp == null) {
>>>                     tmp = new Singleton();
>>>                     U.storeFence();  // only need StoreFence
>>>                     instance = tmp;
>>>                 }
>>>             }
>>>         }
>>>         return tmp;
>>>     }
>>> --
>>> View this message in context:
>>> http://jsr166-concurrency.10961.n7.nabble.com/DCL-using-Fence-Intrinsics-tp12420p12435.html
>>> Sent from the JSR166 Concurrency mailing list archive at Nabble.com.
>>> _______________________________________________
>>> Concurrency-interest mailing list
>>> Concurrency-interest at cs.oswego.edu
>>> http://cs.oswego.edu/mailman/listinfo/concurrency-interest