[concurrency-interest] RFR: 8065804: JEP 171: Clarifications/corrections for fence intrinsics

David Holmes davidcholmes at aapt.net.au
Mon Dec 8 03:46:50 EST 2014


Martin,

The paper you cite is about ARM and Power architectures - why do you think the lack of mention of x86/sparc implies those architectures are multiple-copy-atomic?

David

> -----Original Message-----
> From: Martin Buchholz [mailto:martinrb at google.com]
> Sent: Monday, 8 December 2014 6:42 PM
> To: David Holmes
> Cc: David Holmes; Vladimir Kozlov; core-libs-dev; concurrency-interest
> Subject: Re: [concurrency-interest] RFR: 8065804: JEP 171:
> Clarifications/corrections for fence intrinsics
> 
> 
> On Sun, Dec 7, 2014 at 2:58 PM, David Holmes 
> <david.holmes at oracle.com> wrote:
> 
> >> I believe the comment _does_ reflect hotspot's current implementation
> >> (entirely from exploring the sources).
> >> I believe it's correct to say "all of the platforms are
> >> multiple-copy-atomic except PPC".
> 
> ... current hotspot sources don't contain ARM support.
> 
> > Here is the definition of multi-copy atomicity from the ARM architecture
> > manual:
> >
> > "In a multiprocessing system, writes to a memory location are multi-copy
> > atomic if the following conditions are both true:
> > • All writes to the same location are serialized, meaning they are observed
> > in the same order by all observers, although some observers might not
> > observe all of the writes.
> > • A read of a location does not return the value of a write until all
> > observers observe that write."
> 
> The hotspot sources give
> 
> """
> // To assure the IRIW property on processors that are not multiple copy
> // atomic, sync instructions must be issued between volatile reads to
> // assure their ordering, instead of after volatile stores.
> // (See "A Tutorial Introduction to the ARM and POWER Relaxed Memory Models"
> // by Luc Maranget, Susmit Sarkar and Peter Sewell, INRIA/Cambridge)
> #ifdef CPU_NOT_MULTIPLE_COPY_ATOMIC
> const bool support_IRIW_for_not_multiple_copy_atomic_cpu = true;
> """
> 
> and the referenced paper gives
> 
> """
> on POWER and ARM, two threads can observe writes to different
> locations in different orders, even in
> the absence of any thread-local reordering. In other words, the
> architectures are not multiple-copy atomic [Col92].
> """
> 
> which strongly suggests that x86 and sparc are OK.
> 
> > The first condition is met by Total-Store-Order (TSO) systems like x86 and
> > sparc; and not by relaxed-memory-order (RMO) systems like ARM and PPC.
> > However the second condition is not met simply by having TSO. If the local
> > processor can see a write from the local store buffer prior to it being
> > visible to other processors, then we do not have multi-copy atomicity and I
> > believe that is true for x86 and sparc. Hence none of our supported
> > platforms are multi-copy-atomic as far as I can see.
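The store-buffer argument in the paragraph above can be sketched with a toy Java model. This is only an illustration under stated assumptions, not a description of any real processor or of hotspot: the names StoreBufferModel, Cpu, store, load, drain and demo are hypothetical, and the model assumes one shared memory, one FIFO store buffer per simulated CPU, and store-to-load forwarding from the local buffer.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Toy model (hypothetical): shared memory plus a FIFO store buffer per simulated CPU.
class StoreBufferModel {
    private final Map<String, Integer> memory = new HashMap<>();

    // One pending store: a location name and the value written to it.
    private static final class Entry {
        final String loc;
        final int value;
        Entry(String loc, int value) { this.loc = loc; this.value = value; }
    }

    final class Cpu {
        // Pending stores, oldest first.
        private final Deque<Entry> buffer = new ArrayDeque<>();

        // A store goes into the local buffer, not straight to shared memory.
        void store(String loc, int value) {
            buffer.addLast(new Entry(loc, value));
        }

        // A load is satisfied by the newest local buffered store to loc, if any
        // (store-to-load forwarding); otherwise it reads shared memory.
        int load(String loc) {
            Iterator<Entry> newestFirst = buffer.descendingIterator();
            while (newestFirst.hasNext()) {
                Entry e = newestFirst.next();
                if (e.loc.equals(loc)) return e.value;
            }
            return memory.getOrDefault(loc, 0);
        }

        // Draining makes the buffered stores visible to every CPU, in FIFO order.
        void drain() {
            while (!buffer.isEmpty()) {
                Entry e = buffer.removeFirst();
                memory.put(e.loc, e.value);
            }
        }
    }

    Cpu newCpu() { return new Cpu(); }

    // Returns {p0's early read, p1's early read, p1's read after the drain}.
    static int[] demo() {
        StoreBufferModel model = new StoreBufferModel();
        Cpu p0 = model.newCpu();
        Cpu p1 = model.newCpu();
        p0.store("x", 1);
        int p0Early = p0.load("x"); // forwarded from p0's own buffer
        int p1Early = p1.load("x"); // p1 still reads shared memory
        p0.drain();
        int p1Late = p1.load("x");  // the write is now globally visible
        return new int[] {p0Early, p1Early, p1Late};
    }
}
```

In this model the writing CPU reads back its own store while another CPU still reads the old value, which is the situation the second condition of the quoted definition rules out. Whether real x86 or sparc hardware should be classified this way is exactly the point under discussion; the model does not settle it.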
> >
> >> I believe hotspot must implement IRIW correctly to fulfil the promise
> >> of sequential consistency for standard Java, so on ppc volatile reads
> >> get a full fence, which leads us back to the ppc pointer chasing
> >> performance problem that started all of this.
> >
> >
> > Note that nothing in the JSR-133 cookbook allows for IRIW, even on x86 and
> > sparc. The key feature needed for IRIW is a load barrier that forces global
> > memory synchronization to ensure that all processors see writes at the same
> > time. I'm not even sure we can force that on x86 and sparc! Such a load
> > barrier negates the need for some store barriers as defined in the cookbook.
> >
> > My understanding, which could be wrong, is that the JMM implies
> > linearizability of volatile accesses, which in turn provides the IRIW
> > property. It is also my understanding that linearizability is a necessary
> > property for current proof systems to be applicable. However absence of
> > proof is not proof of absence, and it doesn't follow that code that doesn't
> > rely on IRIW is incorrect if IRIW is not ensured on a system. As has been
> > stated many times now, in the literature no practical lock-free algorithm
> > seems to rely on IRIW. So I still hope that IRIW can somehow be removed
> > because implementing it will impact everything related to the JMM in
> > hotspot.
> 
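For reference, the IRIW (independent reads of independent writes) shape discussed above can be written as a small Java litmus test. This is a minimal sketch, not code from hotspot or from the cited paper: the class name IriwLitmus and its methods are hypothetical. Two writer threads store to separate volatile fields and two reader threads read them in opposite orders. The outcome in which the readers disagree about the order of the two writes is the one that sequentially consistent volatile accesses rule out.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical litmus test for the IRIW shape using Java volatile fields.
class IriwLitmus {
    static volatile int x, y;

    // Runs one trial; returns true if the two readers saw the writes in opposite orders.
    static boolean trial() {
        x = 0;
        y = 0;
        final int[] r = new int[4];
        final CountDownLatch start = new CountDownLatch(1);
        Thread[] threads = {
            new Thread(() -> { awaitStart(start); x = 1; }),                // writer 1
            new Thread(() -> { awaitStart(start); y = 1; }),                // writer 2
            new Thread(() -> { awaitStart(start); r[0] = x; r[1] = y; }),   // reader A: x then y
            new Thread(() -> { awaitStart(start); r[2] = y; r[3] = x; }),   // reader B: y then x
        };
        for (Thread t : threads) t.start();
        start.countDown();
        // join() makes each reader's writes to r visible to this thread.
        for (Thread t : threads) joinQuietly(t);
        // Reader A saw x written but not y; reader B saw y written but not x.
        return r[0] == 1 && r[1] == 0 && r[2] == 1 && r[3] == 0;
    }

    static void awaitStart(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    static void joinQuietly(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Counts how many trials produced the outcome forbidden for volatile fields.
    static int countForbidden(int trials) {
        int forbidden = 0;
        for (int i = 0; i < trials; i++) {
            if (trial()) forbidden++;
        }
        return forbidden;
    }

    public static void main(String[] args) {
        System.out.println("forbidden outcomes: " + countForbidden(2000));
    }
}
```

A run that reports zero forbidden outcomes proves nothing about the platform, since a harness this coarse rarely provokes the interesting interleavings; a nonzero count on volatile fields would indicate a JMM violation. Dropping volatile from x and y turns the same shape into one where the outcome is permitted.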



