[concurrency-interest] Enforcing total sync order on modern hardware

Oleksandr Otenko oleksandr.otenko at oracle.com
Tue Mar 17 19:29:31 EDT 2015

> IRIW is ruled out by both AMD and Intel.  Not a problem.

Does "ruled out" here mean "the IRIW results forbidden by the JMM will 
not be observed", or does it mean "forbidding those results is not 
supported, because it is not considered a problem"?
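
For concreteness, here is a minimal sketch of the IRIW shape I am asking about, written as a Java litmus test. The class and method names (IriwLitmus, countForbidden, awaitStart) are made up for illustration and are not from this thread. With both variables volatile, the JMM forbids the outcome where the two readers see the two writes in opposite orders; a run that never shows it is consistent with that, but of course does not prove anything about the hardware specification.

```java
import java.util.concurrent.CountDownLatch;

public class IriwLitmus {
    // Both locations are volatile, so every access is a synchronization action.
    static volatile int x, y;

    // Hypothetical helper: block until all four threads are released together.
    static void awaitStart(CountDownLatch start) {
        try {
            start.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Runs one IRIW trial; returns true if the readers disagree on the write order.
    static boolean forbiddenOnce() throws InterruptedException {
        x = 0;
        y = 0;
        final int[] r = new int[4];
        final CountDownLatch start = new CountDownLatch(1);
        Thread[] threads = {
            new Thread(() -> { awaitStart(start); x = 1; }),                // writer 1
            new Thread(() -> { awaitStart(start); y = 1; }),                // writer 2
            new Thread(() -> { awaitStart(start); r[0] = x; r[1] = y; }),   // reader 1: x then y
            new Thread(() -> { awaitStart(start); r[2] = y; r[3] = x; }),   // reader 2: y then x
        };
        for (Thread t : threads) t.start();
        start.countDown();
        for (Thread t : threads) t.join();  // join makes the r[] writes visible here
        // Reader 1 saw x=1 before y=1, reader 2 saw y=1 before x=1.
        return r[0] == 1 && r[1] == 0 && r[2] == 1 && r[3] == 0;
    }

    // Counts how many of the given trials show the forbidden outcome.
    static int countForbidden(int trials) throws InterruptedException {
        int count = 0;
        for (int i = 0; i < trials; i++) {
            if (forbiddenOnce()) count++;
        }
        return count;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("forbidden IRIW outcomes: " + countForbidden(1000));
    }
}
```

Starting fresh threads per trial keeps the sketch short but makes the race window small; a serious test would reuse threads and spin, as dedicated litmus-test harnesses do.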


On 17/03/2015 21:42, Stephan Diestelhorst wrote:
> On Tuesday, 17 March 2015, 14:39:32, Marko Topolnik wrote:
>> On Tue, Mar 17, 2015 at 11:46 AM, Aleksey Shipilev <
>> aleksey.shipilev at oracle.com> wrote:
>>> If "sharedVar" is also volatile (sequentially consistent), then Wv1
>>> would complete before reading Rwt6.
>> OK, but this wouldn't necessarily happen on a unique global timescale: the
>> "writing" thread would have the ordering Wv1 -> Rwt6; there would be an
>> _independent_ total order of actions on currentTime, and a third, again
>> independent order of actions by the "reading" thread. Due to the
>> distributed nature of coherence the fact that, on one core, Wv1 precedes
>> Rwt6 does not enforce Rrt6 -> Rv1 on another core. It is not obvious that
>> there is transitivity between these individual orders.
>> Particularly note this statement in
>> http://www.cl.cam.ac.uk/~pes20/weakmemory/cacm.pdf:
>> "[the CPU vendor specifications] admit the IRIW behaviour above but, under
>> reasonable assumptions on the strongest x86 memory barrier, MFENCE, adding
>> MFENCEs would not suffice to recover sequential consistency (instead, one
>> would have to make liberal use of x86 LOCK’d instructions). Here the
>> specifications seem to be much looser than the behaviour of implemented
>> processors: to the best of our knowledge, and following some testing, IRIW
>> is not observable in practice, even without MFENCEs. It appears that some
>> JVM implementations depend on this fact, and would not be correct if one
>> assumed only the IWP/AMD3.14/x86-CC architecture."
>> Also, for the newer revision of Intel's specification, “P6. In a
>> multiprocessor system, stores to the same location have a total order” has
>> been replaced by: “Any two stores are seen in a consistent order by
>> processors other than those performing the stores.”
> IRIW is ruled out by both AMD and Intel.  Not a problem.
> Weak architectures, such as ARM, talk about multi-copy atomicity, which
> is what you are after.  On these architectures, fences (DMBs in the ARM
> case) do restore global order through an elaborate construction of who
> saw what etc.
> Nothing to see here, I presume (unless you were talking about a real
> clock, such as the TSC on x86.  But that has interesting semantics and I
> will not feed the trolls.  Some naive notes:
> http://rp-www.cs.usyd.edu.au/~gramoli/events/wttm4/papers/diestelhorst.pdf )
> Thanks,
>    Stephan
