[concurrency-interest] Enforcing total sync order on modern hardware

Marko Topolnik marko at hazelcast.com
Tue Mar 17 02:31:09 EDT 2015


There is another concern that may be worth reconsidering. Given the
lack of a total sync order when using only memory barriers, is the JSR-133
Cookbook wrong or outdated in this respect? It does not address
synchronization order at all, only the visibility of inter-thread actions.

If a JVM just followed the rules put forth by the Cookbook, would an
invalid execution as outlined in my diagram actually be possible?
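
To make the question concrete, here is a minimal Java sketch of the classic
IRIW (independent reads of independent writes) litmus test. It is not the
diagram mentioned above, only an assumed stand-in for the kind of execution
a total sync order rules out. The class name IriwLitmus and the method
countForbidden are made up for illustration. Two threads each write one
volatile, and two readers read both volatiles in opposite order. Because the
JMM requires a single synchronization order over all volatile actions, the
readers must never disagree about which write came first.

```java
import java.util.concurrent.CountDownLatch;

public class IriwLitmus {
    static volatile int x, y;

    // Runs `trials` rounds and returns how many rounds showed the outcome
    // that a total synchronization order forbids:
    // reader A sees x=1 then y=0, while reader B sees y=1 then x=0.
    static int countForbidden(int trials) {
        int forbidden = 0;
        for (int i = 0; i < trials; i++) {
            x = 0;
            y = 0;
            final int[] r = new int[4];
            final CountDownLatch start = new CountDownLatch(1);
            Thread[] threads = {
                new Thread(() -> { await(start); x = 1; }),
                new Thread(() -> { await(start); y = 1; }),
                new Thread(() -> { await(start); r[0] = x; r[1] = y; }),
                new Thread(() -> { await(start); r[2] = y; r[3] = x; }),
            };
            for (Thread t : threads) t.start();
            start.countDown();
            for (Thread t : threads) join(t);
            // join() gives happens-before, so reading r here is safe.
            if (r[0] == 1 && r[1] == 0 && r[2] == 1 && r[3] == 0) forbidden++;
        }
        return forbidden;
    }

    static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    static void join(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("forbidden outcomes: " + countForbidden(2000));
    }
}
```

A conforming JVM should never report a forbidden outcome here. A clean run
proves nothing by itself, though: thread start-up skew makes the interesting
interleavings rare, and the x86 memory model already rules this outcome out
in hardware. The question is whether barrier placement alone would suffice
on hardware that gives no such guarantee.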

On Mon, Mar 16, 2015 at 9:22 PM, Marko Topolnik <marko at hazelcast.com> wrote:

> I was wondering how a JVM implementation would be able to enforce global
> sync order across all CPUs in a performant way. So, given the total
> ordering on lock instructions, I would assume that the implementation of
> any given synchronizing action would have to involve a lock instruction at
> some point.
>
> On Mon, Mar 16, 2015 at 9:02 PM, Vitaly Davidovich <vitalyd at gmail.com>
> wrote:
>
>> By "total sync order at the CPU level" do you mean sync order of that cpu
>> itself or some global order across all CPUs? The total order of lock
>> instructions is across all CPUs, whereas mfence, AFAIK, orders only the
>> local CPU's memory operations.  Sorry, maybe I'm being dense today, but I
>> still don't get why knowing that lock instructions have total order somehow
>> answers your question.  Were you simply asking whether there are
>> instructions available to ensure memory operations are done (or appear to
>> be) in program (minus compiler code motion) order on a per-cpu basis?
>>
>> On Mon, Mar 16, 2015 at 3:46 PM, Marko Topolnik <marko at hazelcast.com>
>> wrote:
>>
>>> What is important is that there be _some_ way of guaranteeing total sync
>>> order at the CPU level. It is less important whether this is achieved by
>>> mfence or lock instruction.
>>>
>>> -Marko
>>>
>>> On Mon, Mar 16, 2015 at 8:40 PM, Vitaly Davidovich <vitalyd at gmail.com>
>>> wrote:
>>>
>>>> Why were you concerned with lock instructions specifically? At one
>>>> point in the past, volatile writes were done using mfence, IIRC.
>>>>
>>>> sent from my phone
>>>> On Mar 16, 2015 3:28 PM, "Marko Topolnik" <marko at hazelcast.com> wrote:
>>>>
>>>>> Andrew,
>>>>>
>>>>> thank you for the reference, this answers the dilemma in full. I
>>>>> didn't know this guarantee existed on x86.
>>>>>
>>>>> ---
>>>>> Marko
>>>>>
>>>>> On Mon, Mar 16, 2015 at 7:44 PM, Andrew Haley <aph at redhat.com> wrote:
>>>>>
>>>>>> On 03/16/2015 05:00 PM, Marko Topolnik wrote:
>>>>>>
>>>>>> > Given that, since Nehalem, cores communicate point-to-point over QPI
>>>>>> > and don't lock the global front-side bus, the CPU doesn't naturally
>>>>>> > offer a total ordering of all lock operations.
>>>>>>
>>>>>> Intel do actually guarantee [1]
>>>>>>
>>>>>>     Locked instructions have a total order.
>>>>>>
>>>>>> so this is a hardware problem, not a software one.  How exactly the
>>>>>> hardware people do this on a large network of processors is some of
>>>>>> the most Secret Sauce, but I can imagine some kind of combining
>>>>>> network in hardware.
>>>>>>
>>>>>> Andrew.
>>>>>>
>>>>>> [1]  Intel® 64 and IA-32 Architectures Software Developer’s Manual
>>>>>> Volume 3 (3A, 3B & 3C): System Programming Guide 8.2.2, Memory
>>>>>> Ordering in P6 and More Recent Processor Families
>>>>>>
>>>
>>
>