[concurrency-interest] Enforcing total sync order on modern hardware

Vitaly Davidovich vitalyd at gmail.com
Mon Mar 16 16:02:52 EDT 2015

By "total sync order at the CPU level" do you mean the sync order of that CPU
itself or some global order across all CPUs? The total order of lock
instructions is across all CPUs, whereas mfence, AFAIK, orders only the
local CPU's memory operations.  Sorry, maybe I'm being dense today, but I
still don't get why knowing that lock instructions have a total order somehow
answers your question.  Were you simply asking whether there are
instructions available to ensure memory operations are done (or appear to
be done) in program order (compiler code motion aside) on a per-CPU basis?
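
To make the question concrete, here is a minimal Java sketch of the classic
store-buffering litmus test. The class and method names (StoreBufferLitmus,
countForbidden) are made up for illustration. With x and y declared volatile,
the Java Memory Model places all four accesses in one total synchronization
order, so the outcome r1 == 0 && r2 == 0 is forbidden. On x86 that means the
JIT has to put something after each volatile store (an mfence or a locked
instruction) to stop the store from being reordered with the following load.

```java
import java.util.concurrent.CyclicBarrier;

public class StoreBufferLitmus {
    // Volatile accesses are synchronization actions, so the JMM requires
    // them to appear in a single total order consistent with program order.
    static volatile int x, y;

    // Plain fields: written by the worker threads, read only after join(),
    // which provides the needed happens-before edge.
    static int r1, r2;

    // Runs the test `iterations` times and counts how often both threads
    // read 0, which the JMM forbids when x and y are volatile.
    static int countForbidden(int iterations) throws Exception {
        int forbidden = 0;
        for (int i = 0; i < iterations; i++) {
            x = 0;
            y = 0;
            CyclicBarrier start = new CyclicBarrier(2);
            Thread a = new Thread(() -> { awaitQuietly(start); x = 1; r1 = y; });
            Thread b = new Thread(() -> { awaitQuietly(start); y = 1; r2 = x; });
            a.start();
            b.start();
            a.join();
            b.join();
            if (r1 == 0 && r2 == 0) {
                forbidden++;
            }
        }
        return forbidden;
    }

    // Lines both threads up so their stores and loads are likely to overlap.
    private static void awaitQuietly(CyclicBarrier barrier) {
        try {
            barrier.await();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("forbidden outcomes: " + countForbidden(10_000));
    }
}
```

If the volatile modifier is dropped from x and y, the both-zero outcome becomes
legal and may show up on x86, since stores can sit in the store buffer while
the later load completes. A run that never shows it proves nothing, of course.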

On Mon, Mar 16, 2015 at 3:46 PM, Marko Topolnik <marko at hazelcast.com> wrote:

> What is important is that there be _some_ way of guaranteeing total sync
> order at the CPU level. It is less important whether this is achieved by
> mfence or by a lock instruction.
> -Marko
> On Mon, Mar 16, 2015 at 8:40 PM, Vitaly Davidovich <vitalyd at gmail.com>
> wrote:
>> Why were you concerned with lock instructions specifically? At one point
>> in the past, volatile writes were done using mfence, IIRC.
>> On Mar 16, 2015 3:28 PM, "Marko Topolnik" <marko at hazelcast.com> wrote:
>>> Andrew,
>>> thank you for the reference, this answers the dilemma in full. I didn't
>>> know this guarantee existed on x86.
>>> ---
>>> Marko
>>> On Mon, Mar 16, 2015 at 7:44 PM, Andrew Haley <aph at redhat.com> wrote:
>>>> On 03/16/2015 05:00 PM, Marko Topolnik wrote:
>>>> > Given that, since Nehalem, cores communicate point-to-point over QPI
>>>> > and don't lock the global front-side bus, the CPU doesn't naturally
>>>> > offer a total ordering of all lock operations.
>>>> Intel do actually guarantee [1]
>>>>     Locked instructions have a total order.
>>>> so this is a hardware problem, not a software one.  How exactly the
>>>> hardware people do this on a large network of processors is some of
>>>> the most Secret Sauce, but I can imagine some kind of combining
>>>> network in hardware.
>>>> Andrew.
>>>> [1]  Intel® 64 and IA-32 Architectures Software Developer’s Manual,
>>>> Volume 3 (3A, 3B & 3C): System Programming Guide, Section 8.2.2,
>>>> "Memory Ordering in P6 and More Recent Processor Families"