[concurrency-interest] Enforcing total sync order on modern hardware

Oleksandr Otenko oleksandr.otenko at oracle.com
Tue Mar 24 07:42:10 EDT 2015

```17.4.3
> A set of actions is /sequentially consistent/ if all actions occur in
> a total order ...(plus details of how the total order relates to
> program order etc)

It looks like your objection goes against this. IRIW works because it
must have the writes observed by all threads in the same order - due to
the writes and reads forming a total order, even if they are independent
- which is the cornerstone of /sequential consistency/.

I don't know how you measure that the example is minimal, because in
some sense it is also maximal - the minimal case being Dekker's:

T1: x=1;
r0=x; // may fuse with x=1 - then you get the canonical form of the example
r1=y;

T2: y=1;
r2=y; // may fuse with y=1
r3=x;

Alex

On 23/03/2015 17:42, Marko Topolnik wrote:
>
>
> On Mar 23, 2015 6:01 PM, "Oleksandr Otenko"
> <oleksandr.otenko at oracle.com> wrote:
> >
> > IRIW results apply to any thread doing the reading. The existence of
> the fourth thread only generalizes the result.
>
> This is incorrect: the essential property of IRIW is that it
> constructs a result for which no sequentially consistent explanation
> exists. It is the minimal example to reproduce the issue of interest,
> therefore none of its parts is optional.
>
> >
> > It seems this branch of the conversation is pointless.
> >
> > Alex
> >
> >
> > On 23/03/2015 15:56, Marko Topolnik wrote:
> >>
> >> So your analogy to IRIW is established by introducing a whole new
> reading thread. Such an analogy fails to capture the essence of my
> scenario: I am interested precisely in the case where the
> "timer" thread's writes. The goal is to analyze the tension between
> the desire to win performance through store forwarding and the need to
> stay sequentially consistent. It was my impression that the
> distributed nature of QPI messaging would result in the processors
> grabbing more of the liberties allowed by Intel's specification, which
> specifically excludes my scenario from the ordering guarantee. As
> Aleksey pointed out, this is not the case because an MFENCE
> instruction provides a stronger guarantee than that: the coherence
> layer will have resolved the value at the stored location before the
> >>
> >> ---
> >> Marko
> >>
> >> On Mon, Mar 23, 2015 at 1:53 PM, Oleksandr Otenko
> <oleksandr.otenko at oracle.com> wrote:
> >>>
> >>> Out of all outcomes IRIW permits, choose those that have the
> fourth thread observe 1 then 0 - ie there exists a thread which
> observed Wv1 occur before T9. Now you are looking at your case, whose
> outcomes are a subset of those in IRIW.
> >>>
> >>>
> >>> Alex
> >>>
> >>>
> >>> On 20/03/2015 22:05, Marko Topolnik wrote:
> >>>>
> >>>> On Fri, Mar 20, 2015 at 7:52 PM, Oleksandr Otenko
> <oleksandr.otenko at oracle.com> wrote:
> >>>>>
> >>>>> On 20/03/2015 18:12, Marko Topolnik wrote:
> >>>>>>
> >>>>>> On Fri, Mar 20, 2015 at 5:45 PM, Oleksandr Otenko
> <oleksandr.otenko at oracle.com> wrote:
> >>>>>>>
> >>>>>>> No, that doesn't answer the question. You need to modify how
> happens-before is built - because happens-before in JMM and in some
> other model are two different happens-befores. If you get rid of
> synchronization order, then you need to explain which reads the write
> will or will not synchronize-with.
> >>>>>>
> >>>>>>
> >>>>>> I think it's quite simple: the read may synchronize-with any
> write as long as that doesn't break happens-before consistency.
> >>>>>
> >>>>>
> >>>>> It seems quite naive, too. The problem is that currently the
> read synchronizes-with all writes preceding it, but observes the value
> set by the last write. Here you need to define somehow which write the
> read observes - you need to somehow define which of the writes is
> "last" and what the other readers are allowed to think about it.
> >>>>>
> >>>>> It doesn't seem to be explainable in one sentence.
> >>>>
> >>>>
> >>>> It is a quite lightweight exercise to rigorously specify this in
> terms of Lamport's clocks; but I concede that, lacking a shared
> intuition, it will take more than a sentence to communicate. I
> hesitate to turn this into a treatise on the application of Lamport's
> clocks to the JMM, so I'm letting it rest.
> >>>>>>>
> >>>>>>> I am only involved in this discussion because you said it
> isn't IRIW, but I see all signs that it is. I remember the discussion
> here doubting that IRIW should be supported, and I appreciate the
> arguments, but without the specification it is difficult to continue a
> meaningful discussion.
> >>>>>>
> >>>>>>
> >>>>>> That's strange to hear since I have pointed out exactly why
> it's not IRIW: if we broaden the definition such that it covers my
> case, then we must accept that Intel allows IRIW to happen because it
> explicitly excludes the writing thread from the guarantee which is
> supposed to disallow it.
> >>>>>
> >>>>>
> >>>>> Rwt6 and Rrt6 are redundant. If you remove them, it becomes
> IRIW. Rwt6 only witnesses the particular ordering of some operations
> in IRIW - it forbids some of the outcomes from IRIW, but doesn't add
> new ones. Rrt6 is meaningless.
> >>>>
> >>>>
> >>>> IRIW involves four independent threads and six events. My example
> involves only three threads, so there must be something wrong in
> calling it "exactly IRIW". Apparently you have in mind some quite
> flexible definition of IRIW, but I cannot second-guess what it might be.
> >>>>
> >>>> ---
> >>>> Marko
> >>>
> >>>
> >>
> >
>

```
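
The Dekker-style case in Alex's reply (T1: x=1; r0=x; r1=y / T2: y=1; r2=y; r3=x) can be tried out as a small Java litmus test. This is only a rough sketch under assumptions the post does not state: x and y are taken to be volatile, so that their accesses are synchronization actions, and the names DekkerLitmus, State and countForbidden are made up for illustration. Under the JMM the outcome r1 == 0 && r3 == 0 should then never be observed, since it has no sequentially consistent explanation.

```java
import java.util.concurrent.CountDownLatch;

// Illustrative sketch only; class and method names are hypothetical.
class DekkerLitmus {

    // Fresh state per trial, so no reset logic is needed between runs.
    static final class State {
        volatile int x, y;   // assumption: volatile, i.e. synchronization actions
        int r0, r1, r2, r3;  // plain fields; read by the caller only after join()
    }

    // Runs T1 and T2 once; returns true if the forbidden outcome was seen.
    static boolean runOnce() {
        final State s = new State();
        final CountDownLatch start = new CountDownLatch(1);
        Thread t1 = new Thread(() -> {
            await(start);
            s.x = 1;
            s.r0 = s.x; // may be satisfied by store forwarding on real hardware
            s.r1 = s.y;
        });
        Thread t2 = new Thread(() -> {
            await(start);
            s.y = 1;
            s.r2 = s.y;
            s.r3 = s.x;
        });
        t1.start();
        t2.start();
        start.countDown(); // release both threads as close together as possible
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        // Both threads reading 0 from the other's variable would contradict
        // any total order over the four volatile accesses involved.
        return s.r1 == 0 && s.r3 == 0;
    }

    static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Counts how many of the given trials produced the forbidden outcome.
    static int countForbidden(int trials) {
        int forbidden = 0;
        for (int i = 0; i < trials; i++) {
            if (runOnce()) {
                forbidden++;
            }
        }
        return forbidden;
    }

    public static void main(String[] args) {
        System.out.println("forbidden outcomes: " + countForbidden(10_000));
    }
}
```

A run that reports zero proves nothing about the hardware by itself; the sketch is only meant to make the shape of the test concrete. Dropping volatile from x and y would make the r1 == 0 && r3 == 0 outcome legal under the JMM.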