[concurrency-interest] Relativity of guarantees provided by volatile

Marko Topolnik mtopolnik at inge-mark.hr
Sat Aug 18 03:35:12 EDT 2012


In this e-mail I'm chunking my responses to several people due to the timezone differences... which is actually an excellent example to demonstrate my point.

I claim to have woken up ten minutes ago and read all the replies in this thread. I immediately started writing up my answer and will soon send it.

I could also have been following the thread through the night, writing up replies to several people, but in the end decided to chunk them together in a single post. Nobody can prove it, though.

In the end, all your posts happen-before this reply, but in the wall-clock sense parts of my reply may have been written 4 hours ago. Perhaps I just delayed its visibility to you until now. 


Nathan, the JLS is very careful to define what it means by "subsequent", for example in 17.4.4: "where 'subsequent' is defined according to the synchronization order". Basically, take any real execution, note the wall-clock ordering of actions, and deduce the synchronization order from it. You can then hold that synchronization order fixed and construct executions with the wall-clock times shifted any way you like, and the JLS will not give you a frown. Even the physical logic of cause-precedes-effect can be broken; the JLS won't care as long as there is a single, consistent synchronization order to it all.
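
To make that concrete, here is a minimal sketch of the execution from my original post below (class and variable names are mine, purely for illustration). Every read printing 0 is a legal outcome, even if by the wall clock the writer stored 1 before each read, because "all reads, then all writes" is a perfectly consistent synchronization order:

    class Relativity {
        static volatile int sharedVar = 0;

        public static void main(String[] args) throws InterruptedException {
            Thread reader = new Thread(new Runnable() {
                public void run() {
                    for (int i = 0; i < 4; i++) {
                        // Reading 0 every time is legal even if, by the wall
                        // clock, the writer stored 1 before each of these
                        // reads: the JMM only requires some consistent
                        // synchronization order, and "all reads before all
                        // writes" is one.
                        System.out.println("read " + sharedVar);
                    }
                }
            });
            Thread writer = new Thread(new Runnable() {
                public void run() {
                    for (int i = 0; i < 4; i++) {
                        sharedVar = 1;
                    }
                }
            });
            reader.start();
            writer.start();
            reader.join();
            writer.join();
        }
    }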

Vitaly, a JVM could easily make sure that chunking is valid, at least in the simple case where it can prove that a section of code performs a sequence of write-y actions with no intervening read-y action. I define a write-y action as the origin, and a read-y action as the target, of a synchronizes-with edge as defined by the JLS. Of course, on a very specific hardware architecture this may turn out to be impractical or to yield no performance advantage, but this discussion is not that specific.
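
For illustration, here is the kind of code section I have in mind (a hypothetical sketch, with made-up field names): a run of volatile writes with no read-y action between them. Since each write is only the origin of a synchronizes-with edge, a JVM could in principle buffer all three stores and make them visible to other threads in one bulk operation; a reader can only constrain their order, not the moment they become visible:

    class Chunking {
        volatile int x;
        volatile int y;
        volatile long timestamp;

        // Three write-y actions in a row with no intervening read-y action
        // (no volatile read, no monitor acquire, no Thread.start(), ...).
        // Nothing in the JMM says *when* these stores must become visible to
        // other threads, only that a reader which observes the write to
        // 'timestamp' must also observe the preceding writes to 'x' and 'y'.
        void publish(int newX, int newY) {
            x = newX;
            y = newY;
            timestamp = System.nanoTime();
        }
    }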

Yuval, the existing semantics of volatile are very useful as they are, and once you add the informal promise of a best effort towards timeliness, that is all you can realistically demand from an implementation. What is troubling, however, is the belief of practically every developer out there that there is a hard realtime GUARANTEE of instantaneous visibility of volatile writes. Today's architectures prove them right, but if that changes, for all we know many existing programs may start to break. Compare what happened when early Java developers assumed in-order observance of inter-thread actions and were suddenly proven wrong by the next generation of aggressively optimized JVMs. In fact, a great many devs even today assume such guarantees and keep writing faulty code.
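
As a hypothetical illustration of the assumption I mean (names and numbers are mine, not taken from any real codebase): code like the following effectively measures how quickly a volatile write becomes visible and implicitly treats a tiny result as guaranteed. The JMM guarantees only the ordering noted in the comments; the near-zero delay we observe is a property of today's hardware and JVMs, not of the specification.

    class AssumedInstantVisibility {
        static volatile long writtenAt;
        static volatile boolean ready;

        public static void main(String[] args) throws InterruptedException {
            Thread reader = new Thread(new Runnable() {
                public void run() {
                    while (!ready) { /* spin */ }
                    // Guaranteed by the JMM: having seen ready == true, this
                    // thread also sees the earlier write to writtenAt
                    // (happens-before via the volatile write/read of ready).
                    // NOT guaranteed by the JMM: that the printed delay is
                    // small, or bounded by anything you would call "timely".
                    System.out.println("visible after "
                            + (System.nanoTime() - writtenAt) + " ns");
                }
            });
            reader.start();
            writtenAt = System.nanoTime();
            ready = true;  // promptly visible only by courtesy of today's hardware
            reader.join();
        }
    }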

For reference, you can also have a look at my answer on StackOverflow containing several complete Java examples that help make this discussion more concrete:

http://stackoverflow.com/a/11765634/1103872


-Marko


On 18. kol. 2012., at 03:39, Yuval Shavit wrote:

> I was thinking about this on my way home, and I think Marko's initial proposition is correct -- but that we shouldn't concern ourselves with it.
> 
> I still think it comes down to the meaning of "subsequent," but I think wall-clock time is not a good definition. On a multi-core machine, it's conceivable that two actions happen so close together that it's impossible to tell which one happened first (by wall-clock time). Forcing a consensus about which came first would be expensive, and it would probably be prohibitive on a many-core machine -- let alone a multi-node JVM (I was only partially joking about that before).
> 
> So, what happens if we have some loose definition of "subsequent," probably with transitivity and perhaps some other basic properties? If we do that, then Marko's initial proposition holds. For instance, the JVM would be allowed to let a while(someVolatileBool) loop spin forever, its reads having been determined to be subsequent to all other actions.
> 
> But there are lots of things which the JVM is allowed to take its time on, but shouldn't. I haven't read anything in the JLS that requires "int i = 1" to complete before the heat death of the universe, and yet we would surely file a bug if it were not nearly instantaneous.
> 
> I would argue that subsequent-ness is the same. There's nothing in the JLS to prevent a JVM from delaying actions in some horrid way (w.r.t. "subsequent" and thus visibility), but we would surely file a bug if visibility were not extremely quick on a single-node machine, and appropriately quick on a distributed JVM.
> 
> On Aug 17, 2012, at 9:24 PM, "Boehm, Hans" <hans.boehm at hp.com> wrote:
> 
>> My reading is that “subsequent” in chapter 17 of the spec refers to synchronization order, except in one case in which it refers to a hypothetical sequentially consistent memory model.
>>  
>> I agree that the originally described behavior is allowed.   There’s another complication here that can explain this behavior (as can delayed stores).  AFAIK, the spec does not clearly require that  “wall clock time” is perfectly synchronized across threads, and implementations may, for good reason, fail to ensure that, particularly in this case.  We had this discussion in connection with C++11, and that spec does require that a specific clock is consistent with happens-before.  But even that spec, which is strict enough to raise some controversy, isn’t sufficient here, since there is no happens-before relation between the threads.
>>  
>> Hans
>>  
>> From: concurrency-interest-bounces at cs.oswego.edu [mailto:concurrency-interest-bounces at cs.oswego.edu] On Behalf Of David Holmes
>> Sent: Friday, August 17, 2012 5:03 PM
>> To: Yuval Shavit; Vitaly Davidovich
>> Cc: concurrency-interest at cs.oswego.edu
>> Subject: Re: [concurrency-interest] Relativity of guarantees provided by volatile
>>  
>> Ah I knew there was something about "immediate" in there somewhere :) So if instructions were immediate (ie they completed as soon as they started) then wall clock observations would be consistent with actual executions. But on real systems instructions are not immediate so you have to take the completion of the instruction as the point at which to make a wall clock observation.
>>  
>> Aside: as well as being atomic and immediate you also need to preclude two instructions from executing simultaneously :)
>>  
>> David (now departing to enjoy the weekend :) )
>> -----Original Message-----
>> From: Yuval Shavit [mailto:yshavit at akiban.com]
>> Sent: Saturday, 18 August 2012 9:49 AM
>> To: Vitaly Davidovich
>> Cc: dholmes at ieee.org; concurrency-interest at cs.oswego.edu
>> Subject: Re: [concurrency-interest] Relativity of guarantees provided by volatile
>> 
>> (Resending -- I'd accidentally sent this only to Nathan)
>> 
>> So, Nathan, you're essentially saying that "subsequent" is defined in terms of clock time. If we take that as the definition, then I agree that it works as we'd all expect. Furthermore, I agree that that's the intuitive definition, as well as the de-facto one. But it doesn't seem to be the one explicitly defined by the JLS. (And if it is the one implicitly defined by the JLS, I think there's a serious bug waiting for any distributed-node JVM whose nodes are traveling away from each other at relativistic speeds! ... though I'm no physics expert.)
>>  
>> That said, 17.4.3 does imply that the reads will be viewable in a wall-clock-sequential way, albeit informally:
>>  
>>     Sequential consistency is a very strong guarantee that is made about visibility and ordering in an execution of a program. Within a sequentially consistent execution, there is a total order over all individual actions (such as reads and writes) which is consistent with the order of the program, and each individual action is atomic and is immediately visible to every thread.
>>  
>> (emphasis on "is immediately visible")
>>  
>> On Fri, Aug 17, 2012 at 7:41 PM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
>> David and you may be right in the theoretical aspect.  In practice, I can't fathom how a JVM could do this type of analysis.  That's an issue I have with the JMM's wording of happens-before -- it doesn't seem to translate to reality.
>> 
>> Sent from my phone
>> 
>> On Aug 17, 2012 7:30 PM, "Yuval Shavit" <yshavit at akiban.com> wrote:
>> Sure, but it could decide that the execution order is [w1, w2, w3, r]. In fact, as far as we know, the thread may have been scheduled such that that was the clock-time ordering, too.
>> 
>> On Fri, Aug 17, 2012 at 7:21 PM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
>> Two volatile writes emit a store-store barrier in between, which to me means they cannot be collapsed and must be made visible in that order (on non-TSO this would require a h/w fence).  In other words, I don't think compiler can remove the redundant stores as if this was a non-volatile field, where it's a perfectly valid (and good) optimization.
>> 
>> Sent from my phone
>> 
>> On Aug 17, 2012 7:18 PM, "David Holmes" <davidcholmes at aapt.net.au> wrote:
>> How does it violate the JMM? There is nothing to establish that any read has to have occurred prior to w=3. An external observer may say "hey, if we'd actually written w=1 at this point then the read would see 1", but that is irrelevant. The program cannot tell that the other writes did not occur.
>>  
>> David
>> -----Original Message-----
>> From: Vitaly Davidovich [mailto:vitalyd at gmail.com]
>> Sent: Saturday, 18 August 2012 9:12 AM
>> To: dholmes at ieee.org
>> Cc: Marko Topolnik; concurrency-interest at cs.oswego.edu
>> Subject: Re: [concurrency-interest] Relativity of guarantees provided by volatile
>> 
>> I don't think the writes to w can be reduced to just the last one as it would violate the JMM.  R may only see the last one due to interleaving though. Not sure if that's what you meant.
>> 
>> Sent from my phone
>> 
>> On Aug 17, 2012 7:03 PM, "David Holmes" <davidcholmes at aapt.net.au> wrote:
>> Hi Marko,
>> 
>> I think the "surprise" is only in the way you formulated this. Said another
>> way, a write takes a finite amount of time from when the instruction starts
>> to execute to when the store is actually available for a read to see.
>> (Similarly a read takes a finite amount of time.) So depending on those two
>> times a read and write that happen "around the same time" may appear to have
>> occurred in either order. But when you program with threads you never know
>> the relative interleavings (or should never assume) so it makes no
>> difference how the program perceives the order compared to how some external
>> observer might perceive it.
>> 
>> As for your optimization to "chunk" volatile writes, I don't see a problem
>> here if you are basically asking if given:
>> 
>> w = 1;  // w is volatile
>> w = 2;
>> w = 3;
>> 
>> that this could be reduced to the last write alone? I see no reason why not.
>> Without some additional coordination between a reader thread and the writer
>> thread, reading w==3 is a legitimate outcome. If you are thinking about how
>> the hardware might chunk things then that is a different matter. We have to
>> use the hardware in a way that complies with the memory model - if the
>> hardware can't comply then you can't run Java on it.
>> 
>> David Holmes
>> ------------
>> 
>> > -----Original Message-----
>> > From: concurrency-interest-bounces at cs.oswego.edu
>> > [mailto:concurrency-interest-bounces at cs.oswego.edu]On Behalf Of Marko
>> > Topolnik
>> > Sent: Saturday, 18 August 2012 7:24 AM
>> > To: concurrency-interest at cs.oswego.edu
>> > Subject: [concurrency-interest] Relativity of guarantees provided by
>> > volatile
>> >
>> >
>> > Consider the following synchronization order of a program
>> > execution involving a total of two threads, R and W:
>> >
>> > - thread R begins;
>> >
>> > - thread R reads a volatile int sharedVar several times. Each
>> > time it reads the value 0;
>> >
>> > - thread R completes;
>> >
>> > - thread W begins;
>> >
>> > - thread W writes the sharedVar several times. Each time it
>> > writes the value 1;
>> >
>> > - thread W completes.
>> >
>> > Now consider the wall-clock timing of the events:
>> >
>> > - thread R reads 0 at t = {1, 4, 7, 10};
>> > - thread W writes 1 at t = {0, 3, 6, 9}.
>> >
>> > As far as the Java Memory Model is concerned, there is no
>> > contradiction between the synchronization order and the
>> > wall-clock times, as the JMM is wall-clock agnostic. However, I
>> > have yet to meet a single Java professional who wouldn't at least
>> > be very surprised to hear that the specification allows this.
>> >
>> > I understand that the SMP architecture that dominates the world
>> > of computing today practically never takes these liberties and
>> > makes the volatile writes visible almost instantaneously. This
>> > may change at any time, however, especially with the advent of
>> > massively parallel architectures that seem to be the future. For
>> > example, an optimization technique may choose to chunk many
>> > volatile writes together and make them visible in a single bulk
>> > operation. This can be safely done as long as there are no
>> > intervening read-y actions (targets of the synchronizes-with
>> > edges as defined by JLS/JSE7 17.4.4).
>> >
>> > Now, my questions are:
>> >
>> > 1. Is there a loophole in my reasoning?
>> >
>> > 2. If there is no loophole, is there anything to worry about,
>> > given that practically 100% of developers out there consider as
>> > guaranteed something that isn't?
>> >
>> >
>> > -Marko
>> >
>> >
>> >
>> >
>> 
>>  
>>  



