[concurrency-interest] Double Checked Locking in OpenJDK

Nathan Reynolds nathan.reynolds at oracle.com
Wed Aug 15 18:42:16 EDT 2012


There is discussion in both the hardware and software arenas about
allowing data races and errors while still producing probably correct
results.  Right now, few people in the academic world are working on
this.  Of course, quantum computers will behave this way, but I am not
talking about that.

Nathan Reynolds 
<http://psr.us.oracle.com/wiki/index.php/User:Nathan_Reynolds> | 
Consulting Member of Technical Staff | 602.333.9091
Oracle PSR Engineering <http://psr.us.oracle.com/> | Server Technology
On 8/15/2012 2:19 PM, Ruslan Cheremin wrote:
> Well, I understand your point.
>
> But this leads me to another interesting question -- why are data races
> so outlawed? Yes, I understand, it is hard to write correct code with
> them (by the way, can you give a link about the errors in the JMM spec
> regarding data races that you mentioned above?), but can we really scale
> well with sequentially consistent execution only? I mean, in large-scale
> distributed systems design, weakening consistency often gives great
> performance benefits. And when I think about something like a 786-core
> (skip the brand) box, it seems to me there could be many chances to
> improve performance using racy code. Am I missing something here?
>
> 2012/8/16 Boehm, Hans <hans.boehm at hp.com>:
>>> From: Ruslan Cheremin [mailto:cheremin at gmail.com]
>>>
>>> As far as I can see, there are two directions. One is that the
>>> "thread-safe" notion in its commonly used form -- when applied only
>>> to methods, and not to initialization/publication -- is confusing, and
>>> there is little reason to exclude construction/publication from the
>>> thread safety protocol by default. My point, among others, is that
>>> "immutable and thread safe" should be interpreted as "thread safe even
>>> for unsafe publishing". And this, for example, gives us the chance to
>>> remove the volatile specifier on the File.path field we started from :)
>> It seems to me that there's a huge difference here.  Synchronization in the constructor only matters in the presence of other dubious programming practices:  Either a reference to the object has to escape before the constructor finishes, or the reference has to be communicated to another thread in a racy manner.  (And the former is a special case under control of the class itself.)  There are strong reasons to avoid both in the vast majority of code.  On the other hand, perfectly normal code will routinely rely on the thread-safety of non-constructor methods all the time.
>>
>>> The second direction is about construction/publication as specifically
>>> different from the methods. E.g. it may even have an additional safety
>>> guarantee -- like "publication is always safe". I can see some reasons
>>> here, since the constructor is the only method that is guaranteed to
>>> be called only once in an object's lifecycle, and so we can possibly
>>> restrict some compiler/CPU optimizations in it with little influence
>>> on overall application performance -- but it throws away all the
>>> troubles with unsafe publishing.
>> I think we're still generally using "unsafe publishing" to mean either of the two dubious practices I mentioned above, though here we're presumably talking about racy publication after the constructor completes.  The problem is that in general racy publication is already a really bad practice, because the user has to understand the ugly details of the Java memory model, which nobody really does.  Racy publication is a data race, and hence you can no longer reason in terms of sequential consistency, synchronization-free regions become nonatomic, and generally all our intuition about behavior of threads and reasoning about threads goes out the window, even if you can still reason about the integrity of your class.  There are one or two special cases, notably lazy initialization of an immutable object, where you might succeed in hiding all that mess behind a library API, but in general that's hard.  So the question in my mind is whether you want to provide those added guarantees to support those one or two cases, or whether you want to limit those guarantees to situations, notably those involving final fields, where they're essential for the security model.  We currently have the latter.
>> Hans
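The special case Hans mentions, lazy initialization of an immutable object published through a race, could look roughly like the sketch below. The names `ImmutableLabel` and `LabelCache` are hypothetical and not taken from the JDK. The sketch assumes that the cached object has only final fields and that callers do not care whether two threads briefly end up with distinct but equal instances.

```java
import java.util.Locale;

// Immutable value: all fields are final and 'this' does not escape the constructor,
// so final-field semantics apply even if the reference is published via a data race.
final class ImmutableLabel {
    private final String text;

    ImmutableLabel(String text) {
        this.text = text;
    }

    String text() {
        return text;
    }
}

// Hypothetical cache that hides the race behind a small API.
final class LabelCache {
    private final String source;
    // Deliberately neither volatile nor guarded by a lock: this is the racy field.
    private ImmutableLabel cached;

    LabelCache(String source) {
        this.source = source;
    }

    ImmutableLabel label() {
        ImmutableLabel local = cached;   // read the racy field exactly once
        if (local == null) {
            // May run in several threads; each may build its own equal instance.
            local = new ImmutableLabel(source.toUpperCase(Locale.ROOT));
            cached = local;              // racy publication
        }
        return local;
    }
}
```

Note that this shape cannot promise that every caller sees the same instance, which is why it would not satisfy a "same instance for every invocation" requirement like the one discussed for File.toPath.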
>>>
>>>
>>> 2012/8/15 Boehm, Hans <hans.boehm at hp.com>:
>>>> Agreed.
>>>>
>>>> But, echoing David, I think, I'm not at all sure I see where this
>>>> thread is going.  We've established that
>>>> a) You can make a class safe against racy publication by
>>>> synchronizing the constructor along with all other methods (or by using
>>>> an immutable class with final fields).
>>>> b) There are (rather brittle and obscure) use cases in which racy
>>>> publication gives you better performance on architectures like ARM,
>>>> though not x86, currently at the cost of confusing data race detectors.
>>>> But to me it seems like taking advantage of (b) is a fairly
>>>> undesirable, though perhaps occasionally unavoidable, hack.  And I
>>>> can't see why it would possibly be a win if you have to synchronize all
>>>> method calls to make it work.
>>>> Does anyone have a use case in mind where the whole picture we're
>>>> discussing actually makes sense?  It might help to focus this
>>>> discussion.
>>>> Hans
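The two options in (a) can be sketched as follows. This is only an illustration with hypothetical class names (`SyncRange`, `FinalRange`); it shows the shape of each approach and is not a claim about any JDK class.

```java
// Option 1: the constructor body and every method hold the same monitor.
final class SyncRange {
    private int low;    // deliberately not final
    private int high;

    SyncRange(int low, int high) {
        // Java does not allow 'synchronized' on a constructor declaration,
        // so the body is wrapped in a synchronized block instead.
        synchronized (this) {
            this.low = low;
            this.high = high;
        }
    }

    synchronized int low() {
        return low;
    }

    synchronized int high() {
        return high;
    }
}

// Option 2: an immutable class that relies on final-field semantics.
final class FinalRange {
    private final int low;
    private final int high;

    FinalRange(int low, int high) {
        // 'this' must not escape before the constructor returns.
        this.low = low;
        this.high = high;
    }

    int low() {
        return low;
    }

    int high() {
        return high;
    }
}
```

Option 1 pays for a lock on every call, which is the cost Hans questions; option 2 needs no locking but only works when the state never changes after construction.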
>>>>
>>>>> -----Original Message-----
>>>>> From: concurrency-interest-bounces at cs.oswego.edu
>>>>> [mailto:concurrency-interest-bounces at cs.oswego.edu] On Behalf Of Zhong Yu
>>>>> Sent: Wednesday, August 15, 2012 10:56 AM
>>>>> To: Yuval Shavit
>>>>> Cc: concurrency-interest at cs.oswego.edu; dholmes at ieee.org
>>>>> Subject: Re: [concurrency-interest] Double Checked Locking in OpenJDK
>>>>>
>>>>> I thought the conclusion of that thread is that synchronizing
>>>>> constructor has the desired merit - if all constructors and methods
>>>>> are synchronized, a non-creating thread won't observe the zero/partial
>>>>> state of the object, even if the object reference is published
>>>>> unsafely.
>>>>>
>>>>> (One guy, who shall remain nameless, muddied the water with some
>>>>> mistaken statements of weaker memory guarantee. He has been corrected)
>>>>>
>>>>> Zhong Yu
>>>>>
>>>>> On Wed, Aug 15, 2012 at 11:58 AM, Yuval Shavit <yshavit at akiban.com>
>>>>> wrote:
>>>>>> There was a discussion here a few months ago about synchronizing
>>>>>> constructors -- I had asked why it's not allowed, and the discussion
>>>>>> hit on some of the similar points brought up in this thread.
>>>>>>
>>>>>> But to your point specifically, synchronizing a constructor (via
>>>>>> "synchronized(this) {...}" surrounding its body) still doesn't give
>>>>>> you full thread safety (even assuming immutability after the
>>>>>> constructor -- but without final fields). It ensures that a thread
>>>>>> can observe the object either fully constructed *or* with all its
>>>>>> fields having their default values. In other words, even if your
>>>>>> constructor is synchronized on the same object your getter is, a
>>>>>> thread could observe a field as it was before the constructor was
>>>>>> invoked.
>>>>>>
>>>>>> http://markmail.org/message/mav53xzo4bqu7udw
>>>>>>
>>>>>>
>>>>>> On Wed, Aug 15, 2012 at 12:49 PM, Ruslan Cheremin <cheremin at gmail.com>
>>>>>> wrote:
>>>>>>>> The reason to keep them distinct is because in general the mechanisms
>>>>>>>> for safe publication are external to the class, while those for
>>>>>>>> thread-safety are internal. It is only an edge case where use of
>>>>>>>> synchronized in a constructor can achieve safe-publication.
>>>>>>> Well, actually I do not understand your point. If I use some kind of
>>>>>>> synchronization to make methods of my object thread-safe -- can't I
>>>>>>> also apply the same thing to the constructor? For me, it only makes
>>>>>>> things clearer. An object can be thread-safe -- and it is totally
>>>>>>> thread safe. An object can require external synchronization for
>>>>>>> correct multithreaded use -- and it requires the sync for publishing
>>>>>>> and for usage also.
>>>>>>>
>>>>>>> From my point of view, the distinction you are talking about is more
>>>>>>> historically motivated. "Sync a method if you want it to be
>>>>>>> thread-safe" is a commonly learned mantra, but "take care of
>>>>>>> initialization also" is not so common. More information about it,
>>>>>>> more education, more code samples marked "here be dragons" will
>>>>>>> change the situation, I am sure; it just has to be highlighted more
>>>>>>> often.
>>>>>>>
>>>>>>>
>>>>>>>> People have to recognize that sharing an object requires shared mutable
>>>>>>>> state, and the number one tenet of concurrent programming is that access
>>>>>>>> to shared mutable state has to be synchronized (in a general sense not
>>>>>>>> specifically use of 'synchronized' keyword).
>>>>>>>>
>>>>>>>> Making every object safely publishable could be done, but for 99% of
>>>>>>>> objects it would be a waste of effort. Programs without data races don't
>>>>>>>> have issues with unsafe publication.
>>>>>>>>
>>>>>>>> David
>>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: concurrency-interest-bounces at cs.oswego.edu
>>>>>>>> [mailto:concurrency-interest-bounces at cs.oswego.edu] On Behalf Of
>>>>>>>> Nathan Reynolds
>>>>>>>> Sent: Wednesday, 15 August 2012 4:59 AM
>>>>>>>> To: concurrency-interest at cs.oswego.edu
>>>>>>>> Subject: Re: [concurrency-interest] Double Checked Locking in OpenJDK
>>>>>>>>
>>>>>>>> We seem to be splitting two notions (i.e. thread-safe and safe
>>>>>>>> publication) when they should be combined in a sense.  Typically, when
>>>>>>>> we say thread-safe we talk about the operations performed on the object
>>>>>>>> after it was constructed (and its contents are globally visible).
>>>>>>>> However, we need to consider that executing the constructor is
>>>>>>>> modifying the state of the object.  It requires the same mechanisms
>>>>>>>> that the rest of the class uses to ensure thread-safety.  Even though
>>>>>>>> there is only 1 thread executing the constructor, a proper releasing of
>>>>>>>> a lock or some other happens-before construct is required to ensure
>>>>>>>> that the memory updates by the thread are made globally visible before
>>>>>>>> the object is accessed by another thread.  This is what we are calling
>>>>>>>> safe publication.  So, safe publication is a subset of thread-safety
>>>>>>>> except it is limited to what happens after the constructor is called
>>>>>>>> and before the object is used by multiple threads.
>>>>>>>>
>>>>>>>> A beautifully-written class can be thread-safe with respect to calling
>>>>>>>> its member methods but not thread-safe with respect to calling its
>>>>>>>> constructor.  It is this latter case that many stumble upon because
>>>>>>>> they think that constructors are inherently thread-safe because they
>>>>>>>> are executed single-threadedly.  What they fail to realize is that the
>>>>>>>> execution of a constructor can overlap with the execution of other code
>>>>>>>> from the view point of what is happening in memory.  This same problem
>>>>>>>> applies to the rarer case of regular methods which can be proven to
>>>>>>>> execute in a single thread but don't use synchronization before
>>>>>>>> multiple threads start accessing the shared data.
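As a rough illustration of "some other happens-before construct", the sketch below hands a freshly constructed object to another thread through a `BlockingQueue`. The `Counter` and `Handoff` names are hypothetical. The assumption relied on is the documented memory-consistency property of java.util.concurrent collections: actions in a thread before placing an object into the collection happen-before actions after another thread removes it.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Plain mutable class: no final fields, no volatile, no locks.
final class Counter {
    private int value;

    Counter(int initial) {
        value = initial;
    }

    int value() {
        return value;
    }
}

final class Handoff {
    // Constructs a Counter in a second thread and reads it in the calling thread.
    static int buildElsewhereAndRead(int initial) {
        BlockingQueue<Counter> queue = new ArrayBlockingQueue<>(1);
        // The constructor's writes precede add() in the producer thread ...
        Thread producer = new Thread(() -> queue.add(new Counter(initial)));
        producer.start();
        try {
            // ... and take() in this thread follows the add(), so the
            // constructor's writes are visible here.
            Counter c = queue.take();
            return c.value();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

Here the class itself does nothing for thread-safety; the publication mechanism supplies the ordering.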
>>>>>>>>
>>>>>>>> Nathan Reynolds | Consulting Member of Technical Staff |
>>>>> 602.333.9091
>>>>>>>> Oracle PSR Engineering | Server Technology
>>>>>>>> On 8/13/2012 4:08 PM, David Holmes wrote:
>>>>>>>>
>>>>>>>> Ruslan Cheremin writes:
>>>>>>>>
>>>>>>>> For me it is confusing: Java has only one way to have a really immutable
>>>>>>>> object, and this way also gives you total thread safety even for
>>>>>>>> data-race-based publication. But when docs refer to an object as
>>>>>>>> "immutable and thread-safe" -- we still can't assume it to be really
>>>>>>>> thread-safe?
>>>>>>>>
>>>>>>>> It is better/simpler to isolate the notion of thread-safety and safe
>>>>>>>> publication. Thread-safety comes into play after you have safely shared
>>>>>>>> an object. The means by which you safely share an object is orthogonal
>>>>>>>> to how the object itself is made thread-safe.
>>>>>>>>
>>>>>>>> The means by which an object is shared has to involve shared mutable
>>>>>>>> state, and use of shared mutable state always needs some form of
>>>>>>>> synchronization (either implicit eg due to static initialization; or
>>>>>>>> explicit by using volatile or synchronized getter/setter methods).
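The three sharing mechanisms David lists might be sketched like this. `Settings` and `SettingsHolder` are hypothetical names used only for illustration; the shared object is deliberately a plain mutable class so that all of the safety comes from how the reference is shared.

```java
// Plain mutable object; by itself it is not safe for racy publication.
final class Settings {
    int timeoutMillis;

    Settings(int timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }
}

final class SettingsHolder {
    // Implicit synchronization: class initialization completes before
    // any other thread can read this static field.
    static final Settings DEFAULT = new Settings(1000);

    // Explicit, via volatile: a write to this field happens-before
    // a later read that observes the written reference.
    private volatile Settings current;

    // Explicit, via synchronized accessors on the same monitor.
    private Settings guarded;

    void setCurrent(Settings s) {
        current = s;
    }

    Settings getCurrent() {
        return current;
    }

    synchronized void setGuarded(Settings s) {
        guarded = s;
    }

    synchronized Settings getGuarded() {
        return guarded;
    }
}
```

In each case the synchronization lives in the holder, not in `Settings`, which matches the point that the sharing mechanism is orthogonal to the object's own thread-safety.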
>>>>>>>>
>>>>>>>> David
>>>>>>>> -----
>>>>>>>>
>>>>>>>> It's a pity, especially because true immutability gives us some
>>>>>>>> chances of performance optimization. As in this case -- we do not
>>>>>>>> really need .path to be volatile here, if we would assume Path to be
>>>>>>>> truly immutable. Volatility here is required only to ensure safe
>>>>>>>> publishing.
>>>>>>>>
>>>>>>>> 2012/8/13 David Holmes <davidcholmes at aapt.net.au>:
>>>>>>>>
>>>>>>>> Ruslan Cheremin writes:
>>>>>>>>
>>>>>>>> But is there a way to define "safe for data race publishing"? As
>>>>>>>> far as I remember, "immutable and thread-safe" is the standard mantra
>>>>>>>> in JDK javadocs for totally safe objects. j.l.String has the same
>>>>>>>> mantra -- and it is safe for any way of publishing. Do you mean I
>>>>>>>> should explicitly add "safe even for publishing via data race" in
>>>>>>>> docs? But I can't remember any such phrase in JDK docs.
>>>>>>>>
>>>>>>>> I don't recall anything in the JDK docs that mention being "totally safe"
>>>>>>>> regardless of publication mechanism. Some classes, eg String, have been
>>>>>>>> defined such that they do have that property (for security reasons). In
>>>>>>>> general neither "thread-safe" nor "immutable" imply
>>>>>>>> safe-for-unsynchronized-publication.
>>>>>>>>
>>>>>>>> Java Concurrency In Practice (jcip.net) does define additional potential
>>>>>>>> annotations, where @Immutable would indeed capture the requirement of
>>>>>>>> safe-for-unsynchronized-publication.
>>>>>>>>
>>>>>>>> David
>>>>>>>> -----
>>>>>>>>
>>>>>>>> 2012/8/13 David Holmes <davidcholmes at aapt.net.au>:
>>>>>>>>
>>>>>>>> Ruslan Cheremin writes:
>>>>>>>>
>>>>>>>> Well, Path javadoc explicitly says "immutable and safe for
>>>>>>>> multithreaded use". Although it is not strictly defined in java what
>>>>>>>> exactly means "safe for multithreaded use" -- does it mean safe for
>>>>>>>> publishing via data race, among others? -- I suppose, it should be. Am
>>>>>>>> I wrong here?
>>>>>>>>
>>>>>>>> "safe for multi-threaded use" does not generally imply that it is safe to
>>>>>>>> publish instances without synchronization of some form.
>>>>>>>>
>>>>>>>> David
>>>>>>>> -----
>>>>>>>>
>>>>>>>> On the other hand, File.toPath javadoc explicitly says that the
>>>>>>>> "returned instance must be the same for every invocation", so a sync
>>>>>>>> block is required here for mutual exclusion in the initialization
>>>>>>>> phase. Without this requirement it is also safe to live without the
>>>>>>>> sync block, afaik.
>>>>>>>>
>>>>>>>> 2012/8/13 David Holmes <davidcholmes at aapt.net.au>:
>>>>>>>>
>>>>>>>> Ruslan Cheremin writes:
>>>>>>>>
>>>>>>>> First of all, Path is immutable, so DCL is safe here even
>>> without
>>>>>>>> volatile. Volatile here is not required from my point of view.
>>>>>>>>
>>>>>>>> Without the volatile the Path implementation (Path is an interface) must be
>>>>>>>> such that an instance of Path can be safely published without any additional
>>>>>>>> forms of synchronization. Immutability does not in itself ensure that. You
>>>>>>>> would have to examine the actual implementation class.
>>>>>>>>
>>>>>>>> David Holmes
>>>>>>>> ------------
>>>>>>>>
>>>>>>>> 2012/8/12 Dmitry Vyazelenko <vyazelenko at yahoo.com>:
>>>>>>>>
>>>>>>>> Hi Richard,
>>>>>>>>
>>>>>>>> The variable "filePath" is volatile, so the double-checked
>>>>>>>> locking is correct in this case. It would have been a bug
>>>>>>>> prior to Java 5.
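For readers who want the pattern in isolation, here is a minimal sketch of double-checked locking with a volatile field and a local variable. It is not the OpenJDK source: `LazyPath` and the `computations` counter are hypothetical, and `Paths.get` stands in for whatever the real code calls. The local `result` does not fix anything by itself; it mainly keeps the fast path down to a single volatile read.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a class that lazily derives a Path from a String.
final class LazyPath {
    private final String path;
    // volatile supplies the happens-before edge that makes the
    // unsynchronized first check safe under the Java 5+ memory model.
    private volatile Path filePath;
    // Counts how often the Path was actually created (illustration only).
    final AtomicInteger computations = new AtomicInteger();

    LazyPath(String path) {
        this.path = path;
    }

    Path toPath() {
        Path result = filePath;              // one volatile read on the fast path
        if (result == null) {
            synchronized (this) {
                result = filePath;           // re-check while holding the lock
                if (result == null) {
                    computations.incrementAndGet();
                    result = Paths.get(path);
                    filePath = result;       // volatile write publishes the Path
                }
            }
        }
        return result;
    }
}
```

Every caller gets the same `Path` instance, and the lock is only taken until the field has been set.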
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>>
>>>>>>>> Dmitry Vyazelenko
>>>>>>>>
>>>>>>>> On Aug 12, 2012, at 21:35 , Richard Warburton
>>>>>>>>
>>>>>>>> <richard.warburton at gmail.com> wrote:
>>>>>>>>
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> The current implementation of java.io.File::toPath [0] appears to be
>>>>>>>> using the double checked locking pattern:
>>>>>>>>
>>>>>>>>      public Path toPath() {
>>>>>>>>          Path result = filePath;
>>>>>>>>          if (result == null) {
>>>>>>>>              synchronized (this) {
>>>>>>>>                  result = filePath;
>>>>>>>>                  if (result == null) {
>>>>>>>>                      result = FileSystems.getDefault().getPath(path);
>>>>>>>>                      filePath = result;
>>>>>>>>                  }
>>>>>>>>              }
>>>>>>>>          }
>>>>>>>>          return result;
>>>>>>>>      }
>>>>>>>>
>>>>>>>> I was going to report the bug, but I'm a little uncertain of the
>>>>>>>> interaction between the local variable 'result' and DCL since I've
>>>>>>>> previously only seen the checking condition on the shared field
>>>>>>>> itself.  Can someone here either confirm that it's a bug or explain how
>>>>>>>> the 'result' variable is fixing things?
>>>>>>>>
>>>>>>>> regards,
>>>>>>>>
>>>>>>>>   Richard
>>>>>>>>
>>>>>>>> [0] See the end of
>>>>>>>> hg.openjdk.java.net/jdk8/jdk8/jdk/file/da8649489aff/src/share/classes/java/io/File.java
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Concurrency-interest mailing list
>>>>>>>> Concurrency-interest at cs.oswego.edu
>>>>>>>> http://cs.oswego.edu/mailman/listinfo/concurrency-interest
