[concurrency-interest] Should I avoid compareAndSet with value-based classes?

Gil Tene gil at azul.com
Thu Jul 6 14:15:28 EDT 2017


David, the below is a (potentially good) argument for not allowing the creation of subclasses of Object that do not have well-defined support for identity, identity comparison, identity hashing, or synchronization: since Object supports all of those, things that hold Object instances may very well make use of them.

But that horse has long fled the barn. We can argue that value-based classes should not have been specified in Java 8. Or that the future value types should not be derived from Object. But the first case is already in the Java spec, and the second is likely coming (and is sneakily able to use the first as a precedent).

For the "how should we specify things?" discussion, I'm actually squarely on the side of "things with no identity should never have been derived from Object" of this argument. E.g. I would advise other language specifications to avoid making the same mistake, either by not giving the base Object class identity to begin with, or by creating a separate orthogonal type hierarchy for value types and "value based" classes. However, I don't see a way to "fix" the mistake already made, and we have to deal with the language and the machine as spec'ed, not as we wish it were. And so does anyone writing code in the language.

In the context of that reality, we are discussing what correct uses of those new "bastard children of Object" are. And since the bastards clearly have no identity, using any logic that relies on identity behavior in any way is wrong.

Sent from my iPad

On Jul 6, 2017, at 10:53 AM, Gil Tene <gil at azul.com> wrote:

Hence my suggestion that identity-based operations on instances of value-based classes should throw exceptions... Right now they silently do unpredictable things, and should be avoided in any case (but harder to know/find when you accidentally use them).

On Jul 6, 2017, at 10:46 AM, Alex Otenko <oleksandr.otenko at gmail.com> wrote:

Are they subclasses of Object or not? If they are not, no questions. If they are, someone still needs to explain how exactly the “identity based operations” can be avoided safely.

List<j.u.Optional> is just a List underneath, and searching for an item may use “identity based operations”, whether you want it or not - depending on what implementation of List you happen to use.

Alex

On 6 Jul 2017, at 17:33, Gil Tene <gil at azul.com> wrote:

It says: "Use of such identity-sensitive operations on instances of value-based classes may have unpredictable effects and should be avoided." It does not say what the reason for the unpredictable effects may be, or WHEN you shouldn't use identity-based operations; it warns you not to use them for any reason whatsoever. Can't be much more clear than that, and you don't need to be a JVM or compiler implementor to understand it. It means "don't do that". And "if you do that, very surprising things may (or may not) happen".

Acting on a current understanding and perception of what implementations actually do, and saying "I thought that what it really means is that using identity in some way X would be unreliable because of effect Y, but using identity in case Z is ok, because I don't see why it shouldn't be" is a wrong reading of this clear admonition to not use identity for anything. Doing so will come back to bite you, either now or in the future.

For example, instances of value-based classes have the [well documented] quality of being "freely substitutable when equal, meaning that interchanging any two instances x and y that are equal according to equals() in any computation or method invocation should produce no visible change in behavior." The JVM may (and likely at some point will) use this quality freely in some optimizations. A simple and obviously valid resulting optimization would be for == to always be evaluated as false when the operands are instances of value-based classes: since the JVM *may* freely substitute the instances on either side of the == evaluation with some instance that is equal according to equals(), it can choose to do so in all evaluations of ==, which would mean that the two sides are never the same instance, and would then mean that the JVM can treat all code that would only execute if == were true as dead code. Similarly, != may trivially be evaluated as always-true. And since these optimizations may or may not be applied in different places and at different times, they may lead to seemingly random evaluation of == or != for identical code on identical data (for the exact same, unmodified values of a and d, a will sometimes be == to d and sometimes not, at the same point in code, depending on mood).
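
To make the substitutability point concrete, here is a minimal sketch (the class and method names `ValueBasedEquality` and `sameValue` are made up for illustration). It relies only on equals(), which is the one comparison the value-based contract gives meaning to. The result of == is deliberately not asserted or printed, since that is exactly the part that is unpredictable.

```java
import java.time.LocalDateTime;

public class ValueBasedEquality {
    // Compare two value-based instances the only way the contract defines: by equals().
    public static boolean sameValue(LocalDateTime x, LocalDateTime y) {
        return x.equals(y);
    }

    public static void main(String[] args) {
        LocalDateTime a = LocalDateTime.parse("2007-12-03T10:15:30");
        LocalDateTime b = LocalDateTime.parse("2007-12-03T10:15:30");
        LocalDateTime d = a;

        // Both comparisons are well defined because they go through equals().
        System.out.println("a.equals(b): " + sameValue(a, b));
        System.out.println("a.equals(d): " + sameValue(a, d));

        // (a == b) and (a == d) compile, but nothing in the value-based
        // contract says what they evaluate to, so no code path should
        // depend on either outcome.
    }
}
```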

We could further discuss whether or not the JVM is allowed to "falsely" indicate that == is true (or that != is false) even when the instances differ in value (so are not .equals()). It is. Doing that may certainly surprise anyone who uses the identity based == for anything to do with instances of value-based classes, even though using it would be silly given that == might obviously be always-false. For example, I could see how someone may mistakenly try to use == as an "optimization" on the assumption that if they get lucky and == is true, .equals() must be true as well, and that evaluating == might be cheaper, but they would still "do the right thing" in the != case. But that would be a mistake, and a clear violation of the "don't do that" admonition above. The cases where the result of == can be *unpredictable* are not limited to cases where the JVM freely substitutes instances for other equals() ones; that's just one example of how the JVM may obviously and validly use one known quality of value-based classes. But another key quality of instances of value based classes is that they "are considered equal solely based on equals(), not based on reference equality (==)". There is nothing there that limits us to "if == is true, equals() must be true as well". It may be convenient to think otherwise when wanting to explain how breaking the rules in some specific way is still ok (and currently seems to work). But it's not. Bottom line: The JVM *may* choose to make == always-true. Nothing wrong with that.

It is dangerous (and nearly impossible) to deduce which optimizations may or may not happen when they are clearly allowed by the specified set of rules and qualities. The defined qualities are what matter, and optimizations that indirectly result from the propagation of those qualities are allowed and may (or may not) happen. Instance identity either exists or doesn't, and instance identity checks either have meaning, or they don't. If they do, certain things follow (like .equals() always being true if the identity of the two instances is the same, as tested by ==). If they don't, then identity checks are meaningless, and can be discarded or conveniently evaluated in ways that seem profitable.

For "instances" of value-based classes, instance identity does not exist. Unfortunately, unlike value types, where the operator == is defined to act on value and not on identity, the operator == is undefined for instances of value-based classes. It certainly does not mean "if true, the identity of these things is the same", and it certainly doesn't mean "if false, these things are not equal". But it also doesn't mean "if true, these things are equal". It is *undefined*, specified with "may have unpredictable effects", and users are clearly told not to use it for anything.

One can argue that the spec should change, e.g. add a statement that would have the effect of requiring "if == is true then .equals() is also true", and would prevent or limit certain unpredictable effects when considering the identity of identity-less things, but I suspect that writing that down in actual words would be painful given the necessary "there is no meaning to identity, but..." logic. We could also look to redefine the meaning of certain operators, e.g. == and != can be dealt with as they are in value types, but the compare in compareAndSet is more challenging...

This is [partly] why I think that throwing an exception (or even a compile-time error where possible) when encountering == or != operations (and any of the other undefined behavior ones) on instances of value-based classes is a good idea, and something that should start happening ASAP. Any such operation is wrong (literally) by definition, and the "may have unpredictable effects" statement in the spec is certainly wide enough to allow exception throwing. It is wide enough to allow much worse things to happen, and silent unpredictable and undefined behavior, while allowed, is worse than preventing the code from executing. The same way we wouldn't argue that synchronized(int) should simply be "undefined" but silently allowed to run with unpredictable effects, we shouldn't argue that synchronized(LocalDateTime) should.
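
As a rough illustration of what "fail loudly instead of silently misbehaving" could look like in application code today, here is a small sketch. Everything in it is hypothetical: `IdentityGuard`, `requireIdentity`, and the hand-maintained class list are assumptions of this example, not part of the JDK, and the list is deliberately incomplete. It is not the JVM-level check proposed above, only a library-level approximation of the same idea.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Optional;
import java.util.Set;

public class IdentityGuard {
    // Hypothetical, hand-maintained and incomplete list of value-based classes.
    // All three are final, so an exact getClass() match is sufficient.
    private static final Set<Class<?>> KNOWN_VALUE_BASED = new HashSet<>(
            Arrays.<Class<?>>asList(LocalDateTime.class, Instant.class, Optional.class));

    // Returns obj unchanged if it may be used for identity-sensitive
    // operations (==, synchronized, CAS); throws if it is a known
    // value-based instance.
    public static <T> T requireIdentity(T obj) {
        if (obj != null && KNOWN_VALUE_BASED.contains(obj.getClass())) {
            throw new IllegalArgumentException(
                    "identity-sensitive operation on value-based class "
                            + obj.getClass().getName());
        }
        return obj;
    }

    public static void main(String[] args) {
        Object lock = requireIdentity(new Object()); // ordinary object: accepted
        synchronized (lock) {
            System.out.println("locked on an ordinary object");
        }
        try {
            Object bad = requireIdentity(LocalDateTime.parse("2007-12-03T10:15:30"));
            synchronized (bad) { // never reached: the guard throws first
                System.out.println("unreachable");
            }
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```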

On Jul 6, 2017, at 3:04 AM, Alex Otenko <oleksandr.otenko at gmail.com> wrote:

:-) the one causing Byzantine failures!

Given how underspecified the things generally are, and that the target audience of such javadoc is not a JVM implementor, we shouldn’t read too much into the possible freedoms the vague wording seems to imply. It shouldn’t suddenly break the promises about identity equality for specific instances.

All it’s saying is that a factory, like Integer.valueOf(10), may or may not return the same instance. It may or may not work the same way throughout the JVM’s lifetime, allowing the implementors to choose suitable caching strategies, code optimizations, etc. Therefore one should not rely on comparing identities, and such; for example, synchronized(Integer.valueOf(10)) guarantees neither lock freedom nor deadlock freedom. It should not say that suddenly it is unsafe to compare identities (like Gil suggests JVM could start throwing exceptions). It should not say that suddenly we shouldn’t be able to CAS (like Gil says suddenly the reference to instance can sneakily be replaced with something else).
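
A small sketch of the locking hazard mentioned here, with the usual alternative (the class name `SharedBoxLock` is just for illustration). It assumes only the documented behaviour of Integer.valueOf, which caches values in the range -128 to 127 and may or may not cache others. Unrelated code that synchronizes on Integer.valueOf(10) can therefore end up contending on one shared monitor, while a private lock object cannot be obtained by anyone else.

```java
public class SharedBoxLock {
    // A private lock object has an identity that no other code can get hold of.
    private final Object lock = new Object();
    private int counter;

    public int increment() {
        synchronized (lock) {
            return ++counter;
        }
    }

    public static void main(String[] args) {
        // 10 is inside the documented cache range, so both calls return the
        // cached instance; any code locking on it shares a single monitor.
        Integer x = Integer.valueOf(10);
        Integer y = Integer.valueOf(10);
        System.out.println("valueOf(10) shared: " + (x == y));

        // 1000 is outside the guaranteed range; whether the instance is shared
        // is unspecified, so only equals() is meaningful here.
        System.out.println("valueOf(1000) equal by value: "
                + Integer.valueOf(1000).equals(Integer.valueOf(1000)));

        SharedBoxLock c = new SharedBoxLock();
        c.increment();
        System.out.println("counter: " + c.increment());
    }
}
```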


Alex


On 6 Jul 2017, at 10:12, Millies, Sebastian <Sebastian.Millies at softwareag.com> wrote:

just out of curiosity: I am familiar with the term “Byzantine failure”, but what is a “Byzantine optimization”?
•  Sebastian

From: Concurrency-interest [mailto:concurrency-interest-bounces at cs.oswego.edu] On Behalf Of Alex Otenko
Sent: Thursday, July 06, 2017 10:17 AM
To: Gil Tene
Cc: concurrency-interest at cs.oswego.edu
Subject: Re: [concurrency-interest] Should I avoid compareAndSet with value-based classes?


On 6 Jul 2017, at 08:47, Gil Tene <gil at azul.com> wrote:



Sent from my iPad

On Jul 6, 2017, at 12:38 AM, Alex Otenko <oleksandr.otenko at gmail.com> wrote:
All it is saying is:

  LocalDateTime a = LocalDateTime.parse("2007-12-03T10:15:30");
  LocalDateTime b = LocalDateTime.parse("2007-12-03T10:15:30");
  LocalDateTime c = LocalDateTime.parse("2007-12-03T10:15:30");

a==b && b==c can be true and can be false

It also means that even when:

  LocalDateTime d = a;
  ...
  (a == d) may or may not be true. And may change whether it is true or not at any time.

I meant an even stronger assertion:

assert (a==b) == (b==a) : "No Byzantine optimizations"
assert (a==d) == (a==d) : "No Byzantine optimizations"

Alex





Alex

On 6 Jul 2017, at 08:28, Gil Tene <gil at azul.com> wrote:



Sent from my iPad

On Jul 5, 2017, at 11:51 PM, Henrik Johansson <dahankzter at gmail.com> wrote:
Oh, without having followed the value type discussions I think it was a mistake to not "fix" equality. Why not make it a deep comparison if the reference is different? If it points to the same object we are done otherwise start checking the struct content.
There may be a lot I missed here but a new type of object could be allowed to have different meaning equality. Right?
.equals() means what you want it to mean. == and != (and the compare in compareAndSet) mean very specific things, and cannot be overridden.

For non-reference value types (int, long, char, etc.), == and != are value comparisons. An int has no identity. Just a value:
  int a = 5;
  int b = 5;
  boolean y = (a == b);  // true

For references to instances of (non value-based) classes, == and != can be thought of as comparing the value of the reference (and not the contents of the object instances). This is an identity comparison, which ignores values within the object:
  Integer a = new Integer(5);
  Integer b = new Integer(5);
  boolean x = a.equals(b);   // true
  boolean y = (a == b);   // false

And for references to value-based classes (which is a relatively new thing, but is part of Java 8), the meaning of == and != appears to be undefined. E.g.:

  LocalDateTime a = LocalDateTime.parse("2007-12-03T10:15:30");
  LocalDateTime b = LocalDateTime.parse("2007-12-03T10:15:30");
  boolean x = a.equals(b);   // true
  boolean y = (a == b);   // unpredictable, undefined, who knows.
                          // Could be true, could be false.
                          // Could theoretically change the values of a or b, or of something else




On Thu, 6 Jul 2017, 07:12 Gil Tene, <gil at azul.com> wrote:

I'd take that documentation seriously. It basically says that ==, !=, synchronization, identity hashing, and serialization are undefined behaviors.

While the *current* implementations may carry some semi-intuitive behaviors, e.g. where == indicates true when comparing two references to instances of a value-based class where the value of the references is the same, there is no guarantee that this behavior will remain at some point in the [near or far] future. Specifically, attempting == (or !=, or synchronization, etc., including compareAndSet) on a reference to a value based class is allowed to do ANYTHING in the future.
For example:
- It may throw an exception (something it should probably start doing ASAP to avoid future surprises).
- It may return always-false, even when the two references are "to the same instance" (and probably will, through many possible value-based compiler optimizations that will erase the unneeded notion of reference and identity).
- It may overwrite random locations in memory, or variables that the code performing the operation has the privilege to write to (which it probably shouldn't, but that's certainly included in what "undefined" and "unpredictable effects" can mean).
- It may sometimes do one of the above, and sometimes seem to be doing what you mean it to do. Switching between modes on a whim (e.g. when a tier 2 optimizing compilation is applied, or when the mutton is nice and lean and the tomato is ripe).

So no, there is no way for compareAndSet to work "correctly" on a reference to an instance of a value-based class. Even if it happens to appear to work "correctly" now, expect it to blow up in bad and potentially silent ways in the future.
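
For reference, the boxing workaround raised elsewhere in this thread (AtomicReference<Box<Instant>>) can be sketched roughly as follows. This is one possible shape, not a definitive implementation: `BoxedInstantCell`, `Box`, and `updateToLater` are hypothetical names. The idea is that the CAS only ever compares Box references, which have ordinary identity, while the Instant values themselves are compared with equals().

```java
import java.time.Instant;
import java.util.concurrent.atomic.AtomicReference;

public class BoxedInstantCell {
    // Hypothetical holder with ordinary identity; it intentionally does not
    // override equals() or hashCode().
    static final class Box {
        final Instant value;
        Box(Instant value) { this.value = value; }
    }

    private final AtomicReference<Box> ref;

    public BoxedInstantCell(Instant initial) {
        ref = new AtomicReference<>(new Box(initial));
    }

    public Instant get() {
        return ref.get().value;
    }

    // Replaces the stored value only if it equals() the expected value.
    public boolean compareAndSet(Instant expected, Instant update) {
        while (true) {
            Box current = ref.get();
            if (!current.value.equals(expected)) {
                return false; // logically different value: give up
            }
            // The CAS compares Box references, never Instant references.
            if (ref.compareAndSet(current, new Box(update))) {
                return true;
            }
            // Another thread swapped the box; re-read and re-check.
        }
    }

    // Example updateAndGet use: keep the later of the stored and offered instants.
    public Instant updateToLater(Instant candidate) {
        return ref.updateAndGet(cur ->
                candidate.isAfter(cur.value) ? new Box(candidate) : cur).value;
    }

    public static void main(String[] args) {
        BoxedInstantCell cell = new BoxedInstantCell(Instant.ofEpochSecond(1));
        // A distinct but equal Instant is accepted as the expected value.
        System.out.println(cell.compareAndSet(Instant.ofEpochSecond(1), Instant.ofEpochSecond(2)));
        System.out.println(cell.updateToLater(Instant.ofEpochSecond(5)));
    }
}
```

The cost is one extra allocation per successful update, which is the price of giving the CAS something with real identity to compare.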

— Gil.

> On Jul 5, 2017, at 9:47 PM, Brian S O'Neill <bronee at gmail.com> wrote:
>
> I think the wording in the value-based document is too strong. It's perfectly fine to compare value based instances using ==, but it can lead to confusing results when comparing distinct instances with equivalent state. Using compareAndSet with a box isn't necessary for it to work "correctly" with a value-based class.
>
> By "correctly", I mean the compareAndSet operation works correctly, using == comparison. However, if your intention is for compareAndSet to compare Instants based on their state, then this of course won't work properly.
>
> If you want to perform a compareAndSet for an Instant's state (time since epoch), then you need to use something that can be compared atomically. This means the state must be representable in a 64-bit value or smaller. The Instant class measures time using a 64-bit long and a 32-bit int, and so this state cannot be compared atomically. You'd have to chop off some precision or use something else.
>
>
> On 2017-07-05 09:20 PM, Gil Tene wrote:
>> Reference equality for value based classes (as referenced below) lacks meaning, as there is no notion of identity in such classes (only a notion of value). And since compareAndSet on reference fields is basically an identity-based operation [in the compare part], the two won't mix well logically.
>> Specifically, while two references to e.g. java.time.LocalDateTime instances being == to each other *probably* means that the two are actually equal in value, the opposite is not true: Being != to each other does NOT mean that they are logically different. As such, the "compare" part in compareAndSet may falsely fail even when the two instances are logically equal to each other, leaving the rest of your logic potentially exposed.
>> Bottom line: given the explicit warning to not use == and != on references to value-based instances, I'd avoid using compareAndSet on those references. If you really need to use a value-based class in your logic, consider boxing it in another object that has [normal] identity.
>> — Gil.
>>> On Jul 5, 2017, at 8:59 PM, Michael Hixson <michael.hixson at gmail.com> wrote:
>>>
>>> AtomicReference and VarHandle are specified to use == in compareAndSet
>>> (and related) operations [1].  Using == to compare instances of
>>> value-based classes may lead to "unpredictable results" [2].  Does
>>> this mean I should avoid using compareAndSet with arguments that are
>>> instances of value-based classes?
>>>
>>> It seems like the documentation clearly tells me "yes, avoid doing
>>> that" but I'm hoping I misunderstood, or maybe AtomicReference and
>>> VarHandle are exempt somehow.  Otherwise, how do I implement
>>> non-broken compareAndSet and updateAndGet for a java.time.Instant
>>> value for example?  Do I have to box the value in something that's not
>>> a value-based class first, like AtomicReference<Box<Instant>>?
>>>
>>> -Michael
>>>
>>> [1] http://download.java.net/java/jdk9/docs/api/java/util/concurrent/atomic/AtomicReference.html#compareAndSet-V-V-
>>> [2] http://download.java.net/java/jdk9/docs/api/java/lang/doc-files/ValueBased.html
>
> _______________________________________________
> Concurrency-interest mailing list
> Concurrency-interest at cs.oswego.edu
> http://cs.oswego.edu/mailman/listinfo/concurrency-interest
