[concurrency-interest] AtomicReference.updateAndGet() mandatory updating

Andrew Haley aph at redhat.com
Mon May 29 04:04:37 EDT 2017

On 28/05/17 02:27, Gregg Wonderly wrote:
>> On May 26, 2017, at 10:05 AM, Andrew Haley <aph at redhat.com> wrote:
>> On 26/05/17 14:56, Doug Lea wrote:
>>> On 05/26/2017 09:35 AM, Andrew Haley wrote:
>>>> On 26/05/17 13:56, Andrew Dinn wrote:
>>>>>> Initially (in Java5) requiring it has led to some questionable reliance.
>>>>>> So we cannot change it. But there's not much motivation to do so anyway:
>>>>>> As implied by Nathan Reynolds, encountering some (local) fence overhead
>>>>>> on CAS failure typically reduces contention and may improve throughput.
>>>>> It would be useful to know if that reduction in contention is specific
>>>>> to, say, x86 hardware or also occurs on weak memory architectures like
>>>>> AArch64 or ppc. Perhaps Nathan could clarify that?
>>> The main issues are not tightly bound to architecture.
>>> In the vast majority of cases, the response to CAS failure is
>>> some sort of retry (although perhaps with some intermediate
>>> processing). The fence here plays a similar role to
>>> Thread.onSpinWait. And in fact, on ARM, is likely to be
>>> exactly the same implementation as onSpinWait.
>> onSpinWait is null, and unless ARM does something to the architecture
>> that's probably what it'll remain.
>>> As Alex mentioned, in the uncommon cases where this
>>> is a performance issue, people can use one of the weak CAS
>>> variants.
>>>> Just thinking about AArch64, and how to implement such a thing as well
>>>> as possible. 
>>> "As well as possible" may be just to unconditionally issue fence,
>>> at least for plain CAS; maybe differently for the variants.
>> I doubt that: I've done some measurements, and it always pays to branch
>> conditionally around a fence if it's not needed.
> Since the fence is part of the happens before controls that
> developers encounter, how can a library routine know what the
> developer needs, to know how to “randomly” optimize with a branch
> around the fence?  Are you aware of no software that exists where
> developers are actively counting MM interactions trying to minimize
> them?  Here you are trying to do it yourself because you “See” an
> optimization that is so localized, away from any explicit code
> intent, that you can’t tell ahead of time (during development of
> your optimization), what other developers have actually done around
> the fact that this fence was unconditional before, right?
> Help me understand how you know that no software that works
> correctly now, will start working randomly, incorrectly, because
> sometimes the fence never happens.

It's in the specification.  If a fence is required by the
specification, we must execute one. If not, the question is whether
it's faster to execute a fence unconditionally or to branch around it.
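
To make the distinction concrete at the Java level, here is a minimal
sketch. The class and method names are made up for illustration, and it
assumes Java 9 or later. The first loop uses compareAndSet, whose
specified memory effects are those of volatile access. The second uses
weakCompareAndSetPlain, which promises no ordering and may fail
spuriously, so it only makes sense inside a retry loop. Whether the JIT
executes a fence unconditionally or branches around it happens below
this level and is not shown here.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration; these names do not come from the JDK or this thread.
final class CasRetrySketch {

    // Retry loop on the strong CAS: callers get the specified
    // volatile memory effects from compareAndSet.
    static int incrementStrong(AtomicInteger cell) {
        for (;;) {
            int current = cell.get();
            int next = current + 1;
            if (cell.compareAndSet(current, next)) {
                return next;
            }
            // Spin hint before retrying; it may be a no-op on some platforms.
            Thread.onSpinWait();
        }
    }

    // Retry loop on a weak CAS variant: no ordering guarantees from the
    // CAS itself, and it may fail spuriously, hence the loop.
    static int incrementWeakPlain(AtomicInteger cell) {
        for (;;) {
            int current = cell.get();
            int next = current + 1;
            if (cell.weakCompareAndSetPlain(current, next)) {
                return next;
            }
            Thread.onSpinWait();
        }
    }

    public static void main(String[] args) {
        AtomicInteger cell = new AtomicInteger();
        System.out.println(incrementStrong(cell));
        System.out.println(incrementWeakPlain(cell));
    }
}
```

Code that relies on the ordering of a failed CAS has to stay with the
first form; the second is only for callers who know they do not need it.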

Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
