[concurrency-interest] VarHandle.setVolatile vs classical volatile write

Dávid Karnok akarnokd at gmail.com
Fri Aug 18 16:58:52 EDT 2017

Thanks. I did a benchmark (
https://gist.github.com/akarnokd/c0d606bd7e29d143ee82f2026898dbb5) and got
the following results:

i5 6440HQ, Windows 10 x64, Java 9b181, JMH 1.19

Benchmark                       Mode  Cnt          Score         Error
VolatilePerf.getAndAdd         thrpt    5  117841308,999 ± 3940711,142
VolatilePerf.getAndSet         thrpt    5  118162019,136 ± 1349823,016
VolatilePerf.releaseGetAndAdd  thrpt    5  118688354,409 ±  642044,969
VolatilePerf.setRelease        thrpt    5  890890009,555 ± 4323041,380
VolatilePerf.setVolatile       thrpt    5  118419990,949 ±  793885,407

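Being on Windows and on a laptop usually yields some variance, but it looks
like there is practically no difference between the full-barrier operations:
getAndAdd, getAndSet and setVolatile all score around 118 million (as does
releaseGetAndAdd), and only setRelease, at about 890 million, stands apart.
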
Btw, thinking about XCHG and XADD: they have to provide the same strong
volatile read and write semantics, since both read and write a location
atomically. I would have thought that XADD, which involves the ALU, would be
detectably more costly, but a 3-cycle addition is small compared to a 22-45
cycle cache action.
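
For readers without the gist at hand, here is a minimal sketch of the five
access modes the benchmark names refer to. The class and field names are
hypothetical and not taken from the gist, and I am assuming that
releaseGetAndAdd corresponds to VarHandle.getAndAddRelease. It only shows
the calls and their single-threaded results, not how to measure them.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

// Hypothetical holder class; the real benchmark state is in the linked gist.
class VarHandleModes {
    // Plain field: the ordering comes from the access mode used on the VarHandle.
    long value;

    static final VarHandle VALUE;
    static {
        try {
            VALUE = MethodHandles.lookup()
                    .findVarHandle(VarHandleModes.class, "value", long.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Volatile write: store followed by a full (StoreLoad) barrier.
    void setVolatile(long v) { VALUE.setVolatile(this, v); }

    // Release write: earlier accesses are not reordered after it, no StoreLoad barrier.
    void setRelease(long v) { VALUE.setRelease(this, v); }

    // Atomic exchange with volatile read and write semantics; returns the old value.
    long getAndSet(long v) { return (long) VALUE.getAndSet(this, v); }

    // Atomic add with volatile read and write semantics; returns the old value.
    long getAndAdd(long d) { return (long) VALUE.getAndAdd(this, d); }

    // Atomic add with release semantics only (assumed match for releaseGetAndAdd).
    long getAndAddRelease(long d) { return (long) VALUE.getAndAddRelease(this, d); }

    public static void main(String[] args) {
        VarHandleModes m = new VarHandleModes();
        m.setVolatile(1L);
        System.out.println(m.getAndSet(5L));        // prints 1
        System.out.println(m.getAndAdd(2L));        // prints 5
        System.out.println(m.getAndAddRelease(3L)); // prints 7
        m.setRelease(42L);
        System.out.println(m.value);                // prints 42
    }
}
```

In single-threaded use, setVolatile and a getAndSet whose result is ignored
leave the field in the same state, which is the substitution asked about in
the quoted thread below.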

2017-08-18 21:21 GMT+02:00 Paul Sandoz <paul.sandoz at oracle.com>:

> On 18 Aug 2017, at 11:49, Dávid Karnok <akarnokd at gmail.com> wrote:
>
> > Hi,
> >
> > in an older blog post
> > (https://shipilev.net/blog/2014/on-the-fence-with-dependencies/#_storeload_barrier_and_stack_usages)
> > about write barriers, it is mentioned that the JIT uses a stack local
> > address and XADD to flush the write buffer when a volatile field is
> > written on x86, and it also mentions the option to use XCHG instead,
> > targeting the actual memory location.
> >
> > My question is, does a compiled VarHandle.setVolatile do the same XADD
> > trick or is it using XCHG?
>
> It uses the same trick, since the VarHandles implementation in OpenJDK
> tunnels through to Unsafe with surrounding safety checks that the compiler
> folds away when it knows it’s safe to do so.
>
> > Has there been a newer performance evaluation with XCHG since the blog
> > post?
>
> Not that I am aware of.
>
> > In other terms, is there a performance penalty/benefit in changing
> > VarHandle.setVolatile() into VarHandle.getAndSet() when considering a
> > modern x86?
>
> I suspect in general there may be a penalty since getAndSet provides
> stronger ordering (a volatile read and write), so I would hold off with any
> global search and replace of setVolatile with getAndSet :-)
>
> I would be interested in looking at performance results and generated
> assembly from some nano benchmarks.
>
> Paul.
>
> > My particular use case is for running code designed for concurrency in
> > non-concurrent fashion and perhaps saving the cost of a MOVE + XADD pair
> > when an XCHG has the very same effect.
> >
> > Thank you for your time.
> > --
> > Best regards,
> > David Karnok

Best regards,
David Karnok