[concurrency-interest] synchronized vs Unsafe#monitorEnter/monitorExit

Gil Tene gil at azulsystems.com
Sat Dec 27 20:10:01 EST 2014


It's not "synchronized" per se that is responsible for the difference; it's the use of the monitorenter and monitorexit bytecodes. Some of the optimizations done for monitors rely on the verified behavior of those bytecodes for correctness. The Unsafe versions are not verified to adhere to the same requirements, which either makes some optimizations impossible or means the optimization designer simply did not bother to optimize the unconfined "could do anything" case.

E.g. the fast, uncontended, unbiased monitor path devolves to a fast-path CAS on the object header in most JVMs (displaced headers, thin locking, Bacon bits, whatever...). But this common optimization often strongly assumes balanced use of monitors, as enforced by the verifier when the monitorenter and monitorexit bytecodes are used. E.g. HotSpot uses displaced headers for this operation, and stores a displaced mark word on the thread stack, knowing (based on the verified bytecode qualities) that the stack frame will not be unwound before the matching monitorexit occurs. Since an Unsafe monitorEnter call may not have a matching monitorExit in the same frame, that optimization would be invalid to perform.
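
To make the balanced-versus-unbalanced distinction concrete, here is a minimal sketch in Java. The class and method names (BalancedLocking, incrementBalanced, acquire, release) are hypothetical and chosen only for illustration. Because the Unsafe monitor methods are not available on recent JDKs, a ReentrantLock stands in for the "enter in one frame, exit in another" pattern; it is not the same mechanism, it only shows the shape of the usage that block-structured synchronized rules out.

```java
import java.util.concurrent.locks.ReentrantLock;

public class BalancedLocking {
    private final Object monitor = new Object();
    // Stand-in for an unstructured monitor (assumption: used here instead of Unsafe).
    private final ReentrantLock lock = new ReentrantLock();
    private int counter;

    // Balanced: the synchronized block is compiled to a monitorenter/monitorexit
    // pair inside this one method, so the lock is released before the frame is gone.
    int incrementBalanced() {
        synchronized (monitor) {
            return ++counter;
        }
    }

    // Unbalanced: the lock is taken here and this frame returns while still holding it.
    void acquire() {
        lock.lock();
    }

    // The matching release happens in a different frame, possibly much later.
    void release() {
        lock.unlock();
    }

    boolean heldByCurrentThread() {
        return lock.isHeldByCurrentThread();
    }

    public static void main(String[] args) {
        BalancedLocking b = new BalancedLocking();
        b.incrementBalanced();
        b.acquire();      // returns with the lock still held
        try {
            b.counter++;  // work done while holding the lock across frames
        } finally {
            b.release();
        }
        System.out.println("counter = " + b.counter);
    }
}
```

In the first method a JIT can rely on the enter and exit living in the same frame; in the acquire/release pair it cannot, which is the situation an Unsafe monitor call puts the JVM in.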

— Gil.

On Dec 27, 2014, at 12:31 PM, Ben Manes <ben_manes at yahoo.com> wrote:

Can someone explain why using Unsafe's monitor methods is substantially worse than synchronized? I had expected them to emit equivalent monitorenter/monitorexit instructions and have similar performance.

My use case is to support a bulk version of CHM#computeIfAbsent, where a single mapping function returns the result for computing multiple entries. I had hoped to bulk lock, insert the unfilled entries, compute, populate, and bulk unlock. An overlapping write would be blocked due to requiring an entry's lock for mutation. I had thought that using Unsafe would allow for achieving this without the memory overhead of a ReentrantLock/AQS per entry, since the synchronized keyword is not flexible enough to provide this structure.
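
The bulk lock, compute, populate, bulk unlock sequence described above could be sketched as follows. This is only an illustration under stated assumptions: the class BulkCompute and the method computeAllIfAbsent are hypothetical names, not part of ConcurrentHashMap, and the sketch uses one ReentrantLock per key, which is exactly the memory overhead the Unsafe approach was meant to avoid. Locks are acquired in sorted key order (an added assumption, requiring Comparable keys) so that overlapping bulk calls cannot deadlock.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;

public class BulkCompute<K extends Comparable<K>, V> {
    private final ConcurrentHashMap<K, V> data = new ConcurrentHashMap<>();
    // Hypothetical per-key lock table; entries are never evicted in this sketch.
    private final ConcurrentHashMap<K, ReentrantLock> locks = new ConcurrentHashMap<>();

    public Map<K, V> computeAllIfAbsent(Set<K> keys, Function<Set<K>, Map<K, V>> loader) {
        List<K> ordered = new ArrayList<>(new TreeSet<>(keys)); // sorted to avoid deadlock
        List<ReentrantLock> held = new ArrayList<>();
        try {
            // Bulk lock: take every key's lock before doing any work.
            for (K key : ordered) {
                ReentrantLock l = locks.computeIfAbsent(key, k -> new ReentrantLock());
                l.lock();
                held.add(l);
            }
            // Find the entries that are still unfilled.
            Set<K> missing = new LinkedHashSet<>();
            for (K key : ordered) {
                if (!data.containsKey(key)) {
                    missing.add(key);
                }
            }
            // A single call to the mapping function computes all missing entries.
            if (!missing.isEmpty()) {
                data.putAll(loader.apply(missing)); // loader must not return null values
            }
            Map<K, V> result = new LinkedHashMap<>();
            for (K key : ordered) {
                V value = data.get(key);
                if (value != null) {
                    result.put(key, value);
                }
            }
            return result;
        } finally {
            // Bulk unlock, in reverse acquisition order.
            for (int i = held.size() - 1; i >= 0; i--) {
                held.get(i).unlock();
            }
        }
    }

    public static void main(String[] args) {
        BulkCompute<String, Integer> cache = new BulkCompute<>();
        Map<String, Integer> lengths = cache.computeAllIfAbsent(
            Set.of("a", "bb"),
            ks -> {
                Map<String, Integer> m = new LinkedHashMap<>();
                for (String k : ks) {
                    m.put(k, k.length());
                }
                return m;
            });
        System.out.println(lengths);
    }
}
```

Note that the lock and unlock loops are not block-structured: the number of locks held depends on the input, which is why the synchronized keyword cannot express this shape directly.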

Thanks,
Ben

Benchmark                                                    Mode  Samples         Score         Error  Units
c.g.b.c.SynchronizedBenchmark.monitor_contention            thrpt       10   3694951.630 ±   34340.707  ops/s
c.g.b.c.SynchronizedBenchmark.monitor_noContention          thrpt       10   8274097.911 ±  164356.363  ops/s
c.g.b.c.SynchronizedBenchmark.reentrantLock_contention      thrpt       10  31668532.247 ±  740850.955  ops/s
c.g.b.c.SynchronizedBenchmark.reentrantLock_noContention    thrpt       10  41380163.703 ± 2270103.507  ops/s
c.g.b.c.SynchronizedBenchmark.synchronized_contention       thrpt       10  22905995.761 ±  117868.968  ops/s
c.g.b.c.SynchronizedBenchmark.synchronized_noContention     thrpt       10  44891601.915 ± 1458775.665  ops/s



_______________________________________________
Concurrency-interest mailing list
Concurrency-interest at cs.oswego.edu
http://cs.oswego.edu/mailman/listinfo/concurrency-interest
