[concurrency-interest] synchronized vs Unsafe#monitorEnter/monitorExit
gil at azulsystems.com
Sat Dec 27 20:10:01 EST 2014
It's not "synchronized" per se that is responsible for the difference. It's the use of the monitorenter and monitorexit bytecodes. Some of the optimizations done for monitors rely on the verified behavior of those bytecodes for correctness. The Unsafe versions are not verified to adhere to the same requirements, which either makes some optimizations impossible or simply left the optimization designers with little reason to bother optimizing the unconfined "could do anything" case.
E.g. the fast, uncontended, unbiased monitor path devolves to a fast-path CAS on the object header in most JVMs (displaced headers, thin locking, Bacon bits, whatever...). But this common optimization often strongly assumes balanced use of monitors, as enforced by the verifier when the monitorenter and monitorexit bytecodes are used. HotSpot, for instance, uses displaced headers for this operation and stores the displaced mark word on the thread stack, knowing (based on the verified bytecode qualities) that the stack frame will not be unwound before the matching monitorexit occurs. Since an Unsafe monitorEnter call may not have a matching monitorExit in the same frame, that optimization would be invalid to perform.
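As a rough illustration of the uncontended fast path described above, here is a toy Java model of a "thin lock". It is only a sketch under stated assumptions, not how HotSpot or any other JVM implements monitors: the class name ThinLockSketch, the AtomicLong standing in for the object header, and the encoding (0 means unlocked, anything else is the owner's thread id) are all invented for the example. It leaves out recursion, contention handling, and the on-stack displaced mark word, which is exactly the part that relies on enter and exit being balanced within a frame.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Toy model of an uncontended thin-lock fast path (hypothetical, not a JVM implementation). */
final class ThinLockSketch {
    private static final long UNLOCKED = 0L;

    // Stands in for the object's header word. Assumed encoding:
    // 0 = unlocked, otherwise the id of the owning thread (ids are always positive).
    private final AtomicLong header = new AtomicLong(UNLOCKED);

    /** Fast-path enter: a single CAS from "unlocked" to "owned by the current thread". */
    boolean tryEnterFast() {
        return header.compareAndSet(UNLOCKED, Thread.currentThread().getId());
    }

    /** Fast-path exit: a single CAS back to "unlocked"; fails if the caller is not the owner. */
    boolean exitFast() {
        return header.compareAndSet(Thread.currentThread().getId(), UNLOCKED);
    }

    /** True if the header currently records the calling thread as owner. */
    boolean isHeldByCurrentThread() {
        return header.get() == Thread.currentThread().getId();
    }
}
```

A real JVM would fall back to a slower path (for example, inflating to a full monitor) when either CAS fails. The toy simply reports the failure to the caller.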
On Dec 27, 2014, at 12:31 PM, Ben Manes <ben_manes at yahoo.com> wrote:
Can someone explain why using Unsafe's monitor methods is substantially worse than synchronized? I had expected them to emit equivalent monitorEnter/monitorExit instructions and have similar performance.
My use case is to support a bulk version of CHM#computeIfAbsent, where a single mapping function returns the result for computing multiple entries. I had hoped to bulk lock, insert the unfilled entries, compute, populate, and bulk unlock. An overlapping write would be blocked due to requiring an entry's lock for mutation. I had thought that using Unsafe would allow for achieving this without the memory overhead of a ReentrantLock/AQS per entry, since the synchronized keyword is not flexible enough to provide this structure.
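To make the "bulk lock, compute, populate, bulk unlock" flow concrete, here is a minimal Java sketch. It deliberately swaps in a different technique from the one proposed above: a small fixed array of ReentrantLock stripes rather than per-entry monitors via Unsafe, so it trades some concurrency for bounded memory. The names BulkComputeSketch, computeAllIfAbsent, stripeOf, and bulkLoader are hypothetical, and the sketch assumes every write to the map goes through this class and that the loader returns no null values.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;

/** Hypothetical bulk computeIfAbsent using lock striping (not the Unsafe-based design). */
final class BulkComputeSketch<K, V> {
    private final ConcurrentHashMap<K, V> map = new ConcurrentHashMap<>();
    private final ReentrantLock[] stripes;

    BulkComputeSketch(int stripeCount) {
        stripes = new ReentrantLock[stripeCount];
        for (int i = 0; i < stripeCount; i++) {
            stripes[i] = new ReentrantLock();
        }
    }

    // Maps a key to the index of the stripe that guards it.
    private int stripeOf(K key) {
        return (key.hashCode() & 0x7fffffff) % stripes.length;
    }

    /** Locks all stripes covering the keys, loads the missing ones in one call, then unlocks. */
    Map<K, V> computeAllIfAbsent(Collection<K> keys, Function<Set<K>, Map<K, V>> bulkLoader) {
        // Sorted, de-duplicated stripe indices give every caller the same
        // acquisition order, which avoids deadlock between overlapping bulk calls.
        TreeSet<Integer> needed = new TreeSet<>();
        for (K key : keys) {
            needed.add(stripeOf(key));
        }
        List<ReentrantLock> held = new ArrayList<>();
        try {
            for (int index : needed) {          // bulk lock
                stripes[index].lock();
                held.add(stripes[index]);
            }
            Set<K> missing = new LinkedHashSet<>();
            for (K key : keys) {
                if (!map.containsKey(key)) {
                    missing.add(key);
                }
            }
            if (!missing.isEmpty()) {           // compute and populate
                map.putAll(bulkLoader.apply(missing));
            }
            Map<K, V> result = new LinkedHashMap<>();
            for (K key : keys) {
                V value = map.get(key);
                if (value != null) {
                    result.put(key, value);
                }
            }
            return result;
        } finally {
            for (int i = held.size() - 1; i >= 0; i--) {   // bulk unlock
                held.get(i).unlock();
            }
        }
    }
}
```

Note that the acquire and release calls here are not lexically paired per lock, which is the shape a single synchronized block cannot express for a set of locks whose size is only known at run time.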
Benchmark                                                 Mode   Samples         Score         Error  Units
c.g.b.c.SynchronizedBenchmark.monitor_contention          thrpt       10   3694951.630 ±   34340.707  ops/s
c.g.b.c.SynchronizedBenchmark.monitor_noContention        thrpt       10   8274097.911 ±  164356.363  ops/s
c.g.b.c.SynchronizedBenchmark.reentrantLock_contention    thrpt       10  31668532.247 ±  740850.955  ops/s
c.g.b.c.SynchronizedBenchmark.reentrantLock_noContention  thrpt       10  41380163.703 ± 2270103.507  ops/s
c.g.b.c.SynchronizedBenchmark.synchronized_contention     thrpt       10  22905995.761 ±  117868.968  ops/s
c.g.b.c.SynchronizedBenchmark.synchronized_noContention   thrpt       10  44891601.915 ± 1458775.665  ops/s