[concurrency-interest] synchronized vs Unsafe#monitorEnter/monitorExit

Oleksandr Otenko oleksandr.otenko at oracle.com
Tue Dec 30 19:29:11 EST 2014

ok. You must consider that using the object header means different JVMs 
will have different overheads and limits. Some JVMs will not let you 
have tens of millions of objects on which you want to synchronize - 
contended or not - just because there is not enough space in their 
object headers for all the book-keeping info.


On 31/12/2014 00:12, Ben Manes wrote:
> > Why do you need synchronized specifically? (and not j.u.c.locks.Lock 
> lock/unlock)
> To minimize memory usage, as the lock is per entry. Consider this as 
> an extension to ConcurrentHashMap's computeIfAbsent(K key), but that 
> instead accepts Iterable<K> keys. A full j.u.c lock is ideal, but that 
> requires each entry embeds a lock instance and that adds significant 
> bloat. In the case of a cache, writes are rare and even rarer to the 
> same entries. Thus, synchronizing should use the Object's header as a 
> think lock until contention is detected and the JVM inflates the lock 
> to a full mutex. After inflation, the JVM could decide to deinflate 
> the lock if it reverts back to low contention usages. That means no 
> wasted memory to support a single use-case, which may not be a very 
> commonly used pattern. If the Unsafe versions were optimizable, then 
> using synchronized would be an ideal choice.
> There are alternatives, of course. The approach in Guava was to be 
> non-blocking and populate the cache after the bulk load completes. 
> This overwrites any entries that appeared while the loadAll was 
> computed, so it doesn't solve the thundering herd problem. This is 
> reasonable based on the assumption that local cache misses don't cause 
> the as much pain as, say, a memcached restart causing a database query 
> storm. Another approach might use lock striping, e.g. 1024 real locks 
> that can be hashed to and are acquired in ascending order. That adds 
> more complexity and hash collisions might occur, but it wouldn't be 
> too bad in practice. It does add extra complexity. And yet another 
> might be a lazy lock field per entry, but that is still wasteful in 
> large caches.
> There's probably a solution using some type of reservation entry that 
> contains a lock reference, and after populated transitions to an 
> implementation that doesn't have a lock. That can be done within the 
> hash table, like CHM's ReservationNode, but is harder to do elegantly 
> through a decorator.
> Anyway, hopefully its clear now why Unsafe appeared like an attractive 
> solution if it had performed well.
> On Tuesday, December 30, 2014 6:54 AM, Oleksandr Otenko 
> <oleksandr.otenko at oracle.com> wrote:
> Why do you need synchronized specifically? (and not j.u.c.locks.Lock 
> lock/unlock)
> Alex
> On 28/12/2014 20:15, Ben Manes wrote:
>> That's an nifty workaround but, as you said, not reliable in the case 
>> of a general api. This case is a multi-get for a cache library, so 
>> the keys and use-cases aren't known.
>> When adding bulk loading to Guava, due to the internal complexity we 
>> allowed it to racy by not blocking concurrent calls for shared 
>> entries. This has to be done regardless because a loadAll(keys) may 
>> return a Map<K, V> with more entries than originally requested, and 
>> those extra entries should be cached. These cases were trivial to 
>> handle when I had previously written a Map<K, Future<V>> to support 
>> multi-get, but a lot of bloat per entry. I had hoped 
>> Unsafe#monitorEnter would be a nice alternative and, if the API was 
>> stable, it arguably still could be due to the low write rate of 
>> caches. For now I'll leave it in a backlog of performance 
>> optimization ideas like we did with Guava, maybe revisiting once the 
>> rewrite stabilizes.
>> Thanks,
>> -Ben
>> On Sunday, December 28, 2014 5:18 AM, Peter Levart 
>> <peter.levart at gmail.com> <mailto:peter.levart at gmail.com> wrote:
>> On 12/27/2014 09:31 PM, Ben Manes wrote:
>>> Can someone explain why using Unsafe's monitor methods are 
>>> substantially worse than synchronized? I had expected them to emit 
>>> equivalent monitorEnter/monitorExit instructions and have similar 
>>> performance.
>>> My use case is to support a bulk version of CHM#computeIfAbsent, 
>>> where a single mapping function returns the result for computing 
>>> multiple entries. I had hoped to bulk lock, insert the unfilled 
>>> entries, compute, populate, and bulk unlock. An overlapping write 
>>> would be blocked due to requiring an entry's lock for mutation.
>> Hi Ben,
>> If "multiple entries" means less than say 50 or even a little more 
>> (using more locks than that for one bulk operation is not very 
>> optimal anyway), you could try constructing a recursive method to 
>> bulk-lock on the list of objects and execute a provided function:
>>     public static <T> T synchronizedAll(Iterable<?> locks, 
>> Supplier<T> supplier) {
>>         return synchronizedAll(locks.iterator(), supplier);
>>     }
>>     private static <T> T synchronizedAll(Iterator<?> locksIterator, 
>> Supplier<T> supplier) {
>>         if (locksIterator.hasNext()) {
>>             synchronized (locksIterator.next()) {
>>                 return synchronizedAll(locksIterator, supplier);
>>             }
>>         } else {
>>             return supplier.get();
>>         }
>>     }
>> Note that to avoid deadlocks, the locks list has to be sorted using a 
>> Comparator that is consistent with your key's equals() method...
>> Regards, Peter
>>> I had thought that using Unsafe would allow for achieving this 
>>> without the memory overhead of a ReentrantLock/AQS per entry, since 
>>> the synchronized keyword is not flexible enough to provide this 
>>> structure.
>>> Thanks,
>>> Ben
>>> Benchmark                  Mode  Samples         Score     Error  Units
>>> c.g.b.c.SynchronizedBenchmark.monitor_contention            thrpt   
>>>     10   3694951.630 ± 34340.707  ops/s
>>> c.g.b.c.SynchronizedBenchmark.monitor_noContention          thrpt   
>>>     10 8274097.911 ±  164356.363  ops/s
>>> c.g.b.c.SynchronizedBenchmark.reentrantLock_contention      thrpt   
>>>     10  31668532.247 ±  740850.955  ops/s
>>> c.g.b.c.SynchronizedBenchmark.reentrantLock_noContention    thrpt   
>>>     10  41380163.703 ± 2270103.507  ops/s
>>> c.g.b.c.SynchronizedBenchmark.synchronized_contention       thrpt   
>>>     10  22905995.761 ±  117868.968  ops/s
>>> c.g.b.c.SynchronizedBenchmark.synchronized_noContention     thrpt   
>>>     10  44891601.915 ± 1458775.665  ops/s
>>> _______________________________________________
>>> Concurrency-interest mailing list
>>> Concurrency-interest at cs.oswego.edu  <mailto:Concurrency-interest at cs.oswego.edu>
>>> http://cs.oswego.edu/mailman/listinfo/concurrency-interest
>> _______________________________________________
>> Concurrency-interest mailing list
>> Concurrency-interest at cs.oswego.edu  <mailto:Concurrency-interest at cs.oswego.edu>
>> http://cs.oswego.edu/mailman/listinfo/concurrency-interest

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://cs.oswego.edu/pipermail/concurrency-interest/attachments/20141231/aaa6601a/attachment-0001.html>

More information about the Concurrency-interest mailing list