[concurrency-interest] synchronized vs Unsafe#monitorEnter/monitorExit

Ben Manes ben_manes at yahoo.com
Tue Dec 30 20:04:53 EST 2014

Sure, but here we're talking about a range more commonly in the tens to low thousands. The larger the batch size, the more expensive the batch query, so concurrent lookups might incur an unacceptable delay waiting for a large batch to complete. At that point it's a game of tuning the application for excessively large batch scenarios.

     On Tuesday, December 30, 2014 4:29 PM, Oleksandr Otenko <oleksandr.otenko at oracle.com> wrote:

  ok. You must consider that using the object header means different JVMs will have different overheads and limits. Some JVMs will not let you have tens of millions of objects on which you want to synchronize - contended or not - just because there is not enough space in their object headers for all the book-keeping info.
 On 31/12/2014 00:12, Ben Manes wrote:
  > Why do you need synchronized specifically? (and not j.u.c.locks.Lock lock/unlock)
  To minimize memory usage, as the lock is per entry. Consider this as an extension to ConcurrentHashMap's computeIfAbsent(K key) that instead accepts Iterable<K> keys. A full j.u.c lock is ideal, but that requires each entry to embed a lock instance, which adds significant bloat. In the case of a cache, writes are rare and writes to the same entries are rarer still. Thus, synchronized can use the object's header as a thin lock until contention is detected and the JVM inflates it to a full mutex. After inflation, the JVM may deflate the lock if it reverts to low-contention usage. That means no wasted memory to support a single use case, which may not be a very commonly used pattern. If the Unsafe versions were optimizable, then using synchronized would be an ideal choice.
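A minimal sketch of the memory trade-off described above (the entry types and helper here are hypothetical, not from any cache library): synchronizing on the entry object itself uses only the object header as a thin lock, while embedding a j.u.c lock adds a field plus a lock object to every entry.

```java
// Hypothetical entry types illustrating per-entry locking cost.
// Entry: no lock field; the JVM uses the object header as a thin lock
// until contention forces inflation to a full monitor.
final class Entry<V> {
    volatile V value;
}

// LockedEntry: embeds a ReentrantLock, adding a reference field plus
// the AQS-based lock object for every entry in the cache.
final class LockedEntry<V> {
    final java.util.concurrent.locks.ReentrantLock lock =
        new java.util.concurrent.locks.ReentrantLock();
    volatile V value;
}

class Demo {
    // Blocks concurrent loads of the same entry without any extra fields.
    static <V> V computeIfAbsent(Entry<V> entry,
                                 java.util.function.Supplier<V> loader) {
        synchronized (entry) {
            if (entry.value == null) {
                entry.value = loader.get();
            }
            return entry.value;
        }
    }
}
```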
  There are alternatives, of course. The approach in Guava was to be non-blocking and populate the cache after the bulk load completes. This overwrites any entries that appeared while the loadAll was computed, so it doesn't solve the thundering herd problem. This is reasonable based on the assumption that local cache misses don't cause as much pain as, say, a memcached restart causing a database query storm. Another approach might use lock striping, e.g. 1024 real locks that can be hashed to and acquired in ascending order. That adds extra complexity and hash collisions might occur, but it wouldn't be too bad in practice. And yet another might be a lazy lock field per entry, but that is still wasteful in large caches.
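The lock-striping alternative mentioned above could be sketched as follows (class and method names are my own, purely illustrative): a fixed pool of lock objects is hashed to by key, and stripe indices are acquired in ascending order so overlapping bulk operations cannot deadlock.

```java
import java.util.*;
import java.util.function.Supplier;

// Hypothetical lock-striping sketch: a fixed array of monitor objects,
// acquired in ascending index order to give a global lock ordering.
final class StripedLocks {
    private final Object[] stripes;

    StripedLocks(int size) {
        stripes = new Object[size];
        for (int i = 0; i < size; i++) {
            stripes[i] = new Object();
        }
    }

    /** Runs the supplier while holding the stripes for all keys. */
    <T> T withLocks(Collection<?> keys, Supplier<T> supplier) {
        // Map keys to distinct stripe indices; TreeSet deduplicates
        // collisions and yields them in ascending order.
        SortedSet<Integer> indices = new TreeSet<>();
        for (Object key : keys) {
            indices.add(Math.floorMod(key.hashCode(), stripes.length));
        }
        return lockInOrder(indices.iterator(), supplier);
    }

    private <T> T lockInOrder(Iterator<Integer> it, Supplier<T> supplier) {
        if (!it.hasNext()) {
            return supplier.get();
        }
        synchronized (stripes[it.next()]) {
            return lockInOrder(it, supplier);
        }
    }
}
```

Note the collision cost alluded to above: two unrelated keys that hash to the same stripe serialize on the same monitor.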
  There's probably a solution using some type of reservation entry that contains a lock reference and, after being populated, transitions to an implementation that doesn't have a lock. That can be done within the hash table, like CHM's ReservationNode, but is harder to do elegantly through a decorator.
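One way such a reservation scheme could look (a hypothetical sketch, not CHM's actual ReservationNode): a placeholder holding a future is published first, readers block on it until the bulk load completes, and the table entry is then swapped for a plain value node carrying no synchronization state.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical two-state entry: Reservation while loading, Loaded after.
interface Node<V> {
    V get();
}

// Published before the bulk load; concurrent readers block on the future.
final class Reservation<V> implements Node<V> {
    private final CompletableFuture<V> future = new CompletableFuture<>();

    public V get() {
        return future.join();  // waits until complete() is called
    }

    void complete(V value) {
        future.complete(value);
    }
}

// Final state after population: just the value, no lock or future.
final class Loaded<V> implements Node<V> {
    private final V value;

    Loaded(V value) {
        this.value = value;
    }

    public V get() {
        return value;
    }
}
```

After the load, the map entry would be replaced with a `Loaded`, so long-lived entries pay no per-entry synchronization cost.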
  Anyway, hopefully it's clear now why Unsafe appeared like an attractive solution, had it performed well.
       On Tuesday, December 30, 2014 6:54 AM, Oleksandr Otenko <oleksandr.otenko at oracle.com> wrote:
    Why do you need synchronized specifically? (and not j.u.c.locks.Lock lock/unlock)
  On 28/12/2014 20:15, Ben Manes wrote:
  That's a nifty workaround but, as you said, not reliable for a general API. This case is a multi-get for a cache library, so the keys and use cases aren't known in advance.
  When adding bulk loading to Guava, due to the internal complexity we allowed it to be racy by not blocking concurrent calls for shared entries. This has to be done regardless, because a loadAll(keys) may return a Map<K, V> with more entries than originally requested, and those extra entries should be cached. These cases were trivial to handle when I had previously written a Map<K, Future<V>> to support multi-get, but that added a lot of bloat per entry. I had hoped Unsafe#monitorEnter would be a nice alternative and, if the API were stable, it arguably still could be due to the low write rate of caches. For now I'll leave it in a backlog of performance optimization ideas, as we did with Guava, maybe revisiting once the rewrite stabilizes.
  Thanks, -Ben 
       On Sunday, December 28, 2014 5:18 AM, Peter Levart <peter.levart at gmail.com> wrote:
 On 12/27/2014 09:31 PM, Ben Manes wrote:
  Can someone explain why using Unsafe's monitor methods is substantially worse than synchronized? I had expected them to emit equivalent monitorEnter/monitorExit instructions and have similar performance.
  My use case is to support a bulk version of CHM#computeIfAbsent, where a single mapping function returns the result for computing multiple entries. I had hoped to bulk lock, insert the unfilled entries, compute, populate, and bulk unlock. An overlapping write would be blocked by requiring an entry's lock for mutation.
 Hi Ben,
 If "multiple entries" means fewer than, say, 50 or even a little more (using many more locks than that for one bulk operation is suboptimal anyway), you could try constructing a recursive method to bulk-lock on a list of objects and execute a provided function:
     public static <T> T synchronizedAll(Iterable<?> locks, Supplier<T> supplier) {
         return synchronizedAll(locks.iterator(), supplier);
     }

     private static <T> T synchronizedAll(Iterator<?> locksIterator, Supplier<T> supplier) {
         if (locksIterator.hasNext()) {
             synchronized (locksIterator.next()) {
                 return synchronizedAll(locksIterator, supplier);
             }
         } else {
             return supplier.get();
         }
     }
 Note that to avoid deadlocks, the locks list has to be sorted using a Comparator that is consistent with your keys' equals() method...
 Regards, Peter 
     I had thought that using Unsafe would allow achieving this without the memory overhead of a ReentrantLock/AQS per entry, since the synchronized keyword is not flexible enough to provide this structure.
  Thanks, Ben 
  Benchmark                                                  Mode  Samples         Score         Error  Units
  c.g.b.c.SynchronizedBenchmark.monitor_contention          thrpt       10   3694951.630 ±   34340.707  ops/s
  c.g.b.c.SynchronizedBenchmark.monitor_noContention        thrpt       10   8274097.911 ±  164356.363  ops/s
  c.g.b.c.SynchronizedBenchmark.reentrantLock_contention    thrpt       10  31668532.247 ±  740850.955  ops/s
  c.g.b.c.SynchronizedBenchmark.reentrantLock_noContention  thrpt       10  41380163.703 ± 2270103.507  ops/s
  c.g.b.c.SynchronizedBenchmark.synchronized_contention     thrpt       10  22905995.761 ±  117868.968  ops/s
  c.g.b.c.SynchronizedBenchmark.synchronized_noContention   thrpt       10  44891601.915 ± 1458775.665  ops/s
Concurrency-interest mailing list
Concurrency-interest at cs.oswego.edu

