[concurrency-interest] ConcurrentHashMap computeIfAbsent

Peter Levart peter.levart at gmail.com
Sun Dec 21 14:33:58 EST 2014


If we wanted to detect any re-entrance to the whole CHM by the same 
thread, then the state would have to be placed outside the lock object 
(outside the head of the bucket). Conceptually it would have to be a 
ThreadLocal<Boolean> instance (per CHM instance).

As I understand it, the problematic re-entrances are those that re-enter 
synchronized blocks. All other code (outside synchronized blocks) is safe 
to re-enter, since it is already safe to execute from multiple threads. 
Re-entering a synchronized block causes the hardest-to-diagnose effects 
(what's left after that are only deadlocks). If the locks were 
non-reentrant and threw IllegalStateException on an attempt to re-enter, 
we would be fine.

One way to get such locks is to check holdsLock() before locking:

if (Thread.holdsLock(lock)) throw new IllegalStateException();
synchronized (lock) {
    // ...
}

... but measuring on my i7 Linux PC shows that the holdsLock() check adds 
roughly two thirds of the overhead of uncontended locking.
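For concreteness, here is a minimal runnable sketch of that pattern (the class and method names are mine, purely for illustration): checking holdsLock() before each synchronized block turns a silent re-entry into an IllegalStateException.

```java
// Minimal demo of a non-reentrant lock guard built on Thread.holdsLock().
// Instead of silently re-entering the monitor, the guard throws.
public class NonReentrantLockDemo {
    private final Object lock = new Object();

    void outer() {
        if (Thread.holdsLock(lock)) throw new IllegalStateException("re-entry");
        synchronized (lock) {
            inner();  // re-enters while the lock is already held
        }
    }

    void inner() {
        if (Thread.holdsLock(lock)) throw new IllegalStateException("re-entry");
        synchronized (lock) {
            // never reached when called from outer()
        }
    }

    public static void main(String[] args) {
        NonReentrantLockDemo d = new NonReentrantLockDemo();
        try {
            d.outer();
            System.out.println("no re-entry detected");
        } catch (IllegalStateException e) {
            System.out.println("re-entry detected");  // this branch runs
        }
    }
}
```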

If we added an int field to the Node class, we would extend it by 4 bytes 
on a 32-bit JVM, but on a 64-bit JVM it would be packed together with the 
hash field and the Node footprint would not increase. Current Node sizes 
on a 32-bit JVM are: Node/ReservationNode 28 bytes, TreeNode 48 bytes, 
TreeBin 44 bytes, ForwardingNode 32 bytes. Given that plain Nodes are in 
the majority, this amounts to an approx. 14% increase in heap usage for 
nodes on 32-bit JVMs, but no increase on 64-bit JVMs.
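As a quick sanity check of the 14% figure (assuming, as stated above, that plain 28-byte Nodes dominate the heap):

```java
// Sanity check of the ~14% figure quoted above: one extra 4-byte int
// field on a 28-byte plain Node (32-bit JVM sizes from the text).
public class NodeOverhead {
    public static void main(String[] args) {
        double plainNodeBytes = 28;  // plain Node/ReservationNode, 32-bit JVM
        double extraIntBytes = 4;    // the proposed int field
        double increasePct = extraIntBytes / plainNodeBytes * 100;
        System.out.printf("%.1f%%%n", increasePct);  // prints 14.3%
    }
}
```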

If this is a no-go, then I'm out of ideas short of checking whether 
holdsLock() could be made any faster. Perhaps it just needs to be 
intrinsified (if it's not already)? If Node increase is acceptable then 
read further...

Since a thread id is a long value, packing it into an int would give us 
just a kind of hash. But it is a good-enough hash: an additional 
holdsLock() confirmation would only be needed from a purely theoretical 
standpoint, to rule out a hash collision.
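A possible fold of the 64-bit id into a non-zero int might look like this (the class and helper name are hypothetical, not anything the JDK provides):

```java
public class ThreadIdHash {
    // Hypothetical xor-fold of a 64-bit thread id into a non-zero int.
    // Two distinct ids can collide, but only with ~2^-32 probability,
    // which is why a confirming holdsLock() would matter only in theory.
    static int hash(long threadId) {
        int h = (int) (threadId ^ (threadId >>> 32));  // fold high word into low
        return h == 0 ? 1 : h;  // reserve 0 to mean "no thread recorded"
    }

    public static void main(String[] args) {
        System.out.println(hash(Thread.currentThread().getId()) != 0);
    }
}
```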

Here's an illustration of what I mean:


The changes I made are purely mechanical. I inserted node.checkNotEntered() 
at the start of each synchronized block that doesn't call back user code:

synchronized (node) {
    node.checkNotEntered();
    // body that doesn't call back user code
}

... and wrapped the body of each synchronized block that does call back 
user code in a try/finally:

synchronized (node) {
    node.checkNotEntered();
    try {
        // body that calls back user code
    } finally {
        // the re-entrance marker is cleared here
    }
}

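Putting the pieces together, the whole scheme might look roughly like this (again, all field and method names are my hypothetical illustrations, not the actual CHM code):

```java
// Hypothetical sketch of the two patterns above. A block that calls back
// user code records a hash of the current thread's id before the callback
// and clears it afterwards; any synchronized block on the same node can
// then detect re-entrance by the same thread.
public class BinNode {
    volatile int enteredBy;  // 0 = free; otherwise hash of the owning thread's id

    static int threadIdHash() {
        long id = Thread.currentThread().getId();
        int h = (int) (id ^ (id >>> 32));  // fold 64 bits into 32
        return h == 0 ? 1 : h;             // reserve 0 for "free"
    }

    void checkNotEntered() {
        if (enteredBy == threadIdHash())
            throw new IllegalStateException("Recursive update");
    }

    // Pattern for blocks that do NOT call back user code:
    void internalOnly() {
        synchronized (this) {
            checkNotEntered();
            // ... internal work ...
        }
    }

    // Pattern for blocks that DO call back user code:
    void withCallback(Runnable userFunction) {
        synchronized (this) {
            checkNotEntered();
            enteredBy = threadIdHash();
            try {
                userFunction.run();   // may (illegally) re-enter the map
            } finally {
                enteredBy = 0;        // clear before releasing the monitor
            }
        }
    }

    public static void main(String[] args) {
        BinNode n = new BinNode();
        try {
            n.withCallback(() -> n.internalOnly());  // illegal re-entry
            System.out.println("no re-entry");
        } catch (IllegalStateException e) {
            System.out.println("re-entry detected");  // this branch runs
        }
    }
}
```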
What do you think? Is the space/diagnosability trade-off worth it?

Regards, Peter

On 12/21/2014 03:20 PM, Doug Lea wrote:
> On 12/20/2014 04:11 PM, Benjamin Manes wrote:
>> If an efficient internal tryLock intrinsic became available, we might do
>>     more, because throwing here helps diagnose unrelated-looking problems
>>     in other threads, and performing checks only when lock is unavailable
>>     costs almost nothing.
>> Is the tryLock intrinsic necessary to be performant and safe? I tried
>> using Thread.holdsLock() and it halved the performance. I then tried
>> stashing the thread's id into an extra field only stored on
>> ReservationNode and saw equal performance on my 4-core/8HT MacBook.
> This is worth considering, but it only detects one form of
> misbehavior when the bin doesn't hold any other nodes.
> Otherwise, the first node need not be a ReservationNode. Also,
> it doesn't alone deal with interactions with other
> lambda-accepting methods.
> Here is some background context in case others have any
> thoughts about improving support:
> By popular demand, CHM strengthens ConcurrentMap.computeIfAbsent
> specs (similarly for computeIfPresent etc) to guarantee atomicity
> while the function is executed. This requires that we hold a lock
> (for the bin in which the entry will be placed) during the call
> to the supplied function. We tell users not to use functions
> that have any other potential side effects on the map,
> but we cannot enforce it. Holding a lock adds to the possible
> bad outcomes of violating the side-effect-freedom requirement.
> We'd like to be helpful in diagnosing violations, because
> these outcomes can be very mysterious-looking.
> CHM itself cannot directly help with some potential
> lock-based errors. For example, if the function for
> computeIfAbsent(key1) can invoke computeIfAbsent(key2)
> and vice versa, and the keys hash to different bins, then they
> can encounter a classic deadlock. But this and related cases
> can be diagnosed using existing JVM cycle-detection tools.
> The nature of other potential lock-based errors depends
> on the kind of lock used. Some preliminary versions of
> jdk8 CHM used non-reentrant locks. (It now uses reentrant
> locks.)
> With non-reentrant locks, cases of infinite recursion
> (i.e., m.computeIfAbsent(k,f) somehow calling itself) result in
> lockups rather than infinite loops. Similarly for mutually
> recursive computeIfAbsent calls for key1 and key2 that happen
> to hash into the same bin. However these cases can be detected
> with very low performance impact (by doing so only if a tryLock
> fails). Which led us to add CHM method throws specs to allow
> this. Even though jdk8 CHM now uses reentrant locks, we want to
> keep open the ability to use potentially faster non-reentrant
> approaches in the future.
> With reentrant locks, infinite computeIfAbsent recursions
> on the same key are not all that different than any other
> case of infinite recursion. This is "detectable" only by
> throwing after the number of calls exceeds a threshold.
> But the threshold can arguably be set to 1 because of the
> no-side-effects rule, and one special case of this check
> (as above, via thread IDs in ReservationNodes) can be done
> with only small impact. But it's not clear to me whether
> this is worthwhile by itself.
> The remaining mysterious cases are nested computeIfAbsent
> calls for different keys that happen to hash to the same bin.
> These conceptually ought to deadlock. Instead they will
> sometimes happen to "work" but other times loop.
> (This appears to be the underlying undetected user bug in
> JDK-8062841.) It seems that this is beyond our current ability
> to diagnose, but suggestions would be welcome.
> -Doug

