[concurrency-interest] ConcurrentHashMap computeIfAbsent

Doug Lea dl at cs.oswego.edu
Sun Dec 21 09:20:34 EST 2014


On 12/20/2014 04:11 PM, Benjamin Manes wrote:
>     If an efficient internal tryLock intrinsic became available, we might do
>     more, because throwing here helps diagnose unrelated-looking problems in
>     other threads, and performing checks only when lock is unavailable costs
>     almost nothing.
>
>
> Is the tryLock intrinsic necessary to be performant and safe? I tried using
> Thread.holdsLock() and it halved the performance. I then tried stashing the
> thread's id into an extra field only stored on ReservationNode and saw equal
> performance on my 4-core/8HT macbook.

This is worth considering, but it only detects one form of
misbehavior when the bin doesn't hold any other nodes.
Otherwise, the first node need not be a ReservationNode. Also,
it doesn't alone deal with interactions with other
lambda-accepting methods.

Here is some background context in case others have any
thoughts about improving support:

By popular demand, CHM strengthens ConcurrentMap.computeIfAbsent
specs (similarly for computeIfPresent etc) to guarantee atomicity
while the function is executed. This requires that we hold a lock
(for the bin in which the entry will be placed) during the call
to the supplied function. We tell users not to use functions
that have any other potential side effects on the map,
but we cannot enforce it. Holding a lock adds to the possible
bad outcomes of violating the side-effect-freedom requirement.
We'd like to be helpful in diagnosing violations, because
these outcomes can be very mysterious-looking.

CHM itself cannot directly help with some potential
lock-based errors. For example, if the function for
computeIfAbsent(key1) can invoke computeIfAbsent(key2)
and vice versa, and the keys hash to different bins, then they
can encounter a classic deadlock. But this and related cases
can be diagnosed using existing JVM cycle-detection tools.

The nature of other potential lock-based errors depends
on the kind of lock used. Some preliminary versions of
jdk8 CHM used non-reentrant locks. (It now uses reentrant
locks.)

With non-reentrant locks, cases of infinite recursion
(i.e., m.computeIfAbsent(k,f) somehow calling itself) result in
lockups rather than infinite loops. Similarly for mutually
recursive computeIfAbsent calls for key1 and key2 that happen
to hash into the same bin. However these cases can be detected
with very low performance impact (by doing so only if a tryLock
fails). Which led us to add CHM method throws specs to allow
this. Even though jdk8 CHM now uses reentrant locks, we want to
keep open the ability to use potentially faster non-reentrant
approaches in the future.

With reentrant locks, infinite computeIfAbsent recursions
on the same key are not all that different than any other
case of infinite recursion. This is "detectable" only by
throwing after the number of calls exceeds a threshold.
But the threshold can arguably be set to 1 because of the
no-side-effects rule, and one special case of this check
(as above, via thread IDs in ReservationNodes) can be done
with only small impact. But it's not clear to me whether
this is worthwhile by itself.

The remaining mysterious cases are nested computeIfAbsent
calls for different keys that happen to hash to the same bin.
These conceptually ought to deadlock. Instead they will
sometimes happen to "work" but other times loop.
(This appears to be the underlying undetected user bug in
JDK-8062841.) It seems that this is beyond our current ability
to diagnose, but suggestions would be welcome.

-Doug



More information about the Concurrency-interest mailing list