[concurrency-interest] ConcurrentHashMap computeIfAbsent

Peter Levart peter.levart at gmail.com
Sun Dec 21 17:48:08 EST 2014


On 12/21/2014 10:20 PM, Benjamin Manes wrote:
> Thanks Peter,
>
> I don't think compute() changes my suggestion. If the bucket count is 
> larger than one, then I was suggesting inserting a regular Node with a 
> null value that is updated when the mapping function completes 
> successfully.

You mean "a Node that is *removed*" when computation ends. Yes, as I 
said, that is a possibility. A "marker" node inserted just after 1st 
node (or before 1st node?) in bucket and later removed. This would mean 
three additional volatile writes though (next field is volatile). Not to 
mention what to do when nodes are forming a tree (yes, that's possible).

> The node is scavenged when the computation fails, but since we know 
> the instance that extra scan should be cheap. By updating a regular 
> node I had hoped one could avoid garbage in the common case.

If common case was inserting, then perhaps, yes. But currently the 
skipping detection is designed by checking the hash field which is final.


>
> Experimenting to determine the feasibility of this type of change is 
> too invasive for me to feel comfortable trying given the interactions 
> with all the other map operations that I may not fully consider at the 
> moment. Since ReservationNode is a special type, it seems likely that 
> there are scenarios that I'm not considering.

A change using "marker" nodes would be very invasive. Each operation 
itself traverses the node chain or tree and either inserts, modifies or 
removes a node (possibly the 1st node of the bucket). Adding insertion 
of "marker" node and later removal would complicate the 
traversal/insertion/removal logic of each individual operation with user 
call-backs in it's own unique way. I don't know if such change is 
warranted just to get better diagnostics.

Regards, Peter

>
> On Sun, Dec 21, 2014 at 12:53 PM, Peter Levart <peter.levart at gmail.com 
> <mailto:peter.levart at gmail.com>> wrote:
>
>     Hi Benjamin,
>
>     Doug is certainly the one to give you the best explanation, but
>     let me try...
>
>     On 12/21/2014 08:36 PM, Benjamin Manes wrote:
>>
>>         This is worth considering, but it only detects one form of
>>         misbehavior when the bin doesn't hold any other nodes.
>>         Otherwise, the first node need not be a ReservationNode.
>>
>>     I'm not comfortable enough with the code to experiment, but
>>     reading it I keep wondering if inserting a node prior to
>>     computation would provide detection. The value would be null, to
>>     be filled in if the computation succeeds. If a reentrant
>>     computation occurs then when the existing node is found and we
>>     observe that the value is unset, then we have detected a cycle
>>     since we already hold the necessary lock. This would require
>>     skipping those nodes for methods that traverse the entire map,
>>     like iterators and size().
>
>     ...ReservationNode is such node. But is only inserted (with CAS)
>     as 1st node of the bucket when the bucket is still empty. This is
>     to provide the object to use as the lock when changing anything in
>     the bucket. It is replaced with normal Node when the function's
>     result is obtained. Logically it is skipped on look-ups and
>     traversals. If a node is inserted into the bucket that already
>     contains nodes, no ReservationNode is used.
>
>     And it's not only inserting (computeIfAbsent) that calls-back user
>     code. See also compute() method. This method overwrites the old
>     value with new value returned from function in existing node or
>     even removes a Node from the bucket (if function returns null).
>     How would you detect that such process is taking place when you
>     re-enter?
>
>     I thought about it for some time, but didn't find any existing
>     state combination that could be used for such checks. Perhaps a
>     ReservationNode could be inserted just after 1st node in bucket
>     temporarily just to mark the fact that we entered synchronization
>     block and later remove it before exiting the block. This would
>     most probably work, but would also mean that we produce one object
>     of garbage for each such call. Maybe this is better than 4 bytes
>     of state overhead for each node (see my experiment).
>
>>
>>     Internally, this would make computations feel more like the
>>     JCiP-style future based approach Viktor mentioned and allow for
>>     supporting batch computations. It has the nice property of not
>>     wasting more memory with values wrapped in futures, unless an
>>     asynchronous cache was explicitly requested. That can be built
>>     on-top of CHM, as was always possible, and an API option that I
>>     plan on providing.
>
>     Don't forget that CHM provides a lock-free lookup. We would not
>     like to loose this feature. Lazy evaluation (or waiting on
>     computation) can be added on top of CHM if one needs it.
>
>     Regards, Peter
>
>>
>>         The remaining mysterious cases are nested computeIfAbsent
>>         calls for different keys that happen to hash to the same bin.
>>         These conceptually ought to deadlock. Instead they will
>>         sometimes happen to "work" but other times loop.
>>
>>
>>     If changes like the above removed the live lock issue, then I'd
>>     be much happier with a resulting deadlock remain unsolved. That's
>>     much easier for developers to notice with jdk tooling when
>>     debugging server issues and recognize the problem as their fault.
>>
>>     On Sun, Dec 21, 2014 at 10:07 AM, Viktor Klang
>>     <viktor.klang at gmail.com <mailto:viktor.klang at gmail.com>> wrote:
>>
>>
>>
>>         On Sun, Dec 21, 2014 at 4:47 PM, Doug Lea <dl at cs.oswego.edu
>>         <mailto:dl at cs.oswego.edu>> wrote:
>>
>>             On 12/21/2014 09:59 AM, Viktor Klang wrote:
>>
>>                 For "computeIfAbsent"-style maps which are intended
>>                 for multithreaded use, I
>>                 tend to prefer to use ConcurrentMap[Key,
>>                 CompletionStage[Value]],
>>
>>
>>             Right. Prior to jdk8, we suggested variations of this and
>>             other
>>             techniques that remain preferable in many cases. However, the
>>             most common CHM user error was trying (but failing) to
>>             emulate
>>             what is now supported as computeIfAbsent. This is an
>>             improvement,
>>             but leads to new (much less common) kinds of potential
>>             errors.
>>             We cannot automatically eliminate or trap all of them. It
>>             might be possible to do more than we do now, either
>>             internally
>>             (inside CHM) or, more likely, with improved developer tools.
>>
>>
>>         Absolutely, and the fact that there was only blocking Futures
>>         in the JDK at that time.
>>         Having "monadic" Futures like CompletableFuture means you can
>>         avoid blocking completely (assuming that the CHM impl is
>>         non-blocking). This is highly useful when using it as a
>>         cache, and especially if there's a very skewed read
>>         distribution, paired with a multitude of readers.
>>
>>
>>
>>             -Doug
>>
>>             _______________________________________________
>>             Concurrency-interest mailing list
>>             Concurrency-interest at cs.oswego.edu
>>             <mailto:Concurrency-interest at cs.oswego.edu>
>>             http://cs.oswego.edu/mailman/listinfo/concurrency-interest
>>
>>
>>
>>
>>         -- 
>>         Cheers,
>>>>
>>         _______________________________________________
>>         Concurrency-interest mailing list
>>         Concurrency-interest at cs.oswego.edu
>>         <mailto:Concurrency-interest at cs.oswego.edu>
>>         http://cs.oswego.edu/mailman/listinfo/concurrency-interest
>>
>>
>>
>>
>>     _______________________________________________
>>     Concurrency-interest mailing list
>>     Concurrency-interest at cs.oswego.edu  <mailto:Concurrency-interest at cs.oswego.edu>
>>     http://cs.oswego.edu/mailman/listinfo/concurrency-interest
>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://cs.oswego.edu/pipermail/concurrency-interest/attachments/20141221/d0d28e51/attachment-0001.html>


More information about the Concurrency-interest mailing list