[concurrency-interest] ConcurrentHashMap computeIfAbsent

Benjamin Manes ben.manes at gmail.com
Sun Dec 21 18:02:37 EST 2014


>
> A "marker" node inserted just after 1st node (or before 1st node?) in
> bucket and later removed. This would mean three additional volatile writes
> though (next field is volatile). Not to mention what to do when nodes are
> forming a tree (yes, that's possible).


Yes, but since writes are done in a synchronized block, the extra writes
could piggyback their visibility on exiting that block.

But currently the skipping detection is designed by checking the hash field
> which is final.


Yes, and so this would incur an additional volatile read during writes only
if the hash and key checks passed, and a read per value node on full
traversal operations. The common case of lock free per-entry reads would be
unaffected.

On Sun, Dec 21, 2014 at 2:48 PM, Peter Levart <peter.levart at gmail.com>
wrote:

>
> On 12/21/2014 10:20 PM, Benjamin Manes wrote:
>
> Thanks Peter,
>
>  I don't think compute() changes my suggestion. If the bucket count is
> larger than one, then I was suggesting inserting a regular Node with a null
> value that is updated when the mapping function completes successfully.
>
>
> You mean "a Node that is *removed*" when computation ends. Yes, as I said,
> that is a possibility. A "marker" node inserted just after 1st node (or
> before 1st node?) in bucket and later removed. This would mean three
> additional volatile writes though (next field is volatile). Not to mention
> what to do when nodes are forming a tree (yes, that's possible).
>
>  The node is scavenged when the computation fails, but since we know the
> instance that extra scan should be cheap. By updating a regular node I had
> hoped one could avoid garbage in the common case.
>
>
> If common case was inserting, then perhaps, yes. But currently the
> skipping detection is designed by checking the hash field which is final.
>
>
>
>  Experimenting to determine the feasibility of this type of change is too
> invasive for me to feel comfortable trying given the interactions with all
> the other map operations that I may not fully consider at the moment. Since
> ReservationNode is a special type, it seems likely that there are scenarios
> that I'm not considering.
>
>
> A change using "marker" nodes would be very invasive. Each operation
> itself traverses the node chain or tree and either inserts, modifies or
> removes a node (possibly the 1st node of the bucket). Adding insertion of
> "marker" node and later removal would complicate the
> traversal/insertion/removal logic of each individual operation with user
> call-backs in it's own unique way. I don't know if such change is warranted
> just to get better diagnostics.
>
> Regards, Peter
>
>
> On Sun, Dec 21, 2014 at 12:53 PM, Peter Levart <peter.levart at gmail.com>
> wrote:
>
>>  Hi Benjamin,
>>
>> Doug is certainly the one to give you the best explanation, but let me
>> try...
>>
>> On 12/21/2014 08:36 PM, Benjamin Manes wrote:
>>
>>  This is worth considering, but it only detects one form of
>>> misbehavior when the bin doesn't hold any other nodes.
>>> Otherwise, the first node need not be a ReservationNode.
>>
>>
>> I'm not comfortable enough with the code to experiment, but reading it I
>> keep wondering if inserting a node prior to computation would provide
>> detection. The value would be null, to be filled in if the computation
>> succeeds. If a reentrant computation occurs then when the existing node is
>> found and we observe that the value is unset, then we have detected a cycle
>> since we already hold the necessary lock. This would require skipping those
>> nodes for methods that traverse the entire map, like iterators and size().
>>
>>
>> ...ReservationNode is such node. But is only inserted (with CAS) as 1st
>> node of the bucket when the bucket is still empty. This is to provide the
>> object to use as the lock when changing anything in the bucket. It is
>> replaced with normal Node when the function's result is obtained. Logically
>> it is skipped on look-ups and traversals. If a node is inserted into the
>> bucket that already contains nodes, no ReservationNode is used.
>>
>> And it's not only inserting (computeIfAbsent) that calls-back user code.
>> See also compute() method. This method overwrites the old value with new
>> value returned from function in existing node or even removes a Node from
>> the bucket (if function returns null). How would you detect that such
>> process is taking place when you re-enter?
>>
>> I thought about it for some time, but didn't find any existing state
>> combination that could be used for such checks. Perhaps a ReservationNode
>> could be inserted just after 1st node in bucket temporarily just to mark
>> the fact that we entered synchronization block and later remove it before
>> exiting the block. This would most probably work, but would also mean that
>> we produce one object of garbage for each such call. Maybe this is better
>> than 4 bytes of state overhead for each node (see my experiment).
>>
>>
>>  Internally, this would make computations feel more like the JCiP-style
>> future based approach Viktor mentioned and allow for supporting batch
>> computations. It has the nice property of not wasting more memory with
>> values wrapped in futures, unless an asynchronous cache was explicitly
>> requested. That can be built on-top of CHM, as was always possible, and an
>> API option that I plan on providing.
>>
>>
>> Don't forget that CHM provides a lock-free lookup. We would not like to
>> loose this feature. Lazy evaluation (or waiting on computation) can be
>> added on top of CHM if one needs it.
>>
>> Regards, Peter
>>
>>
>>  The remaining mysterious cases are nested computeIfAbsent
>>> calls for different keys that happen to hash to the same bin.
>>> These conceptually ought to deadlock. Instead they will
>>> sometimes happen to "work" but other times loop.
>>
>>
>>  If changes like the above removed the live lock issue, then I'd be much
>> happier with a resulting deadlock remain unsolved. That's much easier for
>> developers to notice with jdk tooling when debugging server issues and
>> recognize the problem as their fault.
>>
>>  On Sun, Dec 21, 2014 at 10:07 AM, Viktor Klang <viktor.klang at gmail.com>
>> wrote:
>>
>>>
>>>
>>> On Sun, Dec 21, 2014 at 4:47 PM, Doug Lea <dl at cs.oswego.edu> wrote:
>>>
>>>> On 12/21/2014 09:59 AM, Viktor Klang wrote:
>>>>
>>>>> For "computeIfAbsent"-style maps which are intended for multithreaded
>>>>> use, I
>>>>> tend to prefer to use ConcurrentMap[Key, CompletionStage[Value]],
>>>>>
>>>>
>>>>  Right. Prior to jdk8, we suggested variations of this and other
>>>> techniques that remain preferable in many cases. However, the
>>>> most common CHM user error was trying (but failing) to emulate
>>>> what is now supported as computeIfAbsent. This is an improvement,
>>>> but leads to new (much less common) kinds of potential errors.
>>>> We cannot automatically eliminate or trap all of them. It
>>>> might be possible to do more than we do now, either internally
>>>> (inside CHM) or, more likely, with improved developer tools.
>>>
>>>
>>>  Absolutely, and the fact that there was only blocking Futures in the
>>> JDK at that time.
>>> Having "monadic" Futures like CompletableFuture means you can avoid
>>> blocking completely (assuming that the CHM impl is non-blocking). This is
>>> highly useful when using it as a cache, and especially if there's a very
>>> skewed read distribution, paired with a multitude of readers.
>>>
>>>
>>>>
>>>>
>>>> -Doug
>>>>
>>>> _______________________________________________
>>>> Concurrency-interest mailing list
>>>> Concurrency-interest at cs.oswego.edu
>>>> http://cs.oswego.edu/mailman/listinfo/concurrency-interest
>>>>
>>>
>>>
>>>
>>>  --
>>>   Cheers,
>>>>>>
>>> _______________________________________________
>>> Concurrency-interest mailing list
>>> Concurrency-interest at cs.oswego.edu
>>> http://cs.oswego.edu/mailman/listinfo/concurrency-interest
>>>
>>>
>>
>>
>> _______________________________________________
>> Concurrency-interest mailing listConcurrency-interest at cs.oswego.eduhttp://cs.oswego.edu/mailman/listinfo/concurrency-interest
>>
>>
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://cs.oswego.edu/pipermail/concurrency-interest/attachments/20141221/c5ec7e51/attachment-0001.html>


More information about the Concurrency-interest mailing list