[concurrency-interest] ConcurrentHashMapV8

Ben Manes ben_manes at yahoo.com
Mon Aug 29 14:37:34 EDT 2011


> Thanks, but the plain version of CHM doesn't itself deal with Futures

MapMaker's entry objects are basically light-weight futures, and we plan on adding bulk computation eventually. Since the computation is performed while holding the node's lock, with the node inserted prior to the computation, I don't see why a bulk computation would be much different.
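To make the placeholder idea concrete, here is a rough sketch (in Java 8 syntax, which postdates this thread): missing keys get placeholder futures, one batch call populates them, and waiters block on the futures. The class and loader names are inventions for illustration, not MapMaker's or CHM's API; error handling is omitted.

```java
import java.util.*;
import java.util.concurrent.*;
import java.util.function.Function;

// Illustrative sketch of bulk computation via placeholder futures.
class BulkCache<K, V> {
    private final ConcurrentHashMap<K, CompletableFuture<V>> map = new ConcurrentHashMap<>();
    private final Function<Set<K>, Map<K, V>> batchLoader; // e.g. one I/O round trip

    BulkCache(Function<Set<K>, Map<K, V>> batchLoader) {
        this.batchLoader = batchLoader;
    }

    Map<K, V> getAll(Collection<K> keys) {
        // Insert placeholders; remember which ones this thread owns.
        Map<K, CompletableFuture<V>> owned = new HashMap<>();
        for (K k : keys) {
            CompletableFuture<V> placeholder = new CompletableFuture<>();
            CompletableFuture<V> prior = map.putIfAbsent(k, placeholder);
            if (prior == null)
                owned.put(k, placeholder);
        }
        if (!owned.isEmpty()) {
            // One batch call for all keys this thread is responsible for.
            Map<K, V> loaded = batchLoader.apply(owned.keySet());
            // Completing the placeholders unblocks any waiting threads.
            // (Completing exceptionally on loader failure is omitted for brevity.)
            owned.forEach((k, f) -> f.complete(loaded.get(k)));
        }
        Map<K, V> result = new HashMap<>();
        for (K k : keys)
            result.put(k, map.get(k).join()); // may wait on other threads' placeholders
        return result;
    }
}
```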

> If the user function in turn calls computeIfAbsent for the same key, then it seems irreparably broken in the same sense as, for example, a user's equals() function that attempts to invoke map.put.


It's broken, but failing fast helps identify the issue and is easier to recover from than a livelock or deadlock. Any computation function that updates the map itself is wrong, but it will occur in the wild and surprise the developer when it seems to break randomly (and silently).


________________________________
From: Doug Lea <dl at cs.oswego.edu>
To: concurrency-interest at cs.oswego.edu
Sent: Monday, August 29, 2011 10:37 AM
Subject: Re: [concurrency-interest] ConcurrentHashMapV8

Thanks for all the on- and off- list comments and suggestions!
I committed an update with various improvements.
Some follow-ups:

On 08/28/11 17:42, Joe Bowbeer wrote:
> Given the restrictions placed on computeIfAbsent's function (invoked
> atomically, short & simple, must not attempt to update any other mappings), I
> would like to see a more complete example.

Yes, the javadoc could use an example. Suggestions for a
tiny but informative one would be welcome.

On 08/28/11 17:00, Ben Manes wrote:
> If computation is being added, then bulk computation should be implemented as
> well. This is useful when a computation requires an I/O call, so a batch
> operation is an important optimization. The simplest approach is to insert
> placeholder nodes that are populated after the computation has completed.

Thanks, but the plain version of CHM doesn't itself deal with Futures (aka
placeholders). However, a cache version might do so, possibly based on
a CHM<K, Future<V>>.
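Such a CHM<K, Future<V>> cache might look roughly like the familiar memoizer pattern (a sketch of that layering, not part of CHM itself; the class name is illustrative):

```java
import java.util.concurrent.*;

// Sketch of a cache built on CHM<K, Future<V>>: the Future is the
// placeholder, so one thread computes while the others wait on it.
class FutureCache<K, V> {
    private final ConcurrentHashMap<K, Future<V>> map = new ConcurrentHashMap<>();

    V get(K key, Callable<V> loader) throws Exception {
        Future<V> f = map.get(key);
        if (f == null) {
            FutureTask<V> task = new FutureTask<>(loader);
            f = map.putIfAbsent(key, task);   // race: the first inserter wins
            if (f == null) {
                f = task;
                task.run();                   // compute outside any map lock
            }
        }
        try {
            return f.get();                   // waiting threads block here
        } catch (ExecutionException e) {
            map.remove(key, f);               // allow retry after a failure
            throw e;
        }
    }
}
```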

> It wasn't obvious to me how recursive computations are handled for the same
> key, and it appears to livelock. This is an error condition that used to
> deadlock MapMaker. I fixed this by failing fast if Thread.holdsLock() was
> true prior to waiting for the computation to finish.

I'm not sure I follow. If the user function in turn calls computeIfAbsent
for the same key, then it seems irreparably broken in the same sense as,
for example, a user's equals() function that attempts to invoke map.put.

>
> The exceptional computation case may be worth documenting. CHM only fails

It is -- an exception leaves the mapping untouched. The more arbitrary decision
is what to do if the computation returns null. ComputeIfAbsent could be defined
to throw NullPointerException, but this is a fairly heavy response to the
case where it turns out that the key has no mapping, since a null return
provides the same information.
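For reference, the null-return semantics described here are what CHM eventually shipped with in Java 8; a small demonstration (lambda syntax postdates this thread):

```java
import java.util.concurrent.ConcurrentHashMap;

public class NullComputeExample {
    public static void main(String[] args) {
        ConcurrentHashMap<String, String> map = new ConcurrentHashMap<>();
        // A null return means "no mapping for this key": nothing is stored,
        // and the caller sees null rather than a NullPointerException.
        String v = map.computeIfAbsent("missing", k -> null);
        System.out.println(v);                          // null
        System.out.println(map.containsKey("missing")); // false
    }
}
```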

> the computing thread and allows the waiting threads to retry the
> computation. In other implementations the waiting threads rethrow the
> exception and don't recompute.

Yes; I think this is the difference between a plain CHM and one that
uses Futures as values.

>
> A recompute (a.k.a. refresh) method may also be useful, though can be
> adequately supplied by usages. This recomputes the value, but allows readers
>  to continue to use the stale value until it completes. This can be useful
> for cache invalidation where it is preferred vs. exposing the latency to the
> user (if naively done by removing first).
>

Thanks! Yes, this (i.e., compute(k, f) vs computeIfAbsent(k, f))
is easy to add, provides a symmetrical API to the two forms
of put, and has some uses, so I added it.
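The symmetry, as the pair eventually landed in Java 8 (lambda syntax postdates this thread), is that computeIfAbsent leaves an existing mapping alone while compute always remaps:

```java
import java.util.concurrent.ConcurrentHashMap;

public class ComputeVsComputeIfAbsent {
    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
        map.put("k", 1);

        // computeIfAbsent: the existing mapping wins; the function is skipped.
        map.computeIfAbsent("k", k -> 99);
        System.out.println(map.get("k"));   // 1

        // compute: always remaps, so this can serve as a refresh; readers
        // continue to see the old value until the new one is installed.
        map.compute("k", (k, old) -> old + 1);
        System.out.println(map.get("k"));   // 2
    }
}
```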

On 08/28/11 22:52, Dane Foster wrote:
> Have you had an opportunity to benchmark this against Cliff Click's
> concurrent hash map?
>

Yes. Results vary across different kinds of tests, depending on operation
mixes, key types, temporal locality of lookups, etc., so I can't
give a better answer than: each is sometimes faster (and
sometimes by a lot) than the other. But the previous scalability
limits due to the use of Segments are now gone, so the performance differences
now mostly reflect other design tradeoffs.

-Doug





