[concurrency-interest] ConcurrentHashMapV8

David M. Lloyd david.lloyd at redhat.com
Thu Oct 6 17:39:14 EDT 2011

On 10/06/2011 06:30 AM, Doug Lea wrote:
> On 10/05/11 18:31, David M. Lloyd wrote:
>>> In the case of ConcurrentHashMap, this means that the
>>> kind of Entry you get for the elements of entrySet.toArray
>>> should NOT write back to the map you got them from.
>>> In other words:
>>> Object[] a = myMap.entrySet().toArray();
>>> (Entry[])a[0].setValue(17);
>>> should not change myMap.get(keyFor0) to return 17.
>> I disagree with this conclusion; these scenarios aren't really the
>> same. The
>> general contract of toArray() is to return an array consisting of all
>> the values
>> in the collection - I think interpreting it such that it can return
>> copies of
>> the values is a stretch. I can't think of any case where we'd actually
>> copy the
>> values in normal collections;
> Never deep copies. But always shallow copies. For example
> Object[] a = myMap.values().toArray();
> a[0] = 17;
> also does not change myMap.get(keyFor0) to return 17 in
> any JDK Map implementation.

Yes, but this:

Object[] a = myMap.entrySet().toArray();
a[0] = "Foo";

doesn't change the mapping that happens to be at that element either, 
which is much more analogous to your example in my opinion.

> Besides not wanting to set up side-channels for modifications,
> the various toArray methods must have snapshot-like semantics
> because they cannot possibly track additions and removals
> occurring after (or concurrently with) the array creation.
> This is the basis for the the lifetime disclaimers in the
> Map.Entry specs.

> Several people seem to have the feeling that entrySet().toArray()
> should be somehow different about these policies because they know
> that HashMap in particular returns internal mutable Entry objects.
> Which is arguably also a bad idea. But most other Maps do not
> do this. They instead return distinct Entry objects
> that snapshot current mappings while usually still supporting
> setValue for one-by-one traversal in the same way that CHM does,
> invoking map.put. Which is a little bit crazy -- we add some
> time/space overhead to entrySet operations just to imperfectly
> reinforce the questionable intuition that methods operate
> directly on internal Entry objects that don't actually exist
> (and need not actually exist wrt any other aspects of Map specs).

But if the Entry actually *did* reflect the actual physical entry, then 
the contract for the Entry interface makes a lot more sense.  You can 
retrieve the key always, and the value if it hasn't been removed (if it 
has you could give IllegalStateException as is allowed by spec).  The 
setValue() method translates more or less directly to an atomic 
getAndSet() or get+CAS (this method also may throw ISE in the event of 
removal).  You could even use the Entry objects for a primitive sort of 
polling of map state, and it's easy to define their behavior for an 
arbitrary lifespan without any hinkiness (i.e. no need to tie the 
lifespan to the iterator from whence it came, which doesn't make sense 
to me anyway, least of all in a concurrent setting).  I think it's 
perfectly reasonable to provide an Entry which is more forgiving than 
the spec requires, especially if you can give it really clear semantics.

In one of my scratchpads I have a lockless* ConcurrentMap which does 
exactly this (have its entries implement Entry, that is) implemented in 
terms of an atomic ref array of (immutable) arrays of entries.  I hope 
to have time to run it through some more testing in the next couple 
weeks, to see how it compares with what's on the table.  So it's at 
least possible... we'll see about practical. :)

The other extreme is the copy-on-write type of map where an iterator 
would reflect a snapshot and Entries would be immutable.  But this is 
also easy to make well-defined for an arbitrary lifespan.

* well, okay, it does lock to block writers during resize, so it's not 
100% lockless.

More information about the Concurrency-interest mailing list