David M. Lloyd
david.lloyd at redhat.com
Fri Oct 7 10:32:42 EDT 2011
On 10/07/2011 05:41 AM, Doug Lea wrote:
> On 10/06/11 17:39, David M. Lloyd wrote:
>> -- we add some
>>> time/space overhead to entrySet operations just to imperfectly
>>> reinforce the questionable intuition that methods operate
>>> directly on internal Entry objects that don't actually exist
>>> (and need not actually exist wrt any other aspects of Map specs).
>> But if the Entry actually *did* reflect the actual physical entry,
>> then the
>> contract for the Entry interface makes a lot more sense. You can
>> retrieve the
>> key always, and the value if it hasn't been removed (if it has you
>> could give
>> IllegalStateException as is allowed by spec).
> We considered and rejected this approach back in JDK5 when deciding
> upon iterator semantics for concurrent collections. If you allow
> hasNext() to lie, claiming that an element exists but then throwing
> an exception in next() because it no longer exists, then most
> client iterator usages break. So instead we ensure that if
> hasNext saw a valid element, then it is snapshotted as the one
> returned in next(). Since these elements can concurrently come
> and go at any time, receiving this (weakly consistent) snapshot
> is as much as you can expect anyway. Even if you exported
> a "live" entry, it could disappear as soon as client reads it
> but before it acts upon it.
Yeah, but I think that's not outside of what should be reasonably
expected when you iterate a (non-COW) concurrent hash map. As you say,
hasNext()/next() can be made consistent by actually getting a snapshot
ref to the next entry in hasNext(), but it doesn't have to be a copy of
the Entry, it can still be the literal Entry in the map.
Using a "real" Entry, if the user calls getValue() on the entry but it
has since been deleted, you'd get an ISE regardless of whether it came
from an array, iterator, or reflection right into the guts of the map,
which is very consistent behavior.
If we wanted to really follow the precedents that exist in the
collections lib, we'd have a ConcurrentEntry which had more atomic
operations on it, like setValueIfAbsent(), replaceValue(), remove(),
getValueIfPresent(), etc. But I imagine that ship has sailed.
For the other set types, you'd probably wouldn't have to throw ISE
because the keys don't usually go anywhere and you can train the value
iterator to pass over missing entries and snapshot value refs, so even
if they're deleted in the meantime the behavior is as "nice" as can be
expected, if not more so.
> The underlying problem is that the non-atomic hasNext()/next()
> pairing in Iterator is inherently concurrency-hostile.
> But we didn't want to replace it with something else.
> However, with upcoming lambdas and bulk operations, I'm
> considering creating a variant of CHM that does not support
> iterators at all, but instead several variants of
> method forEach(entry -> action) etc.
Agreed that lambdas will greatly improve the situation, but I'm not sure
it warrants shooting down CHM iterators altogether.
More information about the Concurrency-interest