[concurrency-interest] Questions on ConcurrentHashMap

Joseph Seigh jseigh_cp00 at xemaps.com
Thu Nov 29 06:11:50 EST 2007

David Holmes wrote:
> Joe,
>> Joe Bowbeer wrote:
>> Great! I go check the docs to see what the semantics of map.clear()
>> actually are in this case.  It doesn't say.  So how does this work?
>> Semantics are implementation dependent.  Is the implementation
>> fixed for all time or can it change?
> The docs are a little brief here :) I think all it can say is that after
> clear() returns, every mapping that was present when clear() "commenced",
> will have been removed. New mappings concurrent with clear() may or may not
> have been removed. But even this is a loose description as "commenced" is
> really determined by the locking strategy.
> The implementation could certainly change in theory. Eg we might lock all
> segments then clear all, rather than locking, clearing, unlocking, one at a
> time. In practice it is unlikely to change.
> The atomicity (or not) of CHM methods are often subject to debate.

I found the actual documentation later.  It's up in the into section, 
not down in the method
documentation which is apparently an artifact of how javadoc works.  You 
implement a
interface methods and you have to have something there, right?  So just 
copy it from the
interface.  Probably 90% of the collections documentation is just 
repeated boilerplate.

Sidestepping the issue of almost nobody knowing how to specify semantics 
for concurrency,
one of the few things that is understood is atomicity.   So 
hypothetically you could say
something like all mutating operations are atomic w.r.t. other mutating 
operations, even
the aggregate ones.  And all non aggregate read operations are atomic 
w.r.t. all non aggregate
operations, both read and mutating.  Something like that.

This is where transactional memory might help.  It may not scale but it 
will let you implement
collections where all operations are atomic w.r.t. each other.  That 
simplifies semantics somewhat.
Except for the fact that the java collections api is bit over elaborate 
to put it mildly.  With too
many methods you put users a lot nearer to something that's the 
equivalent of distributed
programming with volatile ints.  It can be done but how many average 
programmers can do it?
For a concurrent queue you should just have push() and pop() and that's 
it.  Even there some
programmers can screw it up since it has state.

You can undo this by specifying your own simplified collection api's and 
"extending" existing
java collections with the simplified api.  I've done this for comparison 
testing where I didn't
want to implement set operations for a queue implementation since the 
set operations would
never be used and it would have been pointlessly stupid to implement them. 

Joe Seigh

More information about the Concurrency-interest mailing list