[concurrency-interest] Consistency guarantees for ConcurrentHashMap

Jorge Ortiz jorge.ortiz at gmail.com
Thu Jun 11 02:15:44 EDT 2009


Consider the following code:

    ConcurrentHashMap<String, Boolean> map = new ...;

    Thread a = new Thread {
        void run() {
            map.put("first", true);
            map.put("second", true);
        }
    };

    Thread b = new Thread {
        void run() {
            map.clear();
        }
    };

    a.start();
    b.start();
    a.join();
    b.join();

I would expect that one of the following scenarios to be true (for the
contents of the map) after this code runs:

    Map("first" -> true, "second" -> true)
    Map("second" -> true)
    Map()

However, upon inspection of ConcurrentHashMap, it seems to me that the
following scenario might also be true:

    Map("first" -> true) ???

This seems surprising because "first" gets put before "second", so if
"second" is cleared, then surely "first" should be cleared too.

Likewise, consider the following code:

    ConcurrentHashMap<String, Boolean> map = new ...;
    List<String> myKeys = new ...;

    Thread a = new Thread {
        void run() {
            map.put("first", true);

            // more stuff

            map.remove("first");
            map.put("second", true);
        }
    };

    Thread b = new Thread {
        void run() {
            Set<String> keys = map.keySet();
            for (String key : keys) {
                myKeys.add(key);
            }
        }
    };

    a.start();
    b.start();
    a.join();
    b.join();

I would expect one of the following scenarios to be true for the contents of
myKeys after this code has run:

    List()
    List("first")
    List("second")

However, upon closer inspection, it seems that the following scenario might
also be true:

    List("first", "second") ???

This is surprising because "second" is only ever put into the map after
"first" is removed. They should never be in the map simultaneously, but an
iterator might perceive them to be so.

Are these interpretations of ConcurrentHashMap's behavior correct?

I ask because I want to add concurrent collections to Scala's standard
library. Ideally, these could just be wrappers around concurrent Java
collections because 1) it saves me a lot of work and 2) it benefits from the
significant effort (and thought and real-world testing) that have gone into
j.u.concurrent.

However, central to Scala's collections are higher-order functions like
"map" and "filter" that operate on an entire collection and return a new
collection with elements derived from the elements in the old collections.
These methods necessarily operate on an entire collection and are frequently
implemented in terms of iterators. Without strongly consistent iterators or
some other way to provide a consistent "snapshot" of ConcurrentHashMap,
"map" and "filter" are... well, pretty useless.

If my interpretation of ConcurrentHashMap is correct, are there are other
concurrent collections that provide such a "snapshot" view?

Could ConcurrentHashMap be wrapped with a ReadWriteLock, where
single-element operations (get, put, etc) acquire the read lock and
whole-collections operations (map, filter, clear, etc) acquire the write
lock? How badly would this hurt performance in the no-writers case?

Thanks,

--j
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://cs.oswego.edu/pipermail/concurrency-interest/attachments/20090610/19911177/attachment.html>


More information about the Concurrency-interest mailing list