[concurrency-interest] ConcurrentHashMapV8

Doug Lea dl at cs.oswego.edu
Mon Dec 26 10:58:17 EST 2011


On 12/24/11 15:09, Dr Heinz M. Kabutz wrote:
>  >From my first tests on my little 8-core system, it seems that for small table
> sizes, CHM v8 is the clear winner.
>
> But if the table is a bit larger, say 1000000 entries, then Cliff Click's
> NonBlockingHashMap takes the lead.

This is consistent with what I see. We have 6 test machines, 2 each
Intel, AMD, and Sparc, ranging from 8 to 128 hardware threads, that
we regularly run most of the approx 100 performance checks in
our CVS src/test/loops. (Plus accounts on other machines out there
for less frequent tests.) See near the end of
http://gee.cs.oswego.edu/dl/concurrency-interest/index.html
for notes on getting and using these. Cliff Click's test (that was the basis
of yours) isn't in our CVS but we occasionally run it too.

Testing performance of hash tables requires care.
There are many reasons that a single test program can be
uninformative. For example, some tests just amount to
measuring a platform's memory system throughput under
constant cache misses. Some have key types,
locality/non-locality of key or value use, or operation mixes that
are not representative of most uses.  For NON-concurrent hash table
benchmarks, there seems to be enough rough consensus about
what makes for representativeness that we put together
MapMicroBenchmark (in test/loops) that we mostly believe.
For concurrent ones, there still seems to be too much diversity
to treat any particular test program as definitive.
In general though, I've found that CHMV8 does better than
NonBlockingHashMap except in two cases: (1) When the table
is large and there is little locality of key use -- here,
NonBlockingHashMap's more compact key array representation
pays off; and (2) when the table needs to expand rapidly due
to insertions by multiple threads -- here, NonBlockingHashMap's
help-out approach can work better than CHMV8's approach of
a single resizer while the old table is concurrently usable.
Neither of these scenarios seems common enough to outweigh
the CHMV8's advantages in other cases.


> Seems that for empty maps, the standard CHM with 4096 would need *16544* bytes
> and for the Version 8 of CHM would need *72* bytes.  It would be interesting to
> try out how the Version 8 CHM grows as we add more elements :-)

It averages approximately 8 * N words (where N is number of elements),
with a large variance because table size expands only at powers of two.

It is worth noting though that probably the most consistent
finding across all sorts of performance tests for all sorts of
hash tables (and across all sorts of languages)
is that providing a good size estimate in constructor arguments
is the single best thing users can do to improve performance.
We employ many clever techniques to cope when there are no
estimates, but none of them are as effective as good user guidance.

-Doug




More information about the Concurrency-interest mailing list