[concurrency-interest] A concurrent, linked HashMap impl
jim.andreou at gmail.com
Thu May 28 19:40:46 EDT 2009
I'm not sure I'm reading this correctly, but...
2009/5/27 Ben Manes <ben_manes at yahoo.com>:
> Whether to use a hash table is, by itself, a trade-off in usage. For
> example, on the Ehcache forum a user reported a performance problem with a
> cache of 60M entries. With a 32-bit hashcode, there is a 99% probability of
> a collision at 200k entries. With CHM's maximum of 2^16 segments, the
> average length of the link chain is 915 entries.
If you have 2^16 segments, then indeed each segment averages to 915
entries. But a segment itself is a hash table, not a single link
chain. (Unless all 915 entries would conflict! Kind of unlikely...).
In any case, I fail to see how the number of segments affect the
number of collisions.
Do I miss something here?
> The performance problem
> was most likely due to traversing this list on every operation. While I
> would argue that 60M entries is an abusive use of the data structure, it
> shows that CHM is not always the appropriate data store. If instead the
> data was stored in a ConcurrentSkipListMap then the height (# of
> comparisons) is a mere 26. There are other ways to solve it, but it shows
> nicely how you must chose your data structures wisely.
Given the above, I think that a CSLM would actually be slower than
CHM, not faster.
More information about the Concurrency-interest