[concurrency-interest] jsr166e.CHMv8.TreeBin bug ?

Alex Snaps alex.snaps at gmail.com
Sat Jun 1 10:31:28 EDT 2013

We've been starting using CHMv8 at Terracotta a little while back to power
some of Ehcache's data structure. Back in Dec. we found some entries "being
lost", but hunting the bug down, it turned out it got fixed at the very
same time. But we kept the test around (and more actually, since we were
actually not using the vanilla CHMv8).
Sadly a slight variant of it seems to still be in current code. It only
bubbled up now, after months on continuous testing in our CI environment.
The bug manifest when having lots of colliding in hash codes or rather even
identical ones.
During a debug session with my colleagues Chris & Louis, we believe we've
hunted it down to how the tree is walked. More specifically,
in jsr166e.ConcurrentHashMapV8.TreeBin.getTreeNode, this last inner bit is
probably the culprit:

                            TreeNode<V> r = null, pl, pr; // check both
                            if ((pr = p.right) != null && h >= pr.hash &&
                                (r = getTreeNode(h, k, pr)) != null)
                                return r;
                            else if ((pl = p.left) != null && h <= pl.hash)
                                dir = -1;
                            else // nothing there
                                return null;

On the second line above (line 834 in jsr166e.CHMv8) the second condition
doesn't seem right. If we understood it right, both left and right might
have identical hashes to p.hash, so that left might need to be checked too.
Feels like this condition would apply if duplicated hashes weren't an
option, but sadly they are.

Attached is the test I quickly ported from Ehcache's tests that exposes the
issue and the one we used to track it down on jsr166e code. It certainly
could be ported to only test TreeNode now that we believe we've narrowed it
down to being the culprit (and using less entries!).

Basically it shows how EvilKey ( "6137" ) can't be retrieved anymore. I
actually becomes "missing" when inserting the entry for i ==
8372 (ConcurrentHashMapV8$1) and the tree gets rebalanced. As far as we
could tell the rebalancing doesn't break any invariance... So
that jsr166e.ConcurrentHashMapV8.TreeBin.getTreeNode seems to be where the
problem lies. Actually putTreeNode probably shares the issue, even though
this test doesn't expose this (as p isn't the root of the tree in this
case). Looks like this "h >= pr.hash" condition should just not be there...
Hope this makes sense and helps.

Alex Snaps <alex.snaps at gmail.com>
Principal Software Engineer - Terracotta
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://cs.oswego.edu/pipermail/concurrency-interest/attachments/20130601/379e1c7a/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ConcurrentHashMapV8Test.java
Type: application/octet-stream
Size: 2434 bytes
Desc: not available
URL: <http://cs.oswego.edu/pipermail/concurrency-interest/attachments/20130601/379e1c7a/attachment.obj>

More information about the Concurrency-interest mailing list