[concurrency-interest] Concurrent tries as a cache

Rémi Forax forax at univ-mlv.fr
Sun Dec 11 09:13:24 EST 2011


Hi guys,
I'm trying to implement a fast multi-dispatch on top of invokedynamic.

If the callsite is megamorphic, I would like to implement a table based 
solutions
like MRD [1], SRP [2] or TABA [3]. All these algorithms rely on the fact 
that it's possible
to associate an object or an int to a Class.

So I need to implement a kind concurrent map (Class -> int) but
I know that an entry will be never modified nor removed after being added
and I know that there will be a lot of get and very few updates.
In fact, the data structure acts more like a cache, i.e if I can't find
the value for a specific class, I have a way to find it based on the other
entries that are already in the map.

I've first written a class that use a lock for that, it works but as you 
can guess,
it's slow.
http://code.google.com/p/jsr292-cookbook/source/browse/trunk/multi-dispatch/src/jsr292/cookbook/mdispatch/SmallClassBitMap.java

After, I've tried to remove the lock and use an array of array. The main 
array is an AtomicReferenceArray,
and the secondary arrays are the ones that store collisions and that use 
copy-on-write semantics
when they need to store a new key/value entry.
But because the main array also need to grow, the reference to the main 
array is also
a volatile field. So accessing to a value requires two volatiles reads.
This implementation is faster than the data structure that uses a lock
but still slow.

So I try to use a tries instead of a hashmap.
The idea is to have a persistent trie, like the one used in Clojure [4] 
so all mutations
re-create a new trie that share parts with the ole one and to use a 
volatile field
(and a CAS) to store the root of the tries.
http://code.google.com/p/jsr292-cookbook/source/browse/trunk/multi-dispatch/src/jsr292/cookbook/mdispatch/util/ConcurrentTries.java

This code is faster, but I'm not sure it is correct and I wonder if
there is a way to remove the volatile field because the data structure 
acts as a cache
so if another thread don't see the modification, it's not a big deal 
because it can always
process the value by using the already existing entries.

Rémi

[1] http://www.ifs.uni-linz.ac.at/~ecoop/cd/papers/1628/16280304.pdf
[2] http://webdocs.cs.ualberta.ca/~paullu/Papers/COOTS2001-HTML/mdj.html
[3] http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.23.7337
[4] https://en.wikipedia.org/wiki/Hash_array_mapped_trie



More information about the Concurrency-interest mailing list