[concurrency-interest] Concurrent tries as a cache
Rémi Forax
forax at univ-mlv.fr
Sun Dec 11 09:13:24 EST 2011
Hi guys,
I'm trying to implement a fast multi-dispatch on top of invokedynamic.
If the callsite is megamorphic, I would like to implement a table based
solutions
like MRD [1], SRP [2] or TABA [3]. All these algorithms rely on the fact
that it's possible
to associate an object or an int to a Class.
So I need to implement a kind concurrent map (Class -> int) but
I know that an entry will be never modified nor removed after being added
and I know that there will be a lot of get and very few updates.
In fact, the data structure acts more like a cache, i.e if I can't find
the value for a specific class, I have a way to find it based on the other
entries that are already in the map.
I've first written a class that use a lock for that, it works but as you
can guess,
it's slow.
http://code.google.com/p/jsr292-cookbook/source/browse/trunk/multi-dispatch/src/jsr292/cookbook/mdispatch/SmallClassBitMap.java
After, I've tried to remove the lock and use an array of array. The main
array is an AtomicReferenceArray,
and the secondary arrays are the ones that store collisions and that use
copy-on-write semantics
when they need to store a new key/value entry.
But because the main array also need to grow, the reference to the main
array is also
a volatile field. So accessing to a value requires two volatiles reads.
This implementation is faster than the data structure that uses a lock
but still slow.
So I try to use a tries instead of a hashmap.
The idea is to have a persistent trie, like the one used in Clojure [4]
so all mutations
re-create a new trie that share parts with the ole one and to use a
volatile field
(and a CAS) to store the root of the tries.
http://code.google.com/p/jsr292-cookbook/source/browse/trunk/multi-dispatch/src/jsr292/cookbook/mdispatch/util/ConcurrentTries.java
This code is faster, but I'm not sure it is correct and I wonder if
there is a way to remove the volatile field because the data structure
acts as a cache
so if another thread don't see the modification, it's not a big deal
because it can always
process the value by using the already existing entries.
Rémi
[1] http://www.ifs.uni-linz.ac.at/~ecoop/cd/papers/1628/16280304.pdf
[2] http://webdocs.cs.ualberta.ca/~paullu/Papers/COOTS2001-HTML/mdj.html
[3] http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.23.7337
[4] https://en.wikipedia.org/wiki/Hash_array_mapped_trie
More information about the Concurrency-interest
mailing list