[concurrency-interest] Concurrent tries as a cache

Rémi Forax forax at univ-mlv.fr
Sun Dec 11 13:41:37 EST 2011


On 12/11/2011 04:20 PM, Michael Barker wrote:
> Hi Remi,
>
> It would be worthwhile benchmarking your implementation against
> ConcurrentHashMap.  I did something similar a while back, creating a
> ConcurrentMap using a Bagwell-style trie.  The quality of my
> implementation notwithstanding, I was unable to bring the performance
> up above ConcurrentHashMap.  Profiling was inconclusive, but suggested
> that a combination of memory throughput demands and garbage generated
> by the array creation and copying when nodes were added was the main
> performance cost.  It outweighed the costs of the volatile/CAS
> operations.
>
> Details are here:
> http://mikes-tech.blogspot.com/2011/02/non-blocking-concurrenthashmaptrie.html
>
> Michael Barker.

Hi Michael,
I don't think my implementation is better/worse than ConcurrentHashMap,
because it's not a replacement for ConcurrentHashMap :)

I need a data structure that acts as a cache with no eviction (ever!)
and where calls to lookup/get outnumber calls to add/put by a wide
margin (say 10000/1).
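
To make the access pattern concrete, here is a small usage sketch (the
names ClassIntCache, lookup/add and the slow-path function are made up
for illustration, they are not the cookbook API):

import java.util.function.ToIntFunction;

// Hypothetical sketch of the access pattern: reads dominate by ~10000/1,
// entries are write-once and never evicted, and a miss can always be
// recomputed from the entries that are already present.
interface ClassIntCache {
  int lookup(Class<?> key);            // hot path, called on every dispatch, -1 if absent
  void add(Class<?> key, int value);   // rare, write-once
}

final class Dispatcher {
  private final ClassIntCache cache;
  private final ToIntFunction<Class<?>> slowPath;  // derives a value from existing entries

  Dispatcher(ClassIntCache cache, ToIntFunction<Class<?>> slowPath) {
    this.cache = cache;
    this.slowPath = slowPath;
  }

  int dispatchIndex(Class<?> receiverClass) {
    int index = cache.lookup(receiverClass);       // almost always a hit
    if (index < 0) {                               // rare miss
      index = slowPath.applyAsInt(receiverClass);  // recompute from existing entries
      cache.add(receiverClass, index);             // publish once, never modified again
    }
    return index;
  }
}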

The way I test my implementations is to replace every method call of
the DaCapo benchmark with an invokedynamic that does the multi-dispatch
(in fact some calls aren't replaced yet because the algorithm isn't
finished). So I know whether one data structure is better than another
for my use case, not in general.

Now, reading your blog confirms my intuition: if there are a lot of
reads and few writes, a concurrent trie is perhaps better.
Also, about the possible virtual calls due to the different kinds of
Node: there is no ListNode in my implementation, so there are only two
implementations of Node. That fits in the bimorphic inline cache used
by the VM, so the calls are inlined :)
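
To illustrate, a minimal sketch (not the actual ConcurrentTries code) of
a node hierarchy with exactly two concrete implementations; every
node.lookup(...) call site then sees at most two receiver classes, which
is what lets the VM's bimorphic inline cache devirtualize and inline
both targets:

abstract class Node {
  abstract int lookup(Class<?> key, int hash, int shift);   // -1 if absent
}

final class EntryNode extends Node {                  // leaf: one key/value pair
  final Class<?> key;
  final int value;
  EntryNode(Class<?> key, int value) { this.key = key; this.value = value; }
  @Override int lookup(Class<?> key, int hash, int shift) {
    return this.key == key ? value : -1;
  }
}

final class BranchNode extends Node {                 // inner node: dispatches on hash bits
  final Node[] children;                              // length is a power of two
  BranchNode(Node[] children) { this.children = children; }
  @Override int lookup(Class<?> key, int hash, int shift) {
    Node child = children[(hash >>> shift) & (children.length - 1)];
    return child == null ? -1 : child.lookup(key, hash, shift + 4);   // bimorphic call
  }
}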

Rémi

>
> On Sun, Dec 11, 2011 at 2:13 PM, Rémi Forax <forax at univ-mlv.fr> wrote:
>> Hi guys,
>> I'm trying to implement a fast multi-dispatch on top of invokedynamic.
>>
>> If the callsite is megamorphic, I would like to implement a table-based
>> solution like MRD [1], SRP [2] or TABA [3]. All these algorithms rely on
>> the fact that it's possible to associate an object or an int with a Class.
>>
>> So I need to implement a kind of concurrent map (Class -> int), but
>> I know that an entry will never be modified nor removed after being added,
>> and I know that there will be a lot of gets and very few updates.
>> In fact, the data structure acts more like a cache, i.e. if I can't find
>> the value for a specific class, I have a way to compute it based on the
>> other entries that are already in the map.
>>
>> I first wrote a class that uses a lock for that; it works but, as you
>> can guess, it's slow.
>> http://code.google.com/p/jsr292-cookbook/source/browse/trunk/multi-dispatch/src/jsr292/cookbook/mdispatch/SmallClassBitMap.java
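
In code, that lock-based version boils down to roughly this (a minimal
sketch, not the actual SmallClassBitMap linked above):

import java.util.HashMap;
import java.util.Map;

// Minimal lock-based sketch, not the actual SmallClassBitMap code: every
// lookup contends on the same monitor, which hurts a read-dominated workload.
final class LockedClassIntMap {
  private final Map<Class<?>, Integer> map = new HashMap<>();

  synchronized int get(Class<?> key) {
    Integer value = map.get(key);
    return value == null ? -1 : value;
  }

  synchronized void put(Class<?> key, int value) {
    map.putIfAbsent(key, value);   // entries are write-once, never updated
  }
}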
>>
>> After that, I tried to remove the lock and use an array of arrays. The
>> main array is an AtomicReferenceArray, and the secondary arrays are the
>> ones that store collisions; they use copy-on-write semantics when they
>> need to store a new key/value entry.
>> But because the main array also needs to grow, the reference to the main
>> array is itself a volatile field. So accessing a value requires two
>> volatile reads. This implementation is faster than the data structure
>> that uses a lock, but still slow.
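
A minimal sketch of that layout (not the cookbook code); growing and
rehashing the main table is omitted, and the two volatile reads on the
read path are marked:

import java.util.Arrays;
import java.util.concurrent.atomic.AtomicReferenceArray;

// Sketch of the lock-free array-of-arrays layout: the main table is an
// AtomicReferenceArray held in a volatile field (so it can be swapped when it
// grows), and each bucket is an immutable Object[] of {key, value, ...} pairs
// that is replaced wholesale (copy-on-write) on insert.
final class CopyOnWriteBucketsMap {
  private volatile AtomicReferenceArray<Object[]> main = new AtomicReferenceArray<>(64);

  int get(Class<?> key) {
    AtomicReferenceArray<Object[]> table = main;                        // volatile read #1
    Object[] bucket = table.get(key.hashCode() & (table.length() - 1)); // volatile read #2
    if (bucket != null) {
      for (int i = 0; i < bucket.length; i += 2) {
        if (bucket[i] == key) {
          return (Integer) bucket[i + 1];
        }
      }
    }
    return -1;                                                          // miss: caller recomputes
  }

  void put(Class<?> key, int value) {
    AtomicReferenceArray<Object[]> table = main;
    int index = key.hashCode() & (table.length() - 1);
    for (;;) {
      Object[] bucket = table.get(index);
      Object[] copy;
      if (bucket == null) {
        copy = new Object[] { key, value };
      } else {
        copy = Arrays.copyOf(bucket, bucket.length + 2);                // copy-on-write
        copy[bucket.length] = key;
        copy[bucket.length + 1] = value;
      }
      if (table.compareAndSet(index, bucket, copy)) {
        return;
      }
      // lost a race with another writer: reread the bucket and retry
    }
    // growing the main table (rehash into a bigger AtomicReferenceArray, then
    // write the volatile 'main' field) is omitted from this sketch
  }
}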
>>
>> So I tried to use a trie instead of a hash map.
>> The idea is to have a persistent trie, like the one used in Clojure [4],
>> so every mutation re-creates a new trie that shares parts with the old
>> one, and to use a volatile field (and a CAS) to store the root of the trie.
>> http://code.google.com/p/jsr292-cookbook/source/browse/trunk/multi-dispatch/src/jsr292/cookbook/mdispatch/util/ConcurrentTries.java
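
The publication scheme is essentially the following sketch (not the
actual ConcurrentTries code; the PersistentTrie interface here stands in
for the immutable trie itself): the only mutable state is the volatile
root reference, swapped with a CAS so readers always see a consistent
snapshot:

import java.util.concurrent.atomic.AtomicReference;

// Sketch of "CAS on the root of a persistent trie".  insert() returns a NEW
// trie that shares untouched subtrees with the old one; readers do a single
// volatile read of the root and then only plain reads of immutable nodes.
final class ConcurrentTrieRoot {
  interface PersistentTrie {                         // stands in for the immutable trie
    int lookup(Class<?> key);                        // -1 if absent
    PersistentTrie insert(Class<?> key, int value);  // returns a new trie, old one untouched
  }

  private final AtomicReference<PersistentTrie> root;

  ConcurrentTrieRoot(PersistentTrie empty) { this.root = new AtomicReference<>(empty); }

  int lookup(Class<?> key) {
    return root.get().lookup(key);                   // one volatile read, then plain reads
  }

  void add(Class<?> key, int value) {
    for (;;) {
      PersistentTrie oldTrie = root.get();
      PersistentTrie newTrie = oldTrie.insert(key, value);   // path copy, structure sharing
      if (root.compareAndSet(oldTrie, newTrie)) {
        return;                                      // publish the new root atomically
      }
      // another thread published first; retry on top of the fresh root so its
      // entry is not lost (for a pure cache, giving up here would also be fine)
    }
  }
}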
>>
>> This code is faster, but I'm not sure it is correct, and I wonder if
>> there is a way to remove the volatile field: because the data structure
>> acts as a cache, if another thread doesn't see a modification it's not a
>> big deal, since it can always recompute the value by using the already
>> existing entries.
>>
>> Rémi
>>
>> [1] http://www.ifs.uni-linz.ac.at/~ecoop/cd/papers/1628/16280304.pdf
>> [2] http://webdocs.cs.ualberta.ca/~paullu/Papers/COOTS2001-HTML/mdj.html
>> [3] http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.23.7337
>> [4] https://en.wikipedia.org/wiki/Hash_array_mapped_trie
>>
>> _______________________________________________
>> Concurrency-interest mailing list
>> Concurrency-interest at cs.oswego.edu
>> http://cs.oswego.edu/mailman/listinfo/concurrency-interest


