[concurrency-interest] ConcurrentHashMap-Very Strange behavior of ReentrantLock under contention.

oleksandr otenko oleksandr.otenko at oracle.com
Thu Apr 18 07:37:22 EDT 2013

How do you conclude "pretty fair"?

When we tested synchronization primitives, we tested threads pinned to 
hyper-threaded siblings on the same socket, to distinct physical cores on 
the same socket, and across sockets. Obviously, we got different 
throughput in each case because of the latency of the CAS.

At this scale, it is sufficient for lock acquisition to bias only towards 
the same socket. With a smaller think time, more threads on the same 
socket are ready to acquire the lock, and the lock "tick-tocks" between 
sockets less frequently - but it does move around. At the macro level it 
will look fair, but at the nano level it is biased.

As you increase the think time, you reduce the number of threads on the 
same socket that are ready to acquire the lock. At the same time, the 
other socket has a higher chance to steal the lock, as the threads there 
will have waited longer and accumulated more threads ready to acquire. So 
the lock "tick-tocks" between sockets more frequently. At the macro level 
fairness looks the same, but at the nano level more lock acquisitions 
cross the socket boundary, inflating the CAS time.
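
As a rough way to observe this tick-tock directly, one could count how often 
consecutive lock acquisitions come from different thread groups. The sketch 
below is a minimal, hypothetical harness (all names are mine, not from any 
existing benchmark): two thread groups stand in for two sockets, and actual 
socket placement is left to external pinning, e.g. taskset or numactl on Linux.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.concurrent.atomic.AtomicLong;

// Hedged sketch: count how often the lock "tick-tocks" between two groups
// of threads (a stand-in for two sockets; real placement would be done
// externally with taskset/numactl). All names are illustrative.
public class TickTockCounter {
    static final ReentrantLock lock = new ReentrantLock();
    static final AtomicLong crossGroupHandoffs = new AtomicLong();
    static volatile int lastGroup = -1;  // group of the previous lock holder
    static volatile long sink;

    static void think(int cycles) {      // crude busy-work "think time"
        long x = sink;
        for (int i = 0; i < cycles; i++) x += i;
        sink = x;
    }

    public static void main(String[] args) throws Exception {
        int threadsPerGroup = 2, ops = 50_000;
        int thinkCycles = args.length > 0 ? Integer.parseInt(args[0]) : 0;
        Thread[] ts = new Thread[2 * threadsPerGroup];
        for (int t = 0; t < ts.length; t++) {
            final int group = t % 2;     // alternate threads between "sockets"
            ts[t] = new Thread(() -> {
                for (int i = 0; i < ops; i++) {
                    lock.lock();
                    try {
                        // A handoff from the other group is a "tick-tock".
                        if (lastGroup != -1 && lastGroup != group)
                            crossGroupHandoffs.incrementAndGet();
                        lastGroup = group;
                    } finally {
                        lock.unlock();
                    }
                    think(thinkCycles);
                }
            });
        }
        for (Thread th : ts) th.start();
        for (Thread th : ts) th.join();
        long total = (long) ts.length * ops;
        System.out.println("cross-group handoffs: " + crossGroupHandoffs.get()
                + " of " + total + " acquisitions");
    }
}
```

Comparing the handoff count at different think times would show whether the 
lock migrates between groups more often as the think time grows.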

The cross-socket traffic is not only for the lock, but for the whole 
HashMap bucket, so it also inflates the time spent *in* put, making the 
put time even longer.

It seems very easy to test.
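
For instance, a minimal harness along these lines might time the lock 
acquisition itself while varying the think time. Socket pinning is outside 
the standard Java API, so one would run the same program once under a 
taskset placement confined to one socket and once spread across sockets, 
then compare. Everything below is an illustrative sketch, not the original 
benchmark:

```java
import java.util.concurrent.locks.ReentrantLock;

// Hedged sketch of the test proposed above: measure average lock-acquire
// latency as the think time grows. Names and parameters are illustrative.
public class AcquireLatency {
    static final ReentrantLock lock = new ReentrantLock();
    static volatile long sink;

    static void think(int cycles) {      // crude busy-work "think time"
        long x = sink;
        for (int i = 0; i < cycles; i++) x += i;
        sink = x;
    }

    // Returns the average nanoseconds spent acquiring the contended lock.
    static long run(int threads, int ops, int thinkCycles)
            throws InterruptedException {
        long[] nanos = new long[threads];
        Thread[] ts = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            ts[t] = new Thread(() -> {
                long total = 0;
                for (int i = 0; i < ops; i++) {
                    long t0 = System.nanoTime();
                    lock.lock();
                    total += System.nanoTime() - t0;  // time spent acquiring
                    lock.unlock();
                    think(thinkCycles);
                }
                nanos[id] = total;
            });
        }
        for (Thread th : ts) th.start();
        for (Thread th : ts) th.join();
        long sum = 0;
        for (long n : nanos) sum += n;
        return sum / ((long) threads * ops);
    }

    public static void main(String[] args) throws Exception {
        for (int think : new int[]{0, 1_000, 10_000})
            System.out.println("think=" + think
                    + " avg acquire ns=" + run(4, 20_000, think));
    }
}
```

Run e.g. under `taskset -c 0-7` (one socket) and `taskset -c 0-3,24-27` 
(split across sockets; core numbering varies by machine) to compare placements.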


On 18/04/2013 12:25, Mudit Verma wrote:
> I'd say the threads are doing pretty fairly across the sockets. I have 
> done this testing with pure ReentrantLocking, where the increase in 
> per-operation time with increasing ThinkTime is worst affected.
> Attached are the threads' completion times with ThinkTime 0 and ThinkTime 
> 10000 (where the per-op time is highest).
> On Thu, Apr 18, 2013 at 11:43 AM, oleksandr otenko 
> <oleksandr.otenko at oracle.com> wrote:
>     How is fairness doing?
>     If the threads spin faster, threads on the same socket may have an
>     advantage over the other threads:
>     - lock release invalidates lock state in all caches.
>     - lock acquire will work faster from the same socket, because some
>     caches are shared.
>     As you increase the think time, you increase the odds that a lock
>     acquire crosses the socket boundary.
>     To test this theory, repeat the test with threads pinned to one
>     socket.
>     Then the same number of threads, but pinned across the sockets.
>     Alex
>     On 17/04/2013 14:31, Mudit Verma wrote:
>>     Hi All,
>>     I recently performed a scalability test (very aggressive and maybe
>>     not practical, anyhow) on the put operation of ConcurrentHashMap.
>>     Test: each thread tries to put (same key, random value) into the
>>     HashMap in a tight loop. Therefore, all the threads hit the same
>>     location in the hashMap and cause contention.
>>     What is more surprising is that when each thread does one put after
>>     another, the average time taken by one put operation is less than
>>     when a thread does some other work between two put operations.
>>     We continue to see per-operation time increase as we increase the
>>     work done in between. This is very counter-intuitive. Only after
>>     about 10,000 - 20,000 cycles of work in between does the per-op
>>     time come down.
>>     When I read the code, I found that the put op first tries to use
>>     CAS to acquire the lock (64 times on a multicore); only if it could
>>     not acquire the lock on the segment through CASing does it go for
>>     ReentrantLocking (which suspends threads ...).
>>     We also tweaked the number of CAS attempts (from 0 to 10,000) made
>>     before actually going for ReentrantLocking. Attached is the graph.
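
The spin-then-block pattern described here (modeled loosely on the Java 7 
ConcurrentHashMap.Segment code, where the retry bound is 64 on multiprocessor 
machines) can be sketched as follows; the class and method names below are 
illustrative, not the actual JDK internals:

```java
import java.util.concurrent.locks.ReentrantLock;

// Hedged sketch of the spin-then-block acquisition pattern: spin on
// tryLock (a CAS attempt) a bounded number of times, then fall back to
// a blocking lock() that can suspend the thread.
public class SpinThenLock {
    // 64 retries on multiprocessors, 1 otherwise, echoing the bound
    // the quoted text describes.
    static final int MAX_SPIN_TRIES =
        Runtime.getRuntime().availableProcessors() > 1 ? 64 : 1;

    final ReentrantLock lock = new ReentrantLock();

    void withLock(Runnable critical) {
        int retries = 0;
        while (!lock.tryLock()) {          // CAS-based, non-blocking attempt
            if (++retries >= MAX_SPIN_TRIES) {
                lock.lock();               // give up spinning; block instead
                break;
            }
        }
        try {
            critical.run();                // lock is held on every path here
        } finally {
            lock.unlock();
        }
    }
}
```

Under contention plus think time, the spinning phase fails more often, so more 
acquisitions take the blocking path - consistent with the spike reported below.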
>>     One interesting thing to note: as we increase the work between two
>>     ops, locking with 0 CAS attempts (pure ReentrantLocking) seems to
>>     be worst affected by the spike. Therefore, I assume that the spike
>>     comes from ReentrantLocking even when there is a mixture of the two
>>     (CAS+Lock).
>>     Code skeleton, for each thread:
>>     while (true) {
>>         hashMap.put(K, randomValue);  // K is the same for each thread
>>         thinkTime();   // ranging from 0 cycles to 1 million cycles
>>     }
>>     Machine: 48-core NUMA
>>     Threads used: 32 (each one pinned to a core).
>>     #ops: 51200000 in total (each thread with 160000 ops)
>>     That's like saying: if I do something else in between my operations
>>     (up to a limit), contention will increase. Very strange.
>>     Does anyone know why this is happening?
>>     _______________________________________________
>>     Concurrency-interest mailing list
>>     Concurrency-interest at cs.oswego.edu
>>     http://cs.oswego.edu/mailman/listinfo/concurrency-interest

