[concurrency-interest] ConcurrentHashMap - Very strange behavior of ReentrantLock under contention

Mudit Verma mudit.f2004912 at gmail.com
Thu Apr 18 08:58:41 EDT 2013


I see it now. I hadn't checked that; I will check.

By the way, one interesting thing: the same behavior was seen with POSIX
locks in C by a different person while working on another problem (those
results are also published). There too, per-operation time somehow
increases as the ThinkTime increases.



On Thu, Apr 18, 2013 at 2:43 PM, Mudit Verma <mudit.f2004912 at gmail.com> wrote:

> Thread.park?  What is that? I'm not aware of it.
>
>
> On Thu, Apr 18, 2013 at 1:59 PM, oleksandr otenko <
> oleksandr.otenko at oracle.com> wrote:
>
>>  That is a good point. However, what is that "Thread.park" in this test?
>>
>> If you do not have contention for CPU, it is a condvar wait with similar
>> properties - who do you schedule to run next? Of course, if you have
>> threads waiting to run on the same socket, bias towards those.
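>>
>> For reference, the "park" in question is
>> java.util.concurrent.locks.LockSupport.park(), which is how ReentrantLock
>> (through AbstractQueuedSynchronizer) suspends a thread that failed to
>> acquire the lock. A minimal illustration, separate from this test, of the
>> frame you would see in a thread dump:
>>
>>     import java.util.concurrent.locks.LockSupport;
>>
>>     public class ParkDemo {
>>         public static void main(String[] args) throws InterruptedException {
>>             // A parked thread shows a LockSupport.park / Unsafe.park frame -
>>             // the same frame a ReentrantLock waiter shows while queued.
>>             Thread t = new Thread(LockSupport::park, "parked-waiter");
>>             t.start();
>>             Thread.sleep(100);        // give it time to park
>>             LockSupport.unpark(t);    // wake it up again
>>             t.join();
>>         }
>>     }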
>>
>> Alex
>>
>>  On 18/04/2013 12:54, Mudit Verma wrote:
>>
>> I agree with what you say, but I'm not talking about lock acquisition
>> through CAS.
>>
>> I mentioned in the previous mail that the issue is seen most when we don't
>> use CASing at all and just rely on ReentrantLock, which suspends the threads
>> and grants the lock in the order the requests arrived. AFAIK, from reading
>> the internal code, the ReentrantLock maintains an internal queue of lock
>> requests from the different threads, much like a POSIX lock.
>>
>>
>> However, you are perfectly right with your theory: if we just use CASing
>> to acquire the lock, the results are not fair at all.  Attached are the
>> results.
>>
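>> (A side note on the ordering above: strict FIFO hand-off is only guaranteed
>> when a ReentrantLock is constructed in fair mode; the default - and the mode
>> ConcurrentHashMap's Segment uses - is non-fair, where an arriving thread may
>> still barge ahead of queued waiters. A minimal sketch of the two modes:)
>>
>>     import java.util.concurrent.locks.ReentrantLock;
>>
>>     class LockModes {
>>         // Non-fair (the default): an arriving thread may grab the lock
>>         // ahead of threads already queued and parked behind it.
>>         final ReentrantLock nonFair = new ReentrantLock();
>>
>>         // Fair: the longest-waiting queued thread is granted the lock next.
>>         final ReentrantLock fair = new ReentrantLock(true);
>>     }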
>>
>>  On Thu, Apr 18, 2013 at 1:37 PM, oleksandr otenko <
>> oleksandr.otenko at oracle.com> wrote:
>>
>>>  How do you conclude "pretty fair"?
>>>
>>> When we tested synchronization primitives, we tested threads pinned to
>>> hyper-threaded siblings on the same socket, to distinct physical cores on
>>> the same socket, and across sockets. Obviously, we get different throughput
>>> because of the latency of the CAS.
>>>
>>> At this scale, it is sufficient for lock acquisition to be biased only
>>> towards the same socket. If you have a smaller think time, more threads on
>>> the same socket are ready to acquire the lock, and the lock "tick-tocks"
>>> between sockets less frequently - but it does move around. At the macro
>>> level it will look fair, but at the nano level it is biased.
>>>
>>> As you increase the think time, you reduce the number of threads on the
>>> same socket that are ready to acquire the lock. At the same time, the other
>>> socket has a higher chance of stealing the lock, as the threads there will
>>> have waited longer and accumulated more threads ready to acquire. So the
>>> lock "tick-tocks" between sockets more frequently. At the macro level
>>> fairness looks the same, but at the nano level more lock acquisitions cross
>>> the socket boundary, and that inflates the CAS time.
>>>
>>> The cross-socket traffic is not only for the lock, but for the whole
>>> HashMap bucket, so this also inflates the time spent *in* put, making
>>> the put time even longer.
>>>
>>>
>>> It seems very easy to test.
>>>
>>> Alex
>>>
>>>
>>>
>>> On 18/04/2013 12:25, Mudit Verma wrote:
>>>
>>>  I'd say the threads are doing pretty fair across the sockets. I have
>>> done this testing with pure ReentrantLock locking, which is where the
>>> increase in per-operation time with increasing ThinkTime is worst.
>>>
>>>  Attached are the thread completion times with ThinkTime 0 and ThinkTime
>>> 10000 (where the per-op time is highest).
>>>
>>>
>>> On Thu, Apr 18, 2013 at 11:43 AM, oleksandr otenko <
>>> oleksandr.otenko at oracle.com> wrote:
>>>
>>>>  How is fairness doing?
>>>>
>>>> If the threads spin faster, threads on the same socket may have an
>>>> advantage over the other threads:
>>>>
>>>> - lock release invalidates lock state in all caches.
>>>> - lock acquire will work faster from the same socket, because some
>>>> caches are shared.
>>>>
>>>> As you increase the think time, you increase the odds of a lock acquire
>>>> crossing the socket boundary.
>>>>
>>>>
>>>> To test this theory, repeat the test with threads pinned to one socket.
>>>> Then the same number of threads, but pinned across the sockets.
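>>>>
>>>> (One way to run that comparison without per-thread affinity code - this
>>>> assumes a Linux host where confining the whole JVM is acceptable, and the
>>>> class name, thread count and core ids below are only illustrative:)
>>>>
>>>>     // taskset -c 0-15      java PinnedPutTest   # all threads on one socket
>>>>     // taskset -c 0-7,24-31 java PinnedPutTest   # same count, two sockets
>>>>     import java.util.concurrent.ConcurrentHashMap;
>>>>
>>>>     public class PinnedPutTest {
>>>>         static final ConcurrentHashMap<Integer, Integer> map = new ConcurrentHashMap<>();
>>>>
>>>>         public static void main(String[] args) throws InterruptedException {
>>>>             final int threads = 16, opsPerThread = 1_000_000;
>>>>             Thread[] ts = new Thread[threads];
>>>>             for (int i = 0; i < threads; i++) {
>>>>                 ts[i] = new Thread(() -> {
>>>>                     for (int op = 0; op < opsPerThread; op++) {
>>>>                         map.put(42, op);   // every thread hammers the same key
>>>>                     }
>>>>                 });
>>>>             }
>>>>             long start = System.nanoTime();
>>>>             for (Thread t : ts) t.start();
>>>>             for (Thread t : ts) t.join();
>>>>             // average wall-clock time per put as seen by one thread
>>>>             System.out.println((System.nanoTime() - start) / opsPerThread + " ns/put");
>>>>         }
>>>>     }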
>>>>
>>>> Alex
>>>>
>>>>
>>>>
>>>> On 17/04/2013 14:31, Mudit Verma wrote:
>>>>
>>>>     Hi All,
>>>>
>>>> I recently performed a scalability test (very aggressive, and maybe not
>>>> practical, but anyhow) on the put operation of ConcurrentHashMap.
>>>>
>>>>  Test: each thread puts (same key, random value) into the HashMap in a
>>>> tight loop. Therefore, all the threads hit the same location in the
>>>> hashMap and cause contention.
>>>>
>>>>  What is more surprising: when each thread does one put right after
>>>> another, the average time taken by one put operation is lower than when a
>>>> thread does some other work between two put operations.
>>>>
>>>> We continue to see per-operation time increase as we increase the work
>>>> done in between, which is very counter-intuitive. Only after about
>>>> 10,000 - 20,000 cycles of work in between does the per-op time come down.
>>>>
>>>>  When I read the code, I found that the put operation first tries to use
>>>> CAS to acquire the lock (64 times on a multicore); only if it could not
>>>> acquire the segment lock through CASing does it fall back to ReentrantLock
>>>> (which suspends threads, etc.).
>>>>
>>>>  We also tweaked the number of CAS attempts (from 0 to 10,000) used
>>>> before actually falling back to ReentrantLock.   Attached is the graph.
>>>>
>>>> One interesting thing to note: as we increase the work between two ops,
>>>> locking with 0 CAS attempts (pure ReentrantLock) seems to be worst
>>>> affected by the spike. Therefore, I assume the spike comes from
>>>> ReentrantLock even when there is a mixture of the two (CAS + lock).
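>>>>
>>>> (A simplified sketch of that spin-then-block pattern - not the actual JDK
>>>> source; the 64-try bound is the one mentioned above: attempt the lock a
>>>> bounded number of times before falling back to the blocking, queueing
>>>> acquire.)
>>>>
>>>>     import java.util.concurrent.locks.ReentrantLock;
>>>>
>>>>     class SpinThenBlockLock {
>>>>         static final int MAX_TRIES = 64;   // retry bound from the description above
>>>>         final ReentrantLock lock = new ReentrantLock();
>>>>
>>>>         void acquire() {
>>>>             for (int i = 0; i < MAX_TRIES; i++) {
>>>>                 if (lock.tryLock()) {      // fast path: one CAS attempt per try
>>>>                     return;
>>>>                 }
>>>>             }
>>>>             lock.lock();                   // slow path: enqueue and park the thread
>>>>         }
>>>>
>>>>         void release() {
>>>>             lock.unlock();
>>>>         }
>>>>     }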
>>>>
>>>>  Code Skeleton: for each thread
>>>>
>>>> while (!done) {                      // each thread runs its share of the ops
>>>>     hashMap.put(K, randomValue());   // K is the same key for every thread
>>>>     thinkTime();                     // busy-wait, from 0 cycles to 1 million cycles
>>>> }
>>>>
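>>>> (A sketch of how ThinkTime() might be implemented as a calibrated
>>>> busy-wait - an assumption about the harness, not taken from the post:)
>>>>
>>>>     static long thinkCycles;   // set per run: 0 .. 1,000,000
>>>>     static long sink;          // written to so the JIT does not drop the loop
>>>>
>>>>     static void thinkTime() {
>>>>         long x = sink;
>>>>         for (long i = 0; i < thinkCycles; i++) {
>>>>             x += i;            // cheap ALU work, no memory traffic
>>>>         }
>>>>         sink = x;              // publish the result to keep the work observable
>>>>     }
>>>>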
>>>>  Machine: 48 core NUMA
>>>>  Threads used:  32 (each one is pinned to a core).
>>>>  #ops: In total 51200000 (each thread with 160000 ops)
>>>>
>>>>  That's like saying: if I do something else in between my operations (up
>>>> to a limit), contention will increase.  Very strange.
>>>>
>>>>  Does any of you know why this is happening?
>>>>
>>>>