[concurrency-interest] ReentrantReadWriteLock in inconsistent state

Phil Harvey phil at philharveyonline.com
Thu Aug 2 04:15:09 EDT 2012


Thanks for the advice, guys.

I've checked our code and can confirm we make no Thread.stop() or
Thread.stop(Throwable) calls.

Also, had a StackOverflowError occurred, we would have seen its stack trace
in our logs, and there is none.

So I still have no idea what caused this problem. I can only assume it's a
Java bug. Or am I jumping to conclusions prematurely?

Phil
 On Aug 2, 2012 5:14 AM, "Stanimir Simeonoff" <stanimir at riflexo.com> wrote:

> David,
> I am quite positive it's Thread.stop, as setState is inlined. I've seen
> that case due to Thread.stop quite a few times too.
> It is possible, though, to avoid the disaster via some awkward steps:
> wait for the thread to be sleeping and examine its stack trace, then
> Thread.suspend it and check again, and only then stop() it. Alternatively,
> peppering the code with stop points during class loading is an option,
> but a hard one.
>
> That has made me wonder whether HotSpot could avoid adding safe points in
> java.util.concurrent.locks classes, or at least have the safe points there
> skip the check for Thread.stop outside park(). That is, the only safe
> point would be park(); as a side effect this could bring minor performance
> benefits.
>
> I know Thread.stop is deprecated, but there is still enough middleware
> that makes use of it.
>
> Stanimir
>
> On Thu, Aug 2, 2012 at 5:56 AM, <davidcholmes at aapt.net.au> wrote:
>
>> Phil,
>>
>> A RRWL that has no owner but can not be locked is definitely a problem.
>> If this is not 6822370 then the other possibilities are async-exceptions
>> occurring in the release code:
>>
>>             if (free)
>>                 setExclusiveOwnerThread(null);
>>             <=== async exception here
>>             setState(nextc);
>>
>> Two possible sources of the async exception:
>>
>> a) Use of Thread.stop elsewhere
>> b) StackOverflowError was triggered trying to call setState
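
To make the window David describes concrete, here is a minimal sketch. It uses a hypothetical toy lock (ToySync) built on AbstractQueuedSynchronizer, not the JDK's ReentrantReadWriteLock.Sync, and it simulates the async exception with an ordinary exception thrown between the two steps. Every class and method name in it is an assumption made for illustration.

```java
import java.util.concurrent.locks.AbstractQueuedSynchronizer;

// Hypothetical toy lock for illustration only; not the JDK's own Sync class.
final class InterruptedReleaseDemo {

    static final class ToySync extends AbstractQueuedSynchronizer {
        // When true, tryRelease throws between its two steps, standing in
        // for an asynchronous exception such as ThreadDeath.
        volatile boolean failBetweenSteps;

        @Override
        protected boolean tryAcquire(int acquires) {
            if (compareAndSetState(0, 1)) {
                setExclusiveOwnerThread(Thread.currentThread());
                return true;
            }
            return false;
        }

        @Override
        protected boolean tryRelease(int releases) {
            if (getExclusiveOwnerThread() != Thread.currentThread())
                throw new IllegalMonitorStateException();
            setExclusiveOwnerThread(null);      // step 1: owner cleared
            if (failBetweenSteps)
                throw new IllegalStateException("simulated async exception");
            setState(0);                        // step 2: skipped on failure
            return true;
        }

        Thread owner() { return getExclusiveOwnerThread(); }
        boolean held() { return getState() != 0; }
    }

    /** Returns true if the lock ends up unavailable yet without an owner. */
    static boolean reproduce() {
        ToySync sync = new ToySync();
        sync.acquire(1);
        sync.failBetweenSteps = true;
        try {
            sync.release(1);
        } catch (IllegalStateException simulated) {
            // the simulated exception escaped from the release path
        }
        return sync.held() && sync.owner() == null && !sync.tryAcquire(1);
    }

    public static void main(String[] args) {
        System.out.println("unavailable with null owner: " + reproduce());
    }
}
```

If the release is cut short after step 1, the state word still says "held" while the owner field is already null, which matches what our heap dump showed. The sketch only demonstrates that the ordering makes this possible; it does not show that this is what happened in our application.
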
>>
>> David Holmes
>> ------------
>>
>> Quoting Phil Harvey <phil at philharveyonline.com>:
>>
>>> Hi,
>>>
>>> Yes, we had looked at that bug but assumed we were not experiencing it
>>> here because we are using Java 1.6.0_25, and it was reported fixed in
>>> 1.6.0_18.
>>>
>>> Do you agree that the unusual state of the ReentrantReadWriteLock
>>> suggests we've hit a bug?
>>>
>>> Phil
>>> On Aug 1, 2012 3:05 PM, "Ariel Weisberg" <ariel at weisberg.ws> wrote:
>>>
>>>> Hi,
>>>>
>>>>  I remember that. That was fixed in Oracle JDK 1.6.0_18. It hasn't been
>>>> reproducing for us since 1.6.0_18, but I am not sure if we are using
>>>> ReentrantLock in the same way anymore.
>>>>
>>>>  The reproducer we used was
>>>> https://github.com/VoltDB/voltdb/tree/master/tools/lbd_lock_test
>>>>  If I remember correctly it prints '.' as it goes and when it hangs it
>>>> stops printing dots.
>>>>
>>>>  Regards,
>>>>  Ariel
>>>>
>>>>  On Wed, Aug 1, 2012, at 09:27 AM, Viktor Klang wrote:
>>>>
>>>>  Hi Phil,
>>>>
>>>> Related to this?
>>>> http://bugs.sun.com/view_bug.do?bug_id=6822370
>>>>
>>>>  Cheers,
>>>>  Viktor
>>>>
>>>>  On Wed, Aug 1, 2012 at 3:20 PM, Phil Harvey
>>>> <phil at philharveyonline.com> wrote:
>>>>
>>>>  We had a deadlock-like failure of our application recently.
>>>>
>>>> I initially reported it on the BDB JE forum
>>>> (https://forums.oracle.com/forums/thread.jspa?messageID=10480988), but
>>>> further analysis of the heap and thread dumps has exposed a problem that
>>>> looks like a Java locking bug. I'm hoping you can offer advice on whether
>>>> this is the case.
>>>>
>>>> We're using Oracle JVM 1.6.0_25-b06, running on Linux version
>>>> 2.6.18-194.32.1.el5.
>>>>
>>>> We are launching Java as follows: java -server -XX:+UseConcMarkSweepGC
>>>> -XX:+HeapDumpOnOutOfMemoryError -Xmx1024m ...
>>>>
>>>> Several consecutive thread dumps showed that Thread t@41101 was blocked
>>>> indefinitely in ReentrantReadWriteLock.writeLock().lock().
>>>>
>>>> We know from code inspection that nothing ever takes a read lock on this
>>>> ReentrantReadWriteLock, so we started trying to find out what was holding
>>>> its write lock.
>>>>
>>>> The output of "jstack -l" should list which thread holds this exclusive
>>>> lock in the "locked ownable synchronizers" section but does not.
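
The same ownership information that "jstack -l" prints can also be read from inside the JVM through the standard java.lang.management API. The following is a minimal sketch, assuming the JVM supports ownable-synchronizer monitoring (HotSpot does); the class name OwnableSynchronizerDump and its methods are hypothetical.

```java
import java.lang.management.LockInfo;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical helper: asks the JVM which ownable synchronizers a thread holds.
final class OwnableSynchronizerDump {

    /** True if the thread holds a synchronizer whose class name contains the fragment. */
    static boolean holds(long threadId, String classNameFragment) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        if (!mx.isSynchronizerUsageSupported()) {
            return false; // this JVM cannot report ownable synchronizers
        }
        // lockedMonitors = false, lockedSynchronizers = true
        ThreadInfo[] infos = mx.getThreadInfo(new long[] { threadId }, false, true);
        if (infos.length == 0 || infos[0] == null) {
            return false; // the thread is no longer alive
        }
        for (LockInfo held : infos[0].getLockedSynchronizers()) {
            if (held.getClassName().contains(classNameFragment)) {
                return true;
            }
        }
        return false;
    }

    /** Takes a write lock on the calling thread and checks that the JVM reports it. */
    static boolean demo() {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        lock.writeLock().lock();
        try {
            return holds(Thread.currentThread().getId(), "ReentrantReadWriteLock");
        } finally {
            lock.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println("write lock reported by the JVM: " + demo());
    }
}
```

In a healthy process the holder of a write lock shows up this way. In our failure no thread was listed as holding the lock at all.
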
>>>>
>>>> Our first theory was that the owning thread might have terminated.
>>>>
>>>> We wrote a simple test program to explore this. We found from heap dump
>>>> analysis that even if the owning thread terminates, the lock itself still
>>>> refers to it via the
>>>> ReentrantReadWriteLock.WriteLock.sync.exclusiveOwnerThread field. Looking
>>>> in the java.util.concurrent source code, it seems that this field only
>>>> gets nulled when the lock is released.
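
For reference, a test of this kind can be sketched as below. This is not our original test program, which relied on heap dump analysis; it is a minimal reconstruction that reads the owner through the protected ReentrantReadWriteLock.getOwner() method instead. The names TerminatedOwnerDemo and InspectableLock are hypothetical.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical reconstruction: a thread takes the write lock and then dies.
final class TerminatedOwnerDemo {

    // getOwner() is protected in ReentrantReadWriteLock, so a subclass exposes it.
    static final class InspectableLock extends ReentrantReadWriteLock {
        Thread owner() { return getOwner(); }
    }

    /** True if the lock still names the dead thread as its owner and stays unavailable. */
    static boolean ownerSurvivesTermination() {
        final InspectableLock lock = new InspectableLock();
        Thread t = new Thread(new Runnable() {
            @Override
            public void run() {
                lock.writeLock().lock(); // deliberately never unlocked
            }
        }, "leaky-owner");
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return !t.isAlive()
                && lock.isWriteLocked()
                && lock.owner() == t
                && !lock.writeLock().tryLock();
    }

    public static void main(String[] args) {
        System.out.println("dead thread still owns the lock: " + ownerSurvivesTermination());
    }
}
```

A terminated owner therefore leaves the lock unavailable but with a non-null owner, which is not what we observed in the failure.
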
>>>>
>>>> However, looking in the heap dump taken following our "deadlock", we were
>>>> surprised to find that the lock in question has a null
>>>> sync.exclusiveOwnerThread field.
>>>>
>>>> Surely a write lock should be in one of two states (except possibly for a
>>>> tiny instant when its state is being non-atomically switched):
>>>>
>>>> 1) The lock is available, and sync.exclusiveOwnerThread is null
>>>> 2) The lock is unavailable, and sync.exclusiveOwnerThread is populated
>>>>
>>>> But our lock was indefinitely in this state:
>>>>
>>>> 3) The lock is unavailable and sync.exclusiveOwnerThread is null
>>>>
>>>> Does anyone know whether this represents a bug? If not, can you explain
>>>> what it means for a lock to be in this counterintuitive state?
>>>>
>>>> Thanks, Phil
>>>>
>>>> _______________________________________________
>>>> Concurrency-interest mailing list
>>>> Concurrency-interest at cs.oswego.edu
>>>> http://cs.oswego.edu/mailman/listinfo/concurrency-interest
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Viktor Klang
>>>>
>>>> Akka Tech Lead
>>>> Typesafe <http://www.typesafe.com/> - The software stack for applications
>>>> that scale
>>>>
>>>> Twitter: @viktorklang
>>>
>>
>>
>>
>>
>
>