[concurrency-interest] DirectByteBuffers and reachabilityFence

Vitaly Davidovich vitalyd at gmail.com
Tue Dec 8 13:14:37 EST 2015


>
> My take is that current DBB doesn't work. It's just "harder" to hit its
> bugs because the racing-point safepoint is often (unintentionally but
> luckily) eliminated, and the race needed to hit the safepoint when it does
> remain in compiled code is dynamically rare.


Agreed that it's broken and doesn't work *by design*.

get(int i) {
>    long addr = ix(checkIndex(i));
>    // There is a safepoint here which may or may not be removed depending
> on
>    // unrelated compiler decisions.
>    // "this" may no longer be reachable here.
>    // If this is the last access to this buffer, and it goes out of scope
> "after" this get(),
>    // GC can trigger here, enqueue the phantom ref, and the cleaner can
> fully
>    // execute and free the associated off-heap memory at addr before the
> next
>    // line below is executed...
>    return ((unsafe.getByte(addr)))); // <—— Boom!
> }


Yes.  I'm not convinced that compiler isn't treating Unsafe operations as
an optimization barrier, precluding live range of `this` being shortened.
Is there even a single reported/observed instance of DBB segfaulting or
reading free'd memory in this case? This class and methods are very
popular, and granted most of the time you don't instantiate a DBB that
immediately goes out of scope, I'd expect this to occur out in the wild and
ensuing bug reports.

public ByteBuffer put(int i, byte x) {
>    long addr = ix(checkIndex(i));
>    // There is a safepoint here which may or may not be removed depending
> on
>    // unrelated compiler decisions.
>    // "this" may no longer be reachable here, e.g. if method is inlined.
>    // If this is the last put to this buffer, and it goes out of scope
> "after" this put(),
>    // GC can trigger here, enqueue the phantom ref, and the cleaner can
> fully
>    // execute and free the associated off-heap memory at addr before the
> next
>    // line below is executed...
>    unsafe.putByte(ix(addr, ((x)));  // <— Boom!
>    return this; // This may not be used by inlining caller, and removed as
> dead code
> }


Likewise here, I'm not sure compiler reasons about Unsafe operations so
aggressively.

public ByteBuffer put(int i, byte x) {
>    try {
>        unsafe.putByte(ix(checkIndex(i)), ((x)));
>        return this;
>    } finally {
>        Fences.reachabilityFence(this);
>    }
> }


Yes.  This also reminds me that default inlining size needs to be
reconsidered in light of changes like the above.  Putting try/finally
{reachabilityFence(this)} adds about 14 bytes of bytecode alone, which may
make some methods go over the default MaxInlineSize threshold.


On Tue, Dec 8, 2015 at 12:01 PM, Gil Tene <gil at azul.com> wrote:

> The thread started with Alexandre
> asking a question for which he then added a segfaulting example; we
>
> tried to figure out why his didn't work and DBB works.
>
>
> My take is that current DBB doesn't work. It's just "harder" to hit its
> bugs because the racing-point safepoint is often (unintentionally but
> luckily) eliminated, and the race needed to hit the safepoint when it does
> remain in compiled code is dynamically rare. To highlight by analysis,
> rather than with a reproducer, where current DBB is vulnerable, look at the
> simplest forms of get() and put(). Since both get() and put() will tend to
> [hopefully] get inlined, the safepoint positions I highlight below can
> easily remain real even after optimization. Many current cleaner-related
> (and other phantom/weak/soft ref queueing handlers that release resources)
> have similar bugs, and Fences.reachabilityFence can finally be used to fix
> them.
>
> get(int i) {
>    return ((unsafe.getByte(ix(checkIndex(i))))); // <—— bug!
> }
>
> public ByteBuffer put(int i, byte x) {
>    unsafe.putByte(ix(checkIndex(i)), ((x)));  // <—— bug!
>    return this;
> }
>
> To highlight the bugs, let me split these up into steps:
>
> get(int i) {
>    long addr = ix(checkIndex(i));
>    // There is a safepoint here which may or may not be removed depending
> on
>    // unrelated compiler decisions.
>    // "this" may no longer be reachable here.
>    // If this is the last access to this buffer, and it goes out of scope
> "after" this get(),
>    // GC can trigger here, enqueue the phantom ref, and the cleaner can
> fully
>    // execute and free the associated off-heap memory at addr before the
> next
>    // line below is executed...
>    return ((unsafe.getByte(addr)))); // <—— Boom!
> }
>
> public ByteBuffer put(int i, byte x) {
>    long addr = ix(checkIndex(i));
>    // There is a safepoint here which may or may not be removed depending
> on
>    // unrelated compiler decisions.
>    // "this" may no longer be reachable here, e.g. if method is inlined.
>    // If this is the last put to this buffer, and it goes out of scope
> "after" this put(),
>    // GC can trigger here, enqueue the phantom ref, and the cleaner can
> fully
>    // execute and free the associated off-heap memory at addr before the
> next
>    // line below is executed...
>    unsafe.putByte(ix(addr, ((x)));  // <— Boom!
>    return this; // This may not be used by inlining caller, and removed as
> dead code
> }
>
>
> With reachabilityFence, this bug can finally be fixed with e.g.:
>
> public ByteBuffer put(int i, byte x) {
>    try {
>        unsafe.putByte(ix(checkIndex(i)), ((x)));
>        return this;
>    } finally {
>        Fences.reachabilityFence(this);
>    }
> }
>
> get(int i) {
>    try {
>        return ((unsafe.getByte(ix(checkIndex(i)))));
>    } finally {
>        Fences.reachabilityFence(this);
>    }
> }
>
> On Dec 8, 2015, at 7:38 AM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
>
> If you're talking about simply observing the effects of an object being
>> collected while method invocations on that object are still in flight, see
>> this article:
>> http://www.infoq.com/articles/Fatal-Flaw-Finalizers-Phantoms
>> We have run into this issue numerous times in various situations, which
>> is why we're happy to see reachabilityFence() come into being.  So yes,
>> it's already broken.
>
>
> Nope, I'm talking about breaking existing DBB code as it stands today.
> Alexandre already posted an example of where similar but subtly different
> (with respect to DBB) code breaks, so I'm well aware it's a problem.
>
> On Tue, Dec 8, 2015 at 10:22 AM, David M. Lloyd <david.lloyd at redhat.com>
> wrote:
>
>> On 12/08/2015 08:40 AM, Vitaly Davidovich wrote:
>>
>>>     The lifetime, natural or otherwise, of an instance does not survive
>>>     until an instance method returns because, a lot of the time, that
>>>     instance method is inlined.
>>>
>>>
>>> You're talking about optimization here (inlining); by "natural" I meant
>>> the naive/no optimization case (e.g. interpreter, debugger attached
>>> w/breakpoint in method, etc).
>>>
>>>     It's not just HotSpot, though: some VMs are even more aggressive, and
>>>
>>> Which java VMs are these? Just curious.
>>>
>>>     we have seen finalizers executed even before constructors have
>>>     completed.  And that is allowed by the specification.
>>>
>>>
>>> Ok, but that's beside the point, really.  Surely if compiler can
>>> optimize and arrange for liveness to allow for it, then it's a good
>>> thing it does that.  My point isn't that this cannot happen due to spec,
>>> but rather that in places like DBB where `this` is used after the Unsafe
>>> call the  compiler has to schedule things differently in order to reduce
>>> lifetime.  And my point is that compilers generally tend to be cautious
>>> in doing things that may break code.  This is the practical aspect we
>>> were referring to - it's actual humans writing these optimizations, and
>>> they're sensitive to breaking code, particularly in java.
>>> Theoretically, yes, anything is possible.
>>>
>>>     It's already broken.
>>>
>>>
>>> Sure.  Now try to submit a patch to Hotspot that will break this case,
>>> even if allowed by spec, and see how far you get :).
>>>
>>
>> If you're talking about simply observing the effects of an object being
>> collected while method invocations on that object are still in flight, see
>> this article:
>>
>> http://www.infoq.com/articles/Fatal-Flaw-Finalizers-Phantoms
>>
>> We have run into this issue numerous times in various situations, which
>> is why we're happy to see reachabilityFence() come into being.  So yes,
>> it's already broken.
>> --
>> - DML
>>
>> _______________________________________________
>> Concurrency-interest mailing list
>> Concurrency-interest at cs.oswego.edu
>> http://cs.oswego.edu/mailman/listinfo/concurrency-interest
>>
>
> _______________________________________________
> Concurrency-interest mailing list
> Concurrency-interest at cs.oswego.edu
> http://cs.oswego.edu/mailman/listinfo/concurrency-interest
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://cs.oswego.edu/pipermail/concurrency-interest/attachments/20151208/6ae9d4ed/attachment.html>


More information about the Concurrency-interest mailing list