[concurrency-interest] The JSR-133 Cookbook for Compiler Writers
davidcholmes at aapt.net.au
Mon Nov 28 18:11:18 EST 2011
I may be mixing terminology here, so let me clarify: when I say "release" I'm
assuming an action such that given:
p = x;
q = y;
then if you see q==y you are guaranteed to see p==x. Depending on the
platform release() may need to be a full memory synchronization instruction,
or a no-op.
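
A minimal sketch of that guarantee using C11 atomics, in case it helps. The
names writer() and reader() are hypothetical, and the sketch assumes the
reader uses an acquire load, which C11 requires for the guarantee to hold
formally. The release store stands in for release() followed by q = y.

```c
#include <assert.h>
#include <stdatomic.h>

static int p;          /* plain data, written before the release */
static atomic_int q;   /* value published with release semantics */

/* Writer side: p = x; then q = y with release ordering. */
static void writer(int x, int y) {
    p = x;
    atomic_store_explicit(&q, y, memory_order_release);
}

/* Reader side: returns 1 and copies p into *out only if q == y was
 * observed; returns 0 (and does not touch p) otherwise. */
static int reader(int y, int *out) {
    if (atomic_load_explicit(&q, memory_order_acquire) == y) {
        *out = p;      /* seeing q == y implies seeing p == x */
        return 1;
    }
    return 0;
}
```

On x86 compilers typically emit an ordinary store for the release store,
while weakly-ordered platforms need an explicit barrier instruction.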
In terms of the object allocation issue, constructing an object is a two
stage process even at the bytecode level:
- allocate the object
- invoke the constructor
So logically we need the following sequence:
- allocate (and zero) memory
- initialize object header etc
- invoke constructor
- if (wrote_final_field)
we need the release() after the object header initialization because at that
point the object can become visible to the GC and so must be seen to be
Now the JIT could inline all the above and maybe figure out how to remove
one release(), but presently in hotspot the allocation and construction
paths are quite distinct.
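
The sequence can be sketched in C roughly as follows. This is only an
illustration under stated assumptions, not hotspot code: struct object,
new_object(), dummy_class and the fences_issued counter are all hypothetical,
and release() is modelled as a C11 release fence placed after the header
initialization and, when a final field was written, after the constructor.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical object layout: a header word plus one "final" field. */
struct object {
    const void *class_ptr;   /* header: reference to the class */
    int final_field;
};

static const char dummy_class[] = "Example"; /* stands in for class metadata */
static int fences_issued;    /* illustration only, not thread-safe */

static void release(void) {
    atomic_thread_fence(memory_order_release);
    fences_issued++;
}

static struct object *new_object(int value, int wrote_final_field) {
    struct object *obj = malloc(sizeof *obj);  /* allocate ...          */
    if (obj == NULL)
        return NULL;
    memset(obj, 0, sizeof *obj);               /* ... and zero memory   */
    obj->class_ptr = dummy_class;              /* initialize the header */
    release();              /* object may now become visible to the GC  */
    obj->final_field = value;                  /* "constructor" body    */
    if (wrote_final_field)
        release();                             /* fence for final fields */
    return obj;
}
```

With both steps inlined like this, it is easier to see where a JIT might
try to merge the two fences.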
As release() is a no-op on x86 and sparc, you will not find explicit
release() actions in all of the current hotspot code paths - something we
will look at fixing.
> -----Original Message-----
> From: Boehm, Hans [mailto:hans.boehm at hp.com]
> Sent: Tuesday, 29 November 2011 5:06 AM
> To: dholmes at ieee.org; Andrew Haley; Nathan Reynolds
> Cc: concurrency-interest at cs.oswego.edu
> Subject: RE: [concurrency-interest] The JSR-133 Cookbook for Compiler
> Writers
> > From: David Holmes [mailto:davidcholmes at aapt.net.au]
> > Hi Hans,
> > Hans Boehm writes:
> > > How is the method table pointer any different from any other final
> > > field here? This code looks like the fence is not generated if the
> > > method table pointer is written, but there are no other final fields.
> > > I can't at the moment think of a way to defend that choice.
> > A Java object contains a reference to its class, which in turn holds
> > the vtable pointer. That class reference may be stored using "release
> > semantics". I say "may" because the code is somewhat complex and it's
> > hard to know exactly what paths will be executed just by reading the
> > code.
> Thanks for the answer, but I remain confused. Consider a
> weakly-ordered platform like ARM or PowerPC, and the allocation
> of an object p containing no final fields. The generated
> sequence seems to look something like (in C syntax):
> p = pointer_to_newly_allocated_memory;
> p -> class_ptr = ptr_to_class; // release operation ensures that
>                                // prior accesses become visible earlier
> maybe other initialization;
> // p is ready to use
> Assume this is done in thread 1, where user code then stores p
> into a global q, and thread 2 calls q -> foo() (which involves a
> racy read of q). I see nothing ensuring that the store to p ->
> class_ptr becomes visible before the store to q. Using a release
> store for the first one only orders it after preceding accesses,
> such as the initialization of the class object. Without such
> ordering, thread 2 can see the updated value of q, without
> seeing the correct class_ptr value, potentially resulting in many
> serious problems.
> Or did you mean that there is another, separate, fence AFTER the
> class_ptr assignment? That works, but it seems to me that should
> often be combinable with the one for final fields?
> > From a practical perspective this is only an issue for non-TSO
> > systems when the object reference is subject to unsafe publication.
> > Even for non-TSO systems the "distance" between the two stores makes
> > it unlikely (and no I can't quantify that) they will be reordered.
> True, but relying on the latter seems like a really bad idea. I
> suspect that if this is really implemented incorrectly, the main
> reason nobody has noticed is that, like all these things, it
> works just fine in the absence of data races, which people
> already correctly avoid most of the time.
> I'm pushing on this a bit, because I'm trying to understand
> exactly how broken the memory model story currently is in the
> presence of data races. The more broken or needlessly expensive
> it is, the better our chances of making a drastic change to fix things :-)
> > The constructor executes after that, so any final field assignments
> > there need their own release barrier.
> I don't think that's the right way to think about it. Turning
> the field assignments into release stores doesn't help. You need
> to turn the racing publication (the assignment to q in the
> example) into a release store, but that publication may be very
> far away from the class_ptr or final field assignments. I don't
> see a way to do this except with an essentially unconditional
> fence (lwsync on PowerPC) at the end of the constructor, and if
> you can't preclude unsafe publication, probably another one after
> the class_ptr store.
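
The two-thread scenario in the quoted message, with a fence placed after the
class_ptr store, might look like this in C11. It is a sketch under
assumptions: struct klass, foo_impl(), thread1_publish() and thread2_call()
are hypothetical names, a static object stands in for freshly allocated
memory, and the reader uses an acquire load because C11 does not formally
guarantee ordering through a plain dependent load.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct klass { int (*foo)(void); };            /* class with one method */
struct object { const struct klass *class_ptr; };

static int foo_impl(void) { return 42; }
static const struct klass the_class = { foo_impl };

static struct object storage;        /* stands in for new memory        */
static _Atomic(struct object *) q;   /* global used for publication     */

/* Thread 1: initialize the object, then publish it through q. */
static void thread1_publish(void) {
    struct object *p = &storage;
    p->class_ptr = &the_class;
    /* Fence AFTER the header store: it orders that store before the
     * later store to q, which a release store to class_ptr would not. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&q, p, memory_order_relaxed);  /* q = p */
}

/* Thread 2: read q and call q->foo(). Returns -1 if q is not yet set. */
static int thread2_call(void) {
    struct object *r = atomic_load_explicit(&q, memory_order_acquire);
    if (r == NULL)
        return -1;
    return r->class_ptr->foo();
}
```

Moving the fence before the class_ptr store, or making only that store a
release store, would leave nothing ordering it ahead of the store to q.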