[concurrency-interest] Java Memory Model and ParallelStream

Luke Hutchison luke.hutch at gmail.com
Fri Mar 6 06:11:58 EST 2020


On Fri, Mar 6, 2020 at 3:51 AM Aleksey Shipilev <shade at redhat.com> wrote:

> On 3/6/20 11:40 AM, Luke Hutchison via Concurrency-interest wrote:
> > Thanks. That's pretty interesting, but I can't think of an optimization
> > that would have that effect. Can you give an example?
>
> Method gets inlined, and boom: optimizer does not even see the method
> boundary.
>

...which is why I specifically excluded inlining in my original question
(or said consider the state after all inlining has taken place). I realize
that inlining doesn't just happen at compile time, and the JIT could decide
at any point to inline a function, but I want to ignore that (very real)
possibility to understand whether reordering can take place across method
boundaries _if inlining never happens_. Brian Goetz commented that "JIT
regularly makes optimizations that have the effect of reordering operations
across method boundaries" -- so I think the answer is yes. I just don't
understand how that would happen.

> > There's no "element-wise volatile" array unless you resort to using an
> > AtomicReferenceArray, which creates a wrapper object per array element,
> > which is wasteful on computation and space.
>
> Not really related to this question, but: VarHandles provide "use-site"
> volatility without
> "def-site" volatility. In other words, you can access any non-volatile
> element as if it is volatile.
>

Thanks for the pointer, although if you need to create one VarHandle per
array element to guarantee this behavior, then that's logically no
different than wrapping each array element in a wrapper object with
AtomicReferenceArray.
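For what it's worth, a single VarHandle obtained from MethodHandles.arrayElementVarHandle covers every element of a given array type, so no per-element object is created. A minimal sketch (class name is illustrative):

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

public class VolatileArrayAccess {
    // One VarHandle addresses every element of any int[]; it is created
    // once per array *type*, not once per element.
    static final VarHandle INT_ELEM =
            MethodHandles.arrayElementVarHandle(int[].class);

    public static void main(String[] args) {
        int[] data = new int[4];                      // plain, non-volatile array
        INT_ELEM.setVolatile(data, 2, 42);            // volatile write to element 2
        int v = (int) INT_ELEM.getVolatile(data, 2);  // matching volatile read
        System.out.println(v);                        // prints 42
    }
}
```

The "use-site" volatility Shipilev describes comes from choosing the volatile access mode at each call site; the array field itself stays ordinary.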

(Maybe Java could provide something like a "volatile volatile" type that
could be used with array-typed fields to make "volatile" apply to elements
of an array-typed field, not just to the field itself?)

> > I have to assume this is not the case, because the worker threads should
> > all go quiescent at the end of the stream, so should have flushed their
> > values out to at least L1 cache, and the CPU should ensure cache
> > coherency between all cores beyond that point. But I want to make sure
> > that can be guaranteed.
>
> Stop thinking in low level? That would only confuse you.
>
> Before trying to wrap your head around Streams, consider the plain thread
> pool:
>
>     ExecutorService e = Executors.newFixedThreadPool(1);
>     int[] a = new int[1];
>     Future<?> f = e.submit(() -> a[0]++);
>     f.get();
>     System.out.println(a[0]); // guaranteed to print "1".
>
> This happens because all actions in the worker thread (so all writes in
> lambda body) happen-before
> all actions after result acquisition (so all reads after Future.get).
> Parallel streams carry a
> similar property.


Good example, and I guess the "guaranteed" here answers my question.

I guess fundamentally I was asking whether any memory reordering (or cache
staleness) can be observed across synchronization barriers. It sounds like
it cannot: a synchronization barrier establishes a "happens-before" edge,
which orders every memory operation before the barrier ahead of every
memory operation after it.
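To translate Aleksey's thread-pool example into stream terms: worker threads write into a plain array, and completion of the terminal operation happens-before the code that follows it, so the calling thread is guaranteed to see every write. A sketch (class and method names are illustrative):

```java
import java.util.Arrays;
import java.util.stream.IntStream;

public class ParallelStreamVisibility {
    static int[] squares(int n) {
        int[] out = new int[n];  // plain array, no volatile anywhere
        // Worker threads write distinct elements; the terminal operation's
        // completion happens-before the return, so the caller sees all writes.
        IntStream.range(0, n).parallel().forEach(i -> out[i] = i * i);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(squares(8)));
        // guaranteed to print [0, 1, 4, 9, 16, 25, 36, 49]
    }
}
```

Note the guarantee holds only because each worker writes a distinct index; two workers racing on the same element would still be a data race.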