[concurrency-interest] Java Memory Model and ParallelStream

Luke Hutchison luke.hutch at gmail.com
Fri Mar 6 05:40:34 EST 2020


Reposting a question I posted to jdk-dev concerning the non-volatility of
array elements in Java, and assumptions that can be made about the
synchronization barrier at the end of a parallel stream. Any insight into
this would be appreciated.


---------- Forwarded message ---------
From: Luke Hutchison <luke.hutch at gmail.com>
Date: Thu, Mar 5, 2020 at 6:03 PM
Subject: Java memory model question
To: jdk-dev <jdk-dev at openjdk.java.net>


Under the Java memory model, is it fair to assume that memory reads and
writes can only be reordered within a method, but not across method
boundaries? (Define "method" here as what's left after any inlining has
taken place.)

Specifically I'm wondering: if a thread X launches a parallel stream that
writes at most once to each independent element of an array, can it be
assumed that when the stream processing ends, X will always read the value
of all written array elements? In other words, can the termination of the
stream be seen as a memory ordering barrier (in a weak sense)?

I'm not asking whether the following code is advisable, only whether
there's any chance of the main thread reading an old value from the array.

    int N = 50;
    String[] strValues = new String[N];
    IntStream.range(0, N)
            .parallel()
            .forEach(i -> strValues[i] = Integer.toString(i));
    // (*) Are the new strValues[i] all guaranteed to be visible here?
    for (String strValue : strValues) {
        System.out.println(strValue);
    }




---------- Forwarded message ---------
From: David Holmes <david.holmes at oracle.com>
Date: Thu, Mar 5, 2020 at 7:09 PM
Subject: Re: Java memory model question
To: Luke Hutchison <luke.hutch at gmail.com>, jdk-dev <jdk-dev at openjdk.java.net>


Hi Luke,

Probably a question better asked on concurrency-interest at cs.oswego.edu

On 6/03/2020 11:03 am, Luke Hutchison wrote:
> Under the Java memory model, is it fair to assume that memory reads and
> writes can only be reordered within a method, but not across method
> boundaries? (Define "method" here as what's left after any inlining has
> taken place.)

No. Theoretically you could inline the entire program into a single
"method". Method entry/exit don't in themselves define synchronization
points.

> Specifically I'm wondering: if a thread X launches a parallel stream that
> writes at most once to each independent element of an array, can it be
> assumed that when the stream processing ends, X will always read the value
> of all written array elements? In other words, can the termination of the
> stream be seen as a memory ordering barrier (in a weak sense)?

I would have expected this to be explicitly stated somewhere in the
streams documentation, but I don't see it. My expectation is that
terminal operations would act as synchronization points.

> I'm not asking whether the following code is advisable, only whether
> there's any chance of the main thread reading an old value from the array.
>
>      int N = 50;
>      String[] strValues = new String[N];
>      IntStream.range(0, N)
>              .parallel()
>              .forEach(i -> strValues[i] = Integer.toString(i));
>      // (*) Are the new strValues[i] all guaranteed to be visible here?
>      for (String strValue : strValues) {
>          System.out.println(strValue);
>      }

I would expect that code to be fine. parallel() would not be usable
otherwise.

Cheers,
David
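
For reference, a self-contained form of the snippet under discussion, next to a side-effect-free variant that lets the stream build the array itself. This is only a sketch: the class and method names are made up for illustration, and the second method is an alternative, not something proposed in the thread.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

// Hypothetical names; a sketch for comparison only.
final class ParallelFillSketch {

    // The pattern from the question: each index is written at most once,
    // by whichever worker thread happens to process it.
    static String[] fillWithForEach(int n) {
        String[] strValues = new String[n];
        IntStream.range(0, n)
                .parallel()
                .forEach(i -> strValues[i] = Integer.toString(i));
        return strValues;
    }

    // Variant without side effects: the stream produces the array, so the
    // caller only ever sees the returned result.
    static String[] fillWithToArray(int n) {
        return IntStream.range(0, n)
                .parallel()
                .mapToObj(i -> Integer.toString(i))
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.equals(fillWithForEach(50), fillWithToArray(50)));
    }
}
```

The second form sidesteps the visibility question entirely, at the cost of not being a drop-in replacement for a loop that writes into an existing array.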


---------- Forwarded message ---------
From: Brian Goetz <brian.goetz at oracle.com>
Date: Fri, Mar 6, 2020 at 12:46 AM
Subject: Re: Java memory model question
To: Luke Hutchison <luke.hutch at gmail.com>
Cc: jdk-dev <jdk-dev at openjdk.java.net>


No, but this is a common myth.  Method boundaries are not part of the JMM,
and the JIT regularly makes optimizations that have the effect of
reordering operations across method boundaries.
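
One commonly cited illustration is a read that gets hoisted out of a loop once the method containing it has been inlined. The sketch below uses hypothetical class and method names. The flag is declared volatile here so that the demo is guaranteed to terminate; if it were a plain field, the JIT would be permitted to inline isDone() into the loop and read the field once before the loop starts, so the spinning thread might never observe the write.

```java
// Hypothetical names; a sketch of the hazard, not taken from any JDK source.
final class InliningReorderSketch {

    // volatile so that the demo below terminates. With a plain field, the
    // read inside isDone() could legally be moved out of the caller's loop.
    private volatile boolean done;

    private boolean isDone() {
        return done;
    }

    void finish() {
        done = true;
    }

    // Spins until another thread calls finish(); returns the iteration count.
    long spinUntilDone() {
        long spins = 0;
        while (!isDone()) {      // a method call in source, typically inlined by the JIT
            spins++;
            Thread.onSpinWait(); // Java 9+
        }
        return spins;
    }

    static boolean demo() {
        InliningReorderSketch s = new InliningReorderSketch();
        Thread setter = new Thread(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            s.finish();
        });
        setter.start();
        s.spinUntilDone();
        try {
            setter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return s.isDone();
    }

    public static void main(String[] args) {
        System.out.println("flag observed: " + demo());
    }
}
```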


---------- Forwarded message ---------
From: Luke Hutchison <luke.hutch at gmail.com>
Date: Fri, Mar 6, 2020 at 3:27 AM
Subject: Re: Java memory model question
To: Brian Goetz <brian.goetz at oracle.com>
Cc: jdk-dev <jdk-dev at openjdk.java.net>


On Fri, Mar 6, 2020 at 12:46 AM Brian Goetz <brian.goetz at oracle.com> wrote:

> No, but this is a common myth.  Method boundaries are not part of the
> JMM, and the JIT regularly makes optimizations that have the effect of
> reordering operations across method boundaries.
>

Thanks. That's pretty interesting, but I can't think of an optimization
that would have that effect. Can you give an example?

On Thu, Mar 5, 2020 at 7:09 PM David Holmes <david.holmes at oracle.com> wrote:

> Probably a question better asked on concurrency-interest at cs.oswego.edu


Thanks, I didn't know about that list.

> > can the termination of the
> > stream be seen as a memory ordering barrier (in a weak sense)?
>
> I would have expected this to be explicitly stated somewhere in the
> streams documentation, but I don't see it. My expectation is that
> terminal operations would act as synchronization points.
>

Right, although I wasn't asking about "high-level concurrency" (i.e.
coordination between threads), but rather "low-level concurrency" (memory
operation ordering). The question arises from the Java limitation that
fields can be marked volatile, but if the field is of array type, then the
individual elements of the array cannot be marked volatile. There's no
"element-wise volatile" array unless you resort to using an
AtomicReferenceArray, which adds a wrapper object around the array and
routes every element access through a method call.
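
For concreteness, a minimal sketch of two ways to get volatile semantics on individual elements. AtomicReferenceArray is the one mentioned above; the other uses MethodHandles.arrayElementVarHandle (available since Java 9), which works on a plain array. The class and method names are hypothetical.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.util.concurrent.atomic.AtomicReferenceArray;

// Hypothetical names; a sketch only.
final class ElementVolatileSketch {

    // VarHandle giving access to the elements of any String[].
    private static final VarHandle ELEM =
            MethodHandles.arrayElementVarHandle(String[].class);

    // Volatile write followed by a volatile read of one element of a plain array.
    static String viaVarHandle(String[] arr, int i, String value) {
        ELEM.setVolatile(arr, i, value);
        return (String) ELEM.getVolatile(arr, i);
    }

    // The same round trip through AtomicReferenceArray; set() and get()
    // have volatile semantics.
    static String viaAtomicArray(int length, int i, String value) {
        AtomicReferenceArray<String> arr = new AtomicReferenceArray<>(length);
        arr.set(i, value);
        return arr.get(i);
    }

    public static void main(String[] args) {
        System.out.println(viaVarHandle(new String[4], 2, "two"));
        System.out.println(viaAtomicArray(4, 2, "two"));
    }
}
```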

I understand that the lack of "element-wise volatile" arrays means that
threads can end up reading stale values if two or more threads are reading
from and writing to the same array elements. However for this example, I
specifically exclude that issue by ensuring that there's only ever either
zero readers / one writer, or any number of readers / zero writers (every
array element is only written once by any thread, then after the end of the
stream, there are zero writers).

I'm really just asking if there is some "macro-scale memory operation
reordering" that could somehow occur across the synchronization barrier at
the end of the stream. I don't know how deep the rabbit hole of memory
operation reordering goes.

I have to assume this is not the case, because the worker threads should
all go quiescent at the end of the stream, so should have flushed their
values out to at least L1 cache, and the CPU should ensure cache coherency
between all cores beyond that point. But I want to make sure that can be
guaranteed.

In practice I have never seen this pattern fail, and it's exceptionally
useful to be able to write to disjoint array elements from an
IntStream.range(0, N) parallel stream. It is a very quick way to
parallelize originally serial code: simply replace for loops that have no
dependencies between iterations with parallel streams. But I have been
nervous about using this pattern since I realized that arrays cannot have
volatile elements. Logically my brain tells me the fear is unfounded, but I
wanted to double check.
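
For comparison, here is the same disjoint-write pattern with explicit threads, where the guarantee is spelled out by the language specification: all actions in a thread happen-before another thread successfully returns from join() on it (JLS 17.4.5). This is a sketch with hypothetical names; it says nothing about how parallel streams are implemented, only what an explicit barrier of this kind looks like.

```java
// Hypothetical names; a sketch of an explicit happens-before edge.
final class DisjointWritesSketch {

    // Each worker writes a disjoint, strided subset of the indices.
    static String[] fill(int n, int workers) {
        String[] values = new String[n];
        Thread[] threads = new Thread[workers];
        for (int w = 0; w < workers; w++) {
            final int start = w;
            threads[w] = new Thread(() -> {
                for (int i = start; i < n; i += workers) {
                    values[i] = Integer.toString(i); // plain, non-volatile write
                }
            });
            threads[w].start();
        }
        for (Thread t : threads) {
            try {
                t.join(); // everything t did happens-before this call returns
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return values; // all elements are visible to the caller here
    }

    public static void main(String[] args) {
        for (String s : fill(50, 4)) {
            System.out.println(s);
        }
    }
}
```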