[concurrency-interest] Array allocation and access on the JVM

Vitaly Davidovich vitalyd at gmail.com
Wed Jan 25 19:13:32 EST 2012


I must say that's a really good hypothesis and it would perfectly explain
Aleksandar's observations - nice one, Doug! :)

TLAB sizes are adjusted ergonomically (if not set via a flag), as I understand
it, so even though it doesn't really make sense to have them smaller than 32k,
they end up masking this issue.
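
(If someone wants to poke at that: HotSpot lets you pin the TLAB size with
-XX:TLABSize=<bytes> together with -XX:-ResizeTLAB, or just watch what
ergonomics picks with -XX:+PrintTLAB. Treat the exact flags as a sketch,
since defaults and behaviour differ across builds.)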

Sent from my phone
On Jan 25, 2012 6:57 PM, "Doug Lea" <dl at cs.oswego.edu> wrote:

> On 01/25/12 14:48, Aleksandar Prokopec wrote:
>
>> Hello,
>>
>> I've recently posted the following question about memory contention on
>> StackOverflow:
>>
>
>
> I asked Aleksandar off-list to check the effect of -XX:+UseCondCardMark,
> which, as is so often the case these days, turned out to be the main problem.
> Really, everyone should always use this switch with HotSpot on multiprocessors
> (or -XX:+UseG1GC, although it is often a little worse on throughput
> but better-behaved for large heaps).
>
> As I mentioned in a post from a few weeks ago, see Dave Dice's blog:
> http://blogs.oracle.com/dave/entry/false_sharing_induced_by_card
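>
> For this particular test that would be something along these lines (the
> size/par values are placeholders for whatever Aleksandar used):
>
>   java -XX:+UseCondCardMark -Dsize=<workload> -Dpar=<threads> MultiStackJavaExperiment 7
>
> Roughly speaking, the conditional card mark only dirties a card-table byte
> that isn't already dirty, so threads storing references into nearby arrays
> stop fighting over the cache line that holds their adjacent card bytes.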
>
>
> -Doug
>
>
>
>
>
>
>> http://stackoverflow.com/questions/8942396/array-allocation-and-access-on-the-java-virtual-machine-and-memory-contention
>>
>> Basically, I have several threads defined like this:
>>
>> final class Worker extends Thread {
>>     Foo[] array = new Foo[1024];
>>     int sz;
>>
>>     public Worker(int _sz) {
>>         sz = _sz;
>>     }
>>
>>     public void run() {
>>         //Foo[] arr = new Foo[1024];
>>         Foo[] arr = array;
>>         loop(arr);
>>     }
>>
>>     public void loop(Foo[] arr) {
>>         int i = 0;
>>         int pos = 512;
>>         Foo v = new Foo();
>>         while (i < sz) {
>>             if (i % 2 == 0) {
>>                 arr[pos] = v;
>>                 pos += 1;
>>             } else {
>>                 pos -= 1;
>>                 v = arr[pos];
>>             }
>>             i++;
>>         }
>>     }
>> }
>>
>> The program takes two parameters - `size` and `par`. `size` is the size of
>> the workload, and `par` is the number of threads to distribute the workload
>> to. What each thread does is repeatedly write and read array location 512,
>> `size` / `par` times in total. Each thread reads its own array (locations
>> being written to should be at least 1024*4 bytes away from each other).
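>>
>> (For reference, the complete program below reads both values as system
>> properties and the repetition count as its first argument, so a run looks
>> roughly like: java -Dsize=<workload> -Dpar=<threads> MultiStackJavaExperiment 7.)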
>>
>> I've measured the speedup on a 2x 4-core Intel Xeon server and observed the
>> following running times for 1, 2, 4 and 8 threads (7 repetitions each):
>>
>>  >>> All running times: [2149, 2227, 1974, 1948, 1803, 2283, 1878]
>>  >>> All running times: [1140, 1124, 2022, 1141, 2028, 2004, 2136]
>>  >>> All running times: [867, 1022, 1457, 1342, 1436, 966, 1531]
>>  >>> All running times: [915, 864, 1245, 1243, 948, 790, 1007]
>>
>> Apparently, this is due to memory contention.
>> What I've observed is that if I comment out the line `Foo[] arr = array` and
>> uncomment the line above it, so that the array is allocated within the
>> thread, I get:
>>
>>  >>> All running times: [2053, 1966, 2089, 1937, 2046, 1909, 2011]
>>  >>> All running times: [1048, 1178, 1100, 1194, 1367, 1271, 1207]
>>  >>> All running times: [578, 508, 589, 571, 617, 643, 645]
>>  >>> All running times: [330, 299, 300, 322, 331, 324, 575]
>>
>> My guess is that in the second case the arrays get allocated in thread-local
>> allocation buffers, so the array positions end up much further apart.
>> Still, even in the first case the processors should be writing to separate
>> cache lines, so why does the first example scale so badly?
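>>
>> (A back-of-the-envelope check on that: with the 4-byte references assumed in
>> the 1024*4 figure above, each Foo[1024] spans about 4 KB plus its header, so
>> element 512 of one thread's array and element 512 of another thread's array
>> should always be at least ~4 KB apart, far more than a typical 64-byte
>> cache line.)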
>>
>> Can anyone explain these numbers? Why does this memory contention happen?
>>
>> Thank you,
>> Aleksandar Prokopec
>>
>>
>>
>> --------------------------------------------------------------------------------
>>
>> // The complete runnable program:
>>
>>
>> import java.util.ArrayList;
>>
>> class MultiStackJavaExperiment {
>>
>>     final class Foo {
>>         int x = 0;
>>     }
>>
>>     final class Worker extends Thread {
>>         Foo[] array = new Foo[1024];
>>         int sz;
>>
>>         public Worker(int _sz) {
>>             sz = _sz;
>>         }
>>
>>         public void run() {
>>             Foo[] arr = new Foo[1024];
>>             //Foo[] arr = array;
>>             loop(arr);
>>         }
>>
>>         public void loop(Foo[] arr) {
>>             int i = 0;
>>             int pos = 512;
>>             Foo v = new Foo();
>>             while (i < sz) {
>>                 if (i % 2 == 0) {
>>                     arr[pos] = v;
>>                     pos += 1;
>>                 } else {
>>                     pos -= 1;
>>                     v = arr[pos];
>>                 }
>>                 i++;
>>             }
>>         }
>>     }
>>
>>     public static void main(String[] args) {
>>         (new MultiStackJavaExperiment()).mainMethod(args);
>>     }
>>
>>     int size = Integer.parseInt(System.getProperty("size"));
>>     int par = Integer.parseInt(System.getProperty("par"));
>>
>>     public void mainMethod(String[] args) {
>>         int times = 0;
>>         if (args.length == 0) times = 1;
>>         else times = Integer.parseInt(args[0]);
>>         ArrayList<Long> measurements = new ArrayList<Long>();
>>
>>         for (int i = 0; i < times; i++) {
>>             long start = System.currentTimeMillis();
>>             run();
>>             long end = System.currentTimeMillis();
>>
>>             long time = (end - start);
>>             System.out.println(i + ") Running time: " + time + " ms");
>>             measurements.add(time);
>>         }
>>
>>         System.out.println(">>>");
>>         System.out.println(">>> All running times: " + measurements);
>>         System.out.println(">>>");
>>     }
>>
>>     public void run() {
>>         int sz = size / par;
>>         ArrayList<Thread> threads = new ArrayList<Thread>();
>>
>>         for (int i = 0; i < par; i++) {
>>             threads.add(new Worker(sz));
>>             threads.get(i).start();
>>         }
>>         for (int i = 0; i < par; i++) {
>>             try {
>>                 threads.get(i).join();
>>             } catch (Exception e) {}
>>         }
>>     }
>>
>> }
>>
>>
>>
>> _______________________________________________
>> Concurrency-interest mailing list
>> Concurrency-interest at cs.oswego.edu
>> http://cs.oswego.edu/mailman/listinfo/concurrency-interest
>>
>
> _______________________________________________
> Concurrency-interest mailing list
> Concurrency-interest at cs.oswego.edu
> http://cs.oswego.edu/mailman/listinfo/concurrency-interest
>