[concurrency-interest] Array allocation and access on the JVM

Vitaly Davidovich vitalyd at gmail.com
Fri Jan 27 12:01:48 EST 2012


I thought UseCondCardMark just adds a branch to test whether the card is
already dirty instead of writing unconditionally - are you saying the
branch would cause noticeable perf degradation?
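
A Java-level sketch of that difference, for reference; HotSpot's real barrier is
JIT-generated code, and the 512-byte card size and dirty-byte value here are
assumptions, not the exact implementation:

// Conceptual card-marking barriers, not HotSpot's actual code.
final class CardMarkSketch {
    static final int CARD_SHIFT = 9;        // 512-byte cards (assumed default)
    static final byte DIRTY = 0;            // assumed "dirty" marker value
    static final byte[] CARD_TABLE = new byte[1 << 20];   // stand-in card table

    // Default barrier: store unconditionally on every reference write.
    // Repeated stores to the same card keep pulling that card-table cache
    // line into exclusive state, bouncing it between cores.
    static void unconditionalMark(long fieldAddress) {
        CARD_TABLE[(int) (fieldAddress >>> CARD_SHIFT)] = DIRTY;
    }

    // -XX:+UseCondCardMark: load and test first, store only if the card is
    // not already dirty, trading an extra load and branch for far less
    // coherence traffic on already-dirty cards.
    static void conditionalMark(long fieldAddress) {
        int card = (int) (fieldAddress >>> CARD_SHIFT);
        if (CARD_TABLE[card] != DIRTY) {
            CARD_TABLE[card] = DIRTY;
        }
    }
}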

Vitaly

Sent from my phone
On Jan 27, 2012 11:38 AM, "Nathan Reynolds" <nathan.reynolds at oracle.com>
wrote:

>  Defaulting -XX:+UseCondCardMark would require a lot of testing to create
> a heuristic to guess when it should default on or off.  Not all workloads
> hit contention on the card table.  Making the card mark conditional for
> these workloads would slow them down.  If the heuristic were sub-optimal, then
> we would be in a worse situation than we are today.
>
> In order for the JVM to detect contention, it would need to profile each
> access or have a hardware counter to declare contention.  Timing each
> access and tracking statistics would be expensive.  Hardware counters exist
> but they are too expensive to access because of kernel round trips.  So,
> the current solutions are more expensive than the actual problem.
>
> We have asked hardware vendors to supply user-mode access to hardware
> counters for true and false sharing.  The same hardware counters can be
> used for this problem.  If we get the user-mode hardware counters, then
> this might be automated.
>
> Another difficulty would be in making this flag dynamically changeable *
> without* incurring a performance penalty.
>
> Nathan Reynolds <http://psr.us.oracle.com/wiki/index.php/User:Nathan_Reynolds> | Consulting Member of Technical Staff |
> 602.333.9091
> Oracle PSR Engineering <http://psr.us.oracle.com/> | Server Technology
>
> On 1/26/2012 10:36 PM, Hanson Char wrote:
>
> Is there a plan to have -XX:+UseCondCardMark included as the default config
> in Java 7?  Or can a VM auto-detect the need for such a configuration and
> self-configure upon startup?
>
>  Thanks,
> Hanson
>
> On Wed, Jan 25, 2012 at 3:55 PM, Doug Lea <dl at cs.oswego.edu> wrote:
>
>> On 01/25/12 14:48, Aleksandar Prokopec wrote:
>>
>>> Hello,
>>>
>>> I've recently posted the following question about memory contention on
>>> StackOverflow:
>>>
>>
>>
>>  I asked Aleksandar offlist to check the effects of -XX:+UseCondCardMark
>> which, as always these days, was the main problem. Really, everyone
>> should always use this switch on hotspot on multiprocessors.
>> (or -XX:+UseG1GC, although it is often a little worse on throughput
>> but better-behaved for large heaps).
>>
>> As I mentioned in a post from a few weeks ago, see Dave Dice's blog
>> http://blogs.oracle.com/dave/entry/false_sharing_induced_by_card
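>>
>> Rough arithmetic behind that false sharing, as a sketch; the 512-byte card
>> size and 64-byte cache line below are assumptions about typical defaults,
>> not guarantees:
>>
>> // How much heap one cache line of the card table covers.
>> public class CardLineSpan {
>>     public static void main(String[] args) {
>>         int cardSize = 512;      // heap bytes tracked per card-table byte (assumed default)
>>         int cacheLine = 64;      // bytes per cache line (typical x86, assumed)
>>         int heapPerCardLine = cardSize * cacheLine;   // = 32768 bytes = 32 KB
>>         System.out.println("One card-table cache line covers " + heapPerCardLine + " bytes of heap");
>>         // So threads writing references into objects that lie anywhere within the
>>         // same 32 KB of heap dirty card bytes on the same cache line, even though
>>         // the objects themselves sit on different data cache lines.
>>     }
>> }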
>>
>>
>> -Doug
>>
>>
>>
>>
>>
>>
>>>
>>> http://stackoverflow.com/questions/8942396/array-allocation-and-access-on-the-java-virtual-machine-and-memory-contention
>>>
>>> Basically, I have several threads defined like this:
>>>
>>> final class Worker extends Thread {
>>>     Foo[] array = new Foo[1024];
>>>     int sz;
>>>
>>>     public Worker(int _sz) {
>>>         sz = _sz;
>>>     }
>>>
>>>     public void run() {
>>>         //Foo[] arr = new Foo[1024];
>>>         Foo[] arr = array;
>>>         loop(arr);
>>>     }
>>>
>>>     public void loop(Foo[] arr) {
>>>         int i = 0;
>>>         int pos = 512;
>>>         Foo v = new Foo();
>>>         while (i < sz) {
>>>             if (i % 2 == 0) {
>>>                 arr[pos] = v;
>>>                 pos += 1;
>>>             } else {
>>>                 pos -= 1;
>>>                 v = arr[pos];
>>>             }
>>>             i++;
>>>         }
>>>     }
>>> }
>>>
>>> The program takes 2 parameters - `size` and `par`. `size` is the size of the
>>> workload, and `par` is the number of threads to distribute the workload to.
>>> Each thread repetitively writes and reads array location 512, in total
>>> `size` / `par` times. Each thread uses its own array (the locations being
>>> written to should be at least 1024*4 bytes away from each other).
>>>
>>> I've measured this on a 2x 4-core Intel Xeon server and observed the
>>> following running times for `par` = 1, 2, 4 and 8 threads (7 repetitions each):
>>>
>>>  >>> All running times: [2149, 2227, 1974, 1948, 1803, 2283, 1878]
>>>  >>> All running times: [1140, 1124, 2022, 1141, 2028, 2004, 2136]
>>>  >>> All running times: [867, 1022, 1457, 1342, 1436, 966, 1531]
>>>  >>> All running times: [915, 864, 1245, 1243, 948, 790, 1007]
>>>
>>> Apparently, this is due to memory contention.
>>> What I've observed is that if I comment the line `Foo[] arr = array` and
>>> uncomment the line above it to allocate the array within the thread, I
>>> get:
>>>
>>>  >>> All running times: [2053, 1966, 2089, 1937, 2046, 1909, 2011]
>>>  >>> All running times: [1048, 1178, 1100, 1194, 1367, 1271, 1207]
>>>  >>> All running times: [578, 508, 589, 571, 617, 643, 645]
>>>  >>> All running times: [330, 299, 300, 322, 331, 324, 575]
>>>
>>> My guess is that in the second case the arrays get allocated in thread
>>> local
>>> allocation buffers, so the array positions are much further from each
>>> other.
>>> Still, even in the first case processors should be writing to separate
>>> cache
>>> lines, so why does the first example scale so badly?
>>>
>>> Can anyone explain these numbers? Why does this memory contention happen?
>>>
>>> Thank you,
>>> Aleksandar Prokopec
>>>
>>>
>>>
>>>
>>> ------------------------------------------------------------------------------
>>>
>>> // The complete runnable program:
>>>
>>>
>>> import java.util.ArrayList;
>>>
>>> class MultiStackJavaExperiment {
>>>
>>>     final class Foo {
>>>         int x = 0;
>>>     }
>>>
>>>     final class Worker extends Thread {
>>>         Foo[] array = new Foo[1024];
>>>         int sz;
>>>
>>>         public Worker(int _sz) {
>>>             sz = _sz;
>>>         }
>>>
>>>         public void run() {
>>>             Foo[] arr = new Foo[1024];
>>>             //Foo[] arr = array;
>>>             loop(arr);
>>>         }
>>>
>>>         public void loop(Foo[] arr) {
>>>             int i = 0;
>>>             int pos = 512;
>>>             Foo v = new Foo();
>>>             while (i < sz) {
>>>                 if (i % 2 == 0) {
>>>                     arr[pos] = v;
>>>                     pos += 1;
>>>                 } else {
>>>                     pos -= 1;
>>>                     v = arr[pos];
>>>                 }
>>>                 i++;
>>>             }
>>>         }
>>>     }
>>>
>>>     public static void main(String[] args) {
>>>         (new MultiStackJavaExperiment()).mainMethod(args);
>>>     }
>>>
>>>     int size = Integer.parseInt(System.getProperty("size"));
>>>     int par = Integer.parseInt(System.getProperty("par"));
>>>
>>>     public void mainMethod(String[] args) {
>>>         int times = 0;
>>>         if (args.length == 0) times = 1;
>>>         else times = Integer.parseInt(args[0]);
>>>         ArrayList<Long> measurements = new ArrayList<Long>();
>>>
>>>         for (int i = 0; i < times; i++) {
>>>             long start = System.currentTimeMillis();
>>>             run();
>>>             long end = System.currentTimeMillis();
>>>
>>>             long time = (end - start);
>>>             System.out.println(i + ") Running time: " + time + " ms");
>>>             measurements.add(time);
>>>         }
>>>
>>>         System.out.println(">>>");
>>>         System.out.println(">>> All running times: " + measurements);
>>>         System.out.println(">>>");
>>>     }
>>>
>>>     public void run() {
>>>         int sz = size / par;
>>>         ArrayList<Thread> threads = new ArrayList<Thread>();
>>>
>>>         for (int i = 0; i < par; i++) {
>>>             threads.add(new Worker(sz));
>>>             threads.get(i).start();
>>>         }
>>>         for (int i = 0; i < par; i++) {
>>>             try {
>>>                 threads.get(i).join();
>>>             } catch (Exception e) {}
>>>         }
>>>     }
>>>
>>> }
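>>>
>>> A hypothetical variant of run() (not part of the measured program, and not
>>> something tried in the original post): it allocates ~64 KB of padding between
>>> Worker constructions so each worker's `array` lands far from the others in the
>>> heap, assuming roughly bump-pointer allocation in the constructing thread. The
>>> 64 KB figure is an assumption, chosen only to exceed the ~32 KB of heap that
>>> one card-table cache line covers under default sizes. Calling this from
>>> mainMethod() in place of run() would test whether heap adjacency of the
>>> arrays is what hurts scaling.
>>>
>>>     // Hypothetical diagnostic: space out the per-thread arrays.
>>>     public void runPadded() {
>>>         int sz = size / par;
>>>         ArrayList<Thread> threads = new ArrayList<Thread>();
>>>         ArrayList<byte[]> keepAlive = new ArrayList<byte[]>();  // stored so the JIT can't drop the padding
>>>
>>>         for (int i = 0; i < par; i++) {
>>>             threads.add(new Worker(sz));            // Worker's array is allocated here
>>>             keepAlive.add(new byte[64 * 1024]);     // assumed gap before the next Worker's array
>>>             threads.get(i).start();
>>>         }
>>>         for (int i = 0; i < par; i++) {
>>>             try {
>>>                 threads.get(i).join();
>>>             } catch (Exception e) {}
>>>         }
>>>     }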
>>>
>>>
>>>