[concurrency-interest] Thoughts about LongAdder

Doug Lea dl at cs.oswego.edu
Thu Aug 18 16:41:17 EDT 2011


Sorry for the delay on this!

On 08/10/11 12:37, Nathan Reynolds wrote:
> Here's a list of ideas of where
> LongAdder and ConcurrentCounter differ.
>
>    1. I would recommend striping the counter *not* by thread but by
>       processor/core. This will result in almost 0 contention and significantly
>       reduce cache coherence traffic.

This would be a good idea if we had a ProcessorLocal class.
Which is probably a good idea. Here's a minimal sketch
Add
   class java.lang.Processor {
      int getId();
      int getMaxId();
      static int proximity(Processor a, Processor b); // metrics TBD
      // non-public ProcessorLocal table
   }
Add to class java.lang.Thread:
       Processor getProcessor();
Add
    class java.lang.ProcessorLocal {
      // mostly like ThreadLocal
   }

However, in the mean time, the internal randomization done
inside LongAdder gives performance that seems to closely approximate
what you might get by doing this.


>    2. LongAdder pads each Cell to avoid false sharing. Good idea. However, the
>       amount of memory required can be quite costly in terms of capacity,
>       wasting cache space and bandwidth. ConcurrentCounter (attached) assumes
>       the JVM allocator is NUMA aware.

I agree that padding is a bit hacky, but I don't know how a JVM
can know to do this without either the assertion of or evidence
of a variable being contended. The best suggestion I've heard on
this is to add an annotation @Contended.


>       I am interested in results showing padding vs no padding...
>       assuming all of the JVM requirements are met.

I see more than a factor of 6 slowdown on Intel i7 and AMD 6172 test machines.

>    3. reset() should replace the array of Cell instead of setting the values to
>       0. This will increase GC and allocator load but reset() can then be
>       atomic.

Thanks for the suggestion, but I don't know of use cases for which
reset(), but not sum(), need to be atomic -- the only use cases I
know are to reset when it is otherwise known that threads are quiescent.

Doing this would also disable the nice property of current LongAdder
that it allocates no extra space until contended but instead uses
"base" field.

>    5. ConcurrentCounter supports a set() API. It basically does the reset()
>       trick I mentioned above but the new set of "Cells" is already assigned the
>       given value.

Ditto.


>    7. retryUpdate() is extremely complicated code (due to thread striping and
>       balancing).


We do the complicated stuff so you don't have to  :-)

-Doug


More information about the Concurrency-interest mailing list