[concurrency-interest] RFR [8011215] optimization of CopyOnWriteArrayList.addIfAbsent()

Martin Buchholz martinrb at google.com
Mon Apr 8 13:45:01 EDT 2013


It would be nice if someone somewhere could confirm that addIfAbsent was
frequently called with element present on their codebase.

But even without any such guidance, I think it makes sense to optimize for
operations not needing to update, for both addIfAbsent and remove(Object).
 Operations that do are read only and don't acquire the lock have infinite
scalability, while the penalty for having everything go wrong while having
to do a second traversal is typically small, and at worst a factor of two.

Another way to look at it is that COW data structures should be used in a
read-mostly fashion.  So optimize assuming that operations like addIfAbsent
that may or may not be read-only, are in fact read-only.



On Fri, Apr 5, 2013 at 11:31 PM, Stanimir Simeonoff <stanimir at riflexo.com>wrote:

> I was thinking exactly the same as Martin Buhholz however I could not find
> any single use in the our code where it'd be profitable to check outside
> the lock optimistically. addIfAbsent is fairly used in the form of COWSet.
> i.e. adds at the end, random removals and some COWSets may contain a few
> thousands entries. However it's virtually guaranteed addIfAbsent to return
> true, while remove(E) may be racily called and return false, i.e. an
> optimistic approach to remove() might be beneficial.
>
> On Sat, Apr 6, 2013 at 2:35 AM, Vitaly Davidovich <vitalyd at gmail.com>wrote:
>
>> Personally, I don't see a reason to further optimize for cache hits - are
>> we really saying that's the common usage of COWAL? I'd find that hard to
>> believe.  At some point, it's probably better to simply externalize this
>> policy via constructor hint or subclass or whatever rather than catering to
>> both sides in shared code.
>>
>> Having c-tor hint would require to propagate it would COWSet too. IMO,
> addIfAbsent is mostly used by COWSet not COWList itself, as it doesn't
> require the concrete type to be declared rather than the interface.
> Also that will not affect the existing code bases and very few would
> actually know the proper use case or the use cases would remain the same.
> If the element is likely to already exists and the users are aware of the
> current implementation it's likely to be guarded by contains(), such cases
> require refactoring to take benefit and the code would require new java
> version - unlikely to happen for large projects.
> If there is higher contention and smaller array-sizes busy waiting w/ some
> backoff might be even better.
>
> Stanimir
>
>> Sent from my phone
>> On Apr 5, 2013 5:28 PM, "Martin Buchholz" <martinrb at google.com> wrote:
>>
>>> I'm still advocating an optimistic approach, that does not hold the lock
>>> during the first traversal of the array snapshot.  This is much faster
>>> when
>>> cache hits are the norm (or equals methods are expensive), which I would
>>> hope would be the common case, and only slightly slower when all adds are
>>> cache misses.
>>>
>>>      public boolean addIfAbsent(E e) {
>>>         Object[] snapshot = getArray();
>>>         int len = snapshot.length;
>>>         for (int i = 0; i < len; i++)
>>>             if (eq(e, snapshot[i]))
>>>                 return false;
>>>         return addIfAbsent(e, snapshot);
>>>     }
>>>
>>>     private boolean addIfAbsent(E e, Object[] snapshot) {
>>>         final ReentrantLock lock = this.lock;
>>>         lock.lock();
>>>         try {
>>>             Object[] current = getArray();
>>>             int len = current.length;
>>>             if (snapshot != current) {
>>>                 // Optimize for contending with another addIfAbsent
>>>                 int common = Math.min(snapshot.length, len);
>>>                 for (int i = 0; i < common; i++)
>>>                     if (current[i] != snapshot[i] && eq(e, current[i]))
>>>                         return false;
>>>                 for (int i = common; i < len; i++)
>>>                     if (eq(e, current[i]))
>>>                         return false;
>>>             }
>>>             Object[] newElements = Arrays.copyOf(current, len + 1);
>>>             newElements[len] = e;
>>>             setArray(newElements);
>>>             return true;
>>>         } finally {
>>>             lock.unlock();
>>>         }
>>>     }
>>>
>>>
>>>
>>> On Fri, Apr 5, 2013 at 5:27 AM, Doug Lea <dl at cs.oswego.edu> wrote:
>>>
>>> > On 04/03/13 06:35, Doug Lea wrote:
>>> >
>>> >> This was designed to perform best in the case of possibly contended
>>> >> updates when the element is absent, by avoiding retraversal, and
>>> >> thus minimizing lock hold times in the common case. (When not common,
>>> >> it can be guarded by a contains check.) However even in this case,
>>> >> it is possible that a retraversal via arraycopy could be faster
>>> >> because it can use optimized cheaper writes (fewer card marks).
>>> >>
>>> >
>>> > Yes, by a little.
>>> > A simple but reliable performance test is now at
>>> >
>>> http://gee.cs.oswego.edu/cgi-**bin/viewcvs.cgi/jsr166/src/**test/loops/**
>>> > COWALAddIfAbsentLoops.java?**view=log<
>>> http://gee.cs.oswego.edu/cgi-bin/viewcvs.cgi/jsr166/src/test/loops/COWALAddIfAbsentLoops.java?view=log
>>> >
>>>
>>> >
>>> > The simplest change allowing this (below) also appears to
>>> > be among the fastest. Running across various machines and
>>> > settings (GC, client/server), it seems to be between 5% and 15%
>>> > faster. This is a smaller difference than in Ivan's tests,
>>> > that didn't include lock and contention effects.
>>> >
>>> > I committed jsr166 version. We'll need to sync this up with
>>> > with openjdk tl someday, but might as well wait until
>>> > other updates for Spliterators/streams are ready to integrate.
>>> >
>>> > -Doug
>>> >
>>> > *** CopyOnWriteArrayList.java.~1.**100.~  Tue Mar 12 19:59:08 2013
>>>
>>> > --- CopyOnWriteArrayList.java   Fri Apr  5 08:03:29 2013
>>> > ***************
>>> > *** 579,595 ****
>>> >           final ReentrantLock lock = this.lock;
>>> >           lock.lock();
>>> >           try {
>>> > -             // Copy while checking if already present.
>>> > -             // This wins in the most common case where it is not
>>> present
>>> >
>>> >               Object[] elements = getArray();
>>> >               int len = elements.length;
>>> > -             Object[] newElements = new Object[len + 1];
>>> >
>>> >               for (int i = 0; i < len; ++i) {
>>> >                   if (eq(e, elements[i]))
>>> > !                     return false; // exit, throwing away copy
>>> > !                 else
>>> > !                     newElements[i] = elements[i];
>>> >
>>> >               }
>>> >               newElements[len] = e;
>>> >               setArray(newElements);
>>> >               return true;
>>> > --- 579,591 ----
>>> >           final ReentrantLock lock = this.lock;
>>> >           lock.lock();
>>> >           try {
>>> >
>>> >               Object[] elements = getArray();
>>> >               int len = elements.length;
>>> >               for (int i = 0; i < len; ++i) {
>>> >                   if (eq(e, elements[i]))
>>> > !                     return false;
>>> >               }
>>> > +             Object[] newElements = Arrays.copyOf(elements, len + 1);
>>> >
>>> >               newElements[len] = e;
>>> >               setArray(newElements);
>>> >               return true;
>>> >
>>> >
>>>
>>
>> _______________________________________________
>> Concurrency-interest mailing list
>> Concurrency-interest at cs.oswego.edu
>> http://cs.oswego.edu/mailman/listinfo/concurrency-interest
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://cs.oswego.edu/pipermail/concurrency-interest/attachments/20130408/83475121/attachment-0001.html>


More information about the Concurrency-interest mailing list