[concurrency-interest] RFR [8011215] optimization of CopyOnWriteArrayList.addIfAbsent()

Martin Buchholz martinrb at google.com
Fri Apr 12 16:24:37 EDT 2013


I persuaded Doug to accept my suggestions for remove(Object) and
addIfAbsent:

--- src/main/java/util/concurrent/CopyOnWriteArrayList.java 12 Apr 2013
18:44:19 -0000 1.103
+++ src/main/java/util/concurrent/CopyOnWriteArrayList.java 12 Apr 2013
20:20:10 -0000
@@ -498,35 +498,44 @@
      * @return {@code true} if this list contained the specified element
      */
     public boolean remove(Object o) {
+        Object[] snapshot = getArray();
+        int index = indexOf(o, snapshot, 0, snapshot.length);
+        return (index < 0) ? false : remove(o, snapshot, index);
+    }
+
+    /**
+     * A version of remove(Object) using the strong hint that given
+     * recent snapshot contains o at the given index.
+     */
+    private boolean remove(Object o, Object[] snapshot, int index) {
         final ReentrantLock lock = this.lock;
         lock.lock();
         try {
-            Object[] elements = getArray();
-            int len = elements.length;
-            if (len != 0) {
-                // Copy while searching for element to remove
-                // This wins in the normal case of element being present
-                int newlen = len - 1;
-                Object[] newElements = new Object[newlen];
-
-                for (int i = 0; i < newlen; ++i) {
-                    if (eq(o, elements[i])) {
-                        // found one;  copy remaining and exit
-                        for (int k = i + 1; k < len; ++k)
-                            newElements[k-1] = elements[k];
-                        setArray(newElements);
-                        return true;
-                    } else
-                        newElements[i] = elements[i];
-                }
-
-                // special handling for last cell
-                if (eq(o, elements[newlen])) {
-                    setArray(newElements);
-                    return true;
+            Object[] current = getArray();
+            int len = current.length;
+            if (snapshot != current) findIndex: {
+                int prefix = Math.min(index, len);
+                for (int i = 0; i < prefix; i++) {
+                    if (current[i] != snapshot[i] && eq(o, current[i])) {
+                        index = i;
+                        break findIndex;
+                    }
                 }
+                if (index >= len)
+                    return false;
+                if (current[index] == o)
+                    break findIndex;
+                index = indexOf(o, current, index, len);
+                if (index < 0)
+                    return false;
             }
-            return false;
+            Object[] newElements = new Object[len - 1];
+            System.arraycopy(current, 0, newElements, 0, index);
+            System.arraycopy(current, index + 1,
+                             newElements, index,
+                             len - index - 1);
+            setArray(newElements);
+            return true;
         } finally {
             lock.unlock();
         }
@@ -576,16 +585,31 @@
      * @return {@code true} if the element was added
      */
     public boolean addIfAbsent(E e) {
+        Object[] snapshot = getArray();
+        return indexOf(e, snapshot, 0, snapshot.length) >= 0 ? false :
+            addIfAbsent(e, snapshot);
+    }
+
+    /**
+     * A version of addIfAbsent using the strong hint that given
+     * recent snapshot does not contain e.
+     */
+    private boolean addIfAbsent(E e, Object[] snapshot) {
         final ReentrantLock lock = this.lock;
         lock.lock();
         try {
-            Object[] elements = getArray();
-            int len = elements.length;
-            for (int i = 0; i < len; ++i) {
-                if (eq(e, elements[i]))
-                    return false;
+            Object[] current = getArray();
+            int len = current.length;
+            if (snapshot != current) {
+                // Optimize for lost race to another addXXX operation
+                int common = Math.min(snapshot.length, len);
+                for (int i = 0; i < common; i++)
+                    if (current[i] != snapshot[i] && eq(e, current[i]))
+                        return false;
+                if (indexOf(e, current, common, len) >= 0)
+                        return false;
             }
-            Object[] newElements = Arrays.copyOf(elements, len + 1);
+            Object[] newElements = Arrays.copyOf(current, len + 1);
             newElements[len] = e;
             setArray(newElements);
             return true;



On Mon, Apr 8, 2013 at 10:45 AM, Martin Buchholz <martinrb at google.com>wrote:

> It would be nice if someone somewhere could confirm that addIfAbsent was
> frequently called with element present on their codebase.
>
> But even without any such guidance, I think it makes sense to optimize for
> operations not needing to update, for both addIfAbsent and remove(Object).
>  Operations that do are read only and don't acquire the lock have infinite
> scalability, while the penalty for having everything go wrong while having
> to do a second traversal is typically small, and at worst a factor of two.
>
> Another way to look at it is that COW data structures should be used in a
> read-mostly fashion.  So optimize assuming that operations like addIfAbsent
> that may or may not be read-only, are in fact read-only.
>
>
>
> On Fri, Apr 5, 2013 at 11:31 PM, Stanimir Simeonoff <stanimir at riflexo.com>wrote:
>
>> I was thinking exactly the same as Martin Buhholz however I could not
>> find any single use in the our code where it'd be profitable to check
>> outside the lock optimistically. addIfAbsent is fairly used in the form of
>> COWSet. i.e. adds at the end, random removals and some COWSets may contain
>> a few thousands entries. However it's virtually guaranteed addIfAbsent to
>> return true, while remove(E) may be racily called and return false, i.e. an
>> optimistic approach to remove() might be beneficial.
>>
>> On Sat, Apr 6, 2013 at 2:35 AM, Vitaly Davidovich <vitalyd at gmail.com>wrote:
>>
>>> Personally, I don't see a reason to further optimize for cache hits -
>>> are we really saying that's the common usage of COWAL? I'd find that hard
>>> to believe.  At some point, it's probably better to simply externalize this
>>> policy via constructor hint or subclass or whatever rather than catering to
>>> both sides in shared code.
>>>
>>> Having c-tor hint would require to propagate it would COWSet too. IMO,
>> addIfAbsent is mostly used by COWSet not COWList itself, as it doesn't
>> require the concrete type to be declared rather than the interface.
>> Also that will not affect the existing code bases and very few would
>> actually know the proper use case or the use cases would remain the same.
>> If the element is likely to already exists and the users are aware of the
>> current implementation it's likely to be guarded by contains(), such cases
>> require refactoring to take benefit and the code would require new java
>> version - unlikely to happen for large projects.
>> If there is higher contention and smaller array-sizes busy waiting w/
>> some backoff might be even better.
>>
>> Stanimir
>>
>>>  Sent from my phone
>>> On Apr 5, 2013 5:28 PM, "Martin Buchholz" <martinrb at google.com> wrote:
>>>
>>>> I'm still advocating an optimistic approach, that does not hold the lock
>>>> during the first traversal of the array snapshot.  This is much faster
>>>> when
>>>> cache hits are the norm (or equals methods are expensive), which I would
>>>> hope would be the common case, and only slightly slower when all adds
>>>> are
>>>> cache misses.
>>>>
>>>>      public boolean addIfAbsent(E e) {
>>>>         Object[] snapshot = getArray();
>>>>         int len = snapshot.length;
>>>>         for (int i = 0; i < len; i++)
>>>>             if (eq(e, snapshot[i]))
>>>>                 return false;
>>>>         return addIfAbsent(e, snapshot);
>>>>     }
>>>>
>>>>     private boolean addIfAbsent(E e, Object[] snapshot) {
>>>>         final ReentrantLock lock = this.lock;
>>>>         lock.lock();
>>>>         try {
>>>>             Object[] current = getArray();
>>>>             int len = current.length;
>>>>             if (snapshot != current) {
>>>>                 // Optimize for contending with another addIfAbsent
>>>>                 int common = Math.min(snapshot.length, len);
>>>>                 for (int i = 0; i < common; i++)
>>>>                     if (current[i] != snapshot[i] && eq(e, current[i]))
>>>>                         return false;
>>>>                 for (int i = common; i < len; i++)
>>>>                     if (eq(e, current[i]))
>>>>                         return false;
>>>>             }
>>>>             Object[] newElements = Arrays.copyOf(current, len + 1);
>>>>             newElements[len] = e;
>>>>             setArray(newElements);
>>>>             return true;
>>>>         } finally {
>>>>             lock.unlock();
>>>>         }
>>>>     }
>>>>
>>>>
>>>>
>>>> On Fri, Apr 5, 2013 at 5:27 AM, Doug Lea <dl at cs.oswego.edu> wrote:
>>>>
>>>> > On 04/03/13 06:35, Doug Lea wrote:
>>>> >
>>>> >> This was designed to perform best in the case of possibly contended
>>>> >> updates when the element is absent, by avoiding retraversal, and
>>>> >> thus minimizing lock hold times in the common case. (When not common,
>>>> >> it can be guarded by a contains check.) However even in this case,
>>>> >> it is possible that a retraversal via arraycopy could be faster
>>>> >> because it can use optimized cheaper writes (fewer card marks).
>>>> >>
>>>> >
>>>> > Yes, by a little.
>>>> > A simple but reliable performance test is now at
>>>> >
>>>> http://gee.cs.oswego.edu/cgi-**bin/viewcvs.cgi/jsr166/src/**test/loops/**
>>>> > COWALAddIfAbsentLoops.java?**view=log<
>>>> http://gee.cs.oswego.edu/cgi-bin/viewcvs.cgi/jsr166/src/test/loops/COWALAddIfAbsentLoops.java?view=log
>>>> >
>>>>
>>>> >
>>>> > The simplest change allowing this (below) also appears to
>>>> > be among the fastest. Running across various machines and
>>>> > settings (GC, client/server), it seems to be between 5% and 15%
>>>> > faster. This is a smaller difference than in Ivan's tests,
>>>> > that didn't include lock and contention effects.
>>>> >
>>>> > I committed jsr166 version. We'll need to sync this up with
>>>> > with openjdk tl someday, but might as well wait until
>>>> > other updates for Spliterators/streams are ready to integrate.
>>>> >
>>>> > -Doug
>>>> >
>>>> > *** CopyOnWriteArrayList.java.~1.**100.~  Tue Mar 12 19:59:08 2013
>>>>
>>>> > --- CopyOnWriteArrayList.java   Fri Apr  5 08:03:29 2013
>>>> > ***************
>>>> > *** 579,595 ****
>>>> >           final ReentrantLock lock = this.lock;
>>>> >           lock.lock();
>>>> >           try {
>>>> > -             // Copy while checking if already present.
>>>> > -             // This wins in the most common case where it is not
>>>> present
>>>> >
>>>> >               Object[] elements = getArray();
>>>> >               int len = elements.length;
>>>> > -             Object[] newElements = new Object[len + 1];
>>>> >
>>>> >               for (int i = 0; i < len; ++i) {
>>>> >                   if (eq(e, elements[i]))
>>>> > !                     return false; // exit, throwing away copy
>>>> > !                 else
>>>> > !                     newElements[i] = elements[i];
>>>> >
>>>> >               }
>>>> >               newElements[len] = e;
>>>> >               setArray(newElements);
>>>> >               return true;
>>>> > --- 579,591 ----
>>>> >           final ReentrantLock lock = this.lock;
>>>> >           lock.lock();
>>>> >           try {
>>>> >
>>>> >               Object[] elements = getArray();
>>>> >               int len = elements.length;
>>>> >               for (int i = 0; i < len; ++i) {
>>>> >                   if (eq(e, elements[i]))
>>>> > !                     return false;
>>>> >               }
>>>> > +             Object[] newElements = Arrays.copyOf(elements, len + 1);
>>>> >
>>>> >               newElements[len] = e;
>>>> >               setArray(newElements);
>>>> >               return true;
>>>> >
>>>> >
>>>>
>>>
>>> _______________________________________________
>>> Concurrency-interest mailing list
>>> Concurrency-interest at cs.oswego.edu
>>> http://cs.oswego.edu/mailman/listinfo/concurrency-interest
>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://cs.oswego.edu/pipermail/concurrency-interest/attachments/20130412/41c66d6c/attachment-0001.html>


More information about the Concurrency-interest mailing list