[ Osvaldo Pinali Doederlein ] [concurrency-interest] need to use collection.sort w. CopyOnWriteArrayList but cant.....

Osvaldo Pinali Doederlein osvaldo@visionnaire.com.br
Thu, 03 Jun 2004 16:33:15 -0300

Joseph Bowbeer wrote:
> Osvaldo,
> Wouldn't your suggested fix, to use List indexing instead of List iteration,
> hurt performance when List is a LinkedList or one of its kin?

You are correct -- I forgot this additional reason to use ListIterator!
Then, the correct fix is only more complex, e.g., have special cases to
put sorted elements back in the list.  The best solution I can think is
something like this:

if (list instanceof RandomAccess) {
   // Must work, even though RandomAccess doesn't guarantee that
   // the list is modifiable, because sort() requires this
   loop to put elements with set();
} else try {
   loop to put elements with ListIterator;
catch (UnsupportedOperationException) {
   loop to put elements with clear() & add()

This provides maximum performance for the concrete classes that have
efficient set(index,element), those marked by RandomAccess (includes
CopyOnWriteArrayList).  Unfortunately there isn't a similar marker or
specification requiring modifiable iterators, so we must rely on an
exception for other cases.  Attempt first the current method with
iterators; this will work, but just in case it fails, try once again
with clear()/add().  The last method is not fail-safe either because
sort() doesn't require that the list supports these also-optional
operations, but I guess we won't find many lists that are mutable,
but neither are RandomAccess, nor support any of ListIterator.set()
and List.clear()/add().  Well, at least not in the Java core, and I
don't think such aberration is likely in custom lists.  Even if there
is some valid scenario, at least we'll make a best-effort so sort()
works correctly and efficiently for as much lists as possible.

If it's not possible to cover all cases, we should update the javadoc
of sort() to make clear that the list must either be RandomAccess,
or provide mutable iterators, or support add/clear().  A better idea
(maybe) is documenting List to require that no implementation can
be so horribly broken: mutable, but without mutable iterators, not
RandomAccess, and not resizable.  In either case, methods like sort()
wouldn't fail for any list that complies with the specs.

(And in my suggestion for enhanced for, just check for RandomAccess
instead of, or in addition to, checking for List).


> ----- Original Message ----- 
> From: "Osvaldo Pinali Doederlein" <osvaldo@visionnaire.com.br>
> To: <gregg.wonderly@pobox.com>
> Cc: "Joshua Bloch" <joshua.bloch@sun.com>; "Guy Katz" <gkatz@allot.com>;
> <concurrency-interest@altair.cs.oswego.edu>
> Sent: Thursday, June 03, 2004 9:15 AM
> Subject: Re: [ Osvaldo Pinali Doederlein ] [concurrency-interest] need to
> use collection.sort w. CopyOnWriteArrayList but cant.....
> Gregg G. Wonderly wrote:
>>>If you replicate the sort method as a static of the
>>>concrete class, and the programmer static-imports these too, and their
>>>more specific signature guarantees precedence, then "sort(myList)" would
>>>invoke CopyOnWriteArrayList.sort(CopyOnWriteArrayList).  But it's a huge
>>>hack, horrible design, and abuse of static imports :-(
>>I would suggest that perhaps the easist thing for people who need the
>>inheritance hierachy, or some other class relationship is to create
> wrapper
>>classes that make the language of programming fit the discriptive language
> of
>>the problem domain.  Then, you can change what you call, or work out
> different
>>details for your special case, and everyone can benefit from having a
> simpliar
>>base API that doesn't try to support a particular problem domain to the
>>exclusion of others.
> Unfortunately this won't work, and even if it did, it's a workaround
> for a bug.  The current implementation of Collections.sort() breaks
> the semantics of the collections API: sort(List) should work with any
> implementation of List (I'm not requiring a performant behavior, only
> correct behavior). The problem is that the contracts involved in this
> problem are multiple and have subtle dependencies:
> * List must provide a ListIterator; that's OK.
> * ListIterator is free to implement set() or not; that's OK too.
> * Collections.sort(List) promises to sort any List, with the single
>    (and obvious) condition that "specified list must be modifiable".
>    But that's not specific enough, it doesn't state "modifiable through
>    iterators", and such constraint wouldn't make any sense.
> The conclusion is that Collections.sort() is broken, and nobody should
> need workarounds for this.  In fact, it was always broken, only now
> the problem is evidenced by the inclusion in the J2SE of a List that
> doesn't support ListIterator.set().  And it's broken for any app; this
> is not specific to particular problem domains, and the fix has nothing
> to do with support for requisites (functional or non-) of any domain.
> The same bug would surface in previous releases with user-provided
> collections not supporting ListIterator.set().
> Your suggestion of subclassing CopyOnWriteArrayList would only work
> (cleanly, w/o new APIs like a member sort()) if you could override
> listIterator() to return a new iterator that implement set().  But the
> iterator provided by CopyOnWriteArrayList doesn't support this method
> for good reasons, look at the javadocs: the "snapshot"-style iterators
> are key in the high-performance design of the class (and performance
> is the single reason for the existence of this class, otherwise we're
> better off with a synchronizedList() of ArrayList). It's not like the
> COWIterator doesn't implement those methods due to sadism of Doug :-)
> Suggestion -- that differently from my previous ramblings about the
> whole design of Collections, where I unfortunately jumped to criticize
> Java's design before fully understanding the problem, should be viable
> for 1.5.0:  Change the implementation of Collections.sort(List) to not
> depend on any optional methods (or on the ListIterator at all), so the
> algorithms will work with any collection.  There aren't good reasons
> to rely on Iterators (instead of indexes) to simply iterate over Lists,
> except reuse of code that may serve also to other kinds of collections,
> but this is not even the case here.  The offending code is:
> public static <T extends Comparable<? super T>> void sort(List<T>list){
>    Object[] a = list.toArray();
>    Arrays.sort(a);
>    ListIterator<T> i = list.listIterator();
>    for (int j=0; j<a.length; j++) {
>      i.next();
>      i.set((T)a[j]);
>    }
> }
> Suggested fix:
> public static <T extends Comparable<? super T>> void sort(List<T>list){
>    Object[] a = list.toArray();
>    Arrays.sort(a);
>    for (int j=0; j<a.length; j++)
>      list.set(j, (T)a[j]);
> }
> Similar fix applies to sort(List<T> list, Comparator<? super T> c),
> and we may need to review other methods in the entire Java core for
> similar problems.  Searching the sources for ListIterator reveals all
> code that may contain this "bug pattern", and it's a very small number
> of sources (53 .java files, including many less-suspect files like
> several List and Iterator implementations, and com.sun.* packages).
> And we only need to review carefully the codes that use optional
> methods -- many need ListIterator only for backwards iteration.
> So I hope it's possible to make sure that no other instances of this
> bug will be left in Java.  BTW, this fix should be applied to older
> editions of J2SE too, I can post this in the Bug Parade.
> The new code is also faster because list indexing is more efficient than
> iterators: no object allocation and less polymorphic calls.  And as a
> new bonus, not worrying if the List supports updates through iterators.
> This reveals a new issue: enhanced for, at least in 1.5.0-beta2,
> always uses an Iterator (returned by iterator()) to walk through any
> collection, even when the collection is a List.  This is not a bug
> because the generated code doesn't use any optional methods, but it's
> a significant performance issue. A syntax-sugar feature like this
> should avoid any unnecessary overhead over "traditional" code.
> I suggest that javac examines the type of the collection in enhanced
> for and, if it is a List, generate code that uses indexing instead.
> (Even using the ListIterator, instead of the generic Iterator, would
> deliver some gain; early binding to narrower types tends to reduce
> the cost of polymorphism, or at least the cost of JIT optimizations).
> I hope this helps,
> Osvaldo

Osvaldo Pinali Doederlein                   Visionnaire Informática S/A
osvaldo@visionnaire.com.br                http://www.visionnaire.com.br
Arquiteto de Tecnologia                          +55 (41) 373-7400 #228