[concurrency-interest] forkjoin.ParallelArray and friends

Neal Gafter neal at gafter.com
Mon Aug 27 18:00:25 EDT 2007


The class hierarchy appears to be carefully designed to support specific use
cases, but once you step outside them things don't fare so well. I think
this was intentional, to force users into the patterns of code that are most
efficiently handled in the current implementation. But perhaps there is
something about the way the API is supposed to fit together that I just
don't understand.

For example, if I have a parallel array a, I can call a.withMapping(...)
to map it onto another sequence-like thing. But the result doesn't have a
withMapping operation, so it isn't possible to apply a mapping (or a
filter) after a mapping; you can't write
a.withMapping(...).withMapping(...). A ParallelArray.WithMapping appears
to implement the mapping lazily (on an as-needed basis). That's a good
thing, for a number of reasons. Once you've mapped, though, the result can
only be used to reduce. If you want to map again, you have to call
newArray(), which eagerly applies the mapping to every element. That seems
an arbitrary point to force the client to go from lazy to eager evaluation
of the elements. It isn't clear that it will result in effective
concurrency for clients that are forced down this route. I was thinking
maybe I could get the same results (at the cost of reorganizing the client
code) by applying a "composition" operation to two Mappers, but I didn't
find a composition operation in the framework.
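
To make the "composition" idea concrete, here is a minimal sketch of what
such an operation might look like. Everything in it is hypothetical: the
Mapper interface below is a stand-in defined only for this example (it is
not the framework's own type), and compose is an invented helper, not
something the framework provides.

```java
// Illustrative sketch only; none of these names come from jsr166y.
final class MapperComposition {
    private MapperComposition() {}

    // Hypothetical stand-in for a one-argument mapping function.
    interface Mapper<T, U> {
        U map(T t);
    }

    // Returns a mapper that applies `first` and then `second` to each
    // element, so two mapping steps collapse into a single lazy step.
    static <A, B, C> Mapper<A, C> compose(
            final Mapper<? super A, ? extends B> first,
            final Mapper<? super B, ? extends C> second) {
        return new Mapper<A, C>() {
            public C map(A a) {
                return second.map(first.map(a));
            }
        };
    }
}
```

Assuming withMapping accepted such a composed mapper, a client could pass
compose(f, g) to a single withMapping call instead of materializing an
intermediate array between the two mappings.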

I think the whole API should be interface-based.  For example, I think the
result of a.withMapping() should have the ParallelArray interface type or
some subtype.  Implement efficiently what you can today.  If the API is
interface-based, you can expand the set of use cases that are efficiently
implemented in the future without changing the API.
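
As a rough illustration of the interface-based shape suggested here, the
sketch below has every stage return the same interface, so mappings chain
freely and stay lazy until the caller asks for a copy. All names
(LazyViewSketch, ArrayView, Mapper, of, materialize) are invented for this
example, and the code is purely sequential; it says nothing about how the
framework would schedule the work in parallel.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; none of these names come from jsr166y.
final class LazyViewSketch {
    private LazyViewSketch() {}

    // Hypothetical stand-in for a one-argument mapping function.
    interface Mapper<T, U> {
        U map(T t);
    }

    // The single interface every stage returns, so stages can be chained.
    interface ArrayView<T> {
        int size();
        T get(int i); // computed on demand
        <U> ArrayView<U> withMapping(Mapper<? super T, ? extends U> m);
        List<T> materialize(); // eager copy, only when the caller asks
    }

    // Shared behavior: mapping wraps the source view instead of copying it.
    static abstract class AbstractView<T> implements ArrayView<T> {
        public <U> ArrayView<U> withMapping(
                final Mapper<? super T, ? extends U> m) {
            final ArrayView<T> source = this;
            return new AbstractView<U>() {
                public int size() { return source.size(); }
                public U get(int i) { return m.map(source.get(i)); }
            };
        }

        public List<T> materialize() {
            List<T> out = new ArrayList<T>(size());
            for (int i = 0; i < size(); i++) {
                out.add(get(i));
            }
            return out;
        }
    }

    // Wraps an existing list as the first stage of a chain.
    static <T> ArrayView<T> of(final List<T> elements) {
        return new AbstractView<T>() {
            public int size() { return elements.size(); }
            public T get(int i) { return elements.get(i); }
        };
    }
}
```

With this shape, of(list).withMapping(f).withMapping(g) type-checks, and an
implementation could later specialize particular chains without the client
code changing.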


On 8/27/07, Doug Lea <dl at cs.oswego.edu> wrote:
> As people who have been keeping track of the new fine-grained parallelism
> framework might have noticed, we are firming up the aggregate operations
> APIs. The main APIs surround class ParallelArray, which provides
> apply, map, reduce, select, transform etc operations. For javadocs, see:
> http://gee.cs.oswego.edu/dl/jsr166/dist/jsr166ydocs/jsr166y/forkjoin/ParallelArray.html
> (And for jars, sources etc, see the usual places linked at
> http://gee.cs.oswego.edu/dl/concurrency-interest/index.html)
> While I expect a few minor changes will occur, unlike some previous
> forms, these classes no longer say that they are standins for
> functionality that will be developed. We think the APIs and
> code are just about ready for routine (non-production) use,
> and so invite you to use them and report back any comments
> and suggestions. (Now is your big chance before things become
> too entrenched to change.)
> Here's something pasted from javadocs that gives a feeling
> for how you use ParallelArrays:
>    import static Ops.*;
>    class StudentStatistics {
>      ParallelArray<Student> students = ...
>      // ...
>      public double getMaxSeniorGpa() {
>        return students.withFilter(isSenior).withMapping(gpaField).max();
>      }
>      // helpers:
>      static final class IsSenior implements Predicate<Student> {
>        public boolean evaluate(Student s) { return s.credits > 90; }
>      }
>      static final IsSenior isSenior = new IsSenior();
>      static final class GpaField implements MappertoDouble<Student> {
>        public double map(Student s) { return s.gpa; }
>      }
>      static final GpaField gpaField = new GpaField();
>    }
> -Doug