[concurrency-interest] ParallelArray Extension

Tim Peierls tim at peierls.net
Wed Dec 19 15:09:33 EST 2007


On Dec 19, 2007 12:31 PM, Kasper Nielsen <kasper at kav.dk> wrote:

> >     Tim Peierls wrote:
> >      > These sound like applications for map-reduce:
> >      > 1. w.withMapping(elementsToSingletonSets).reduce(combineSets);
> >      > 2. w.withMapping(elementsToAggregates).reduce(combineAggregates);
> >      > ...seem essentially equivalent to your proposed combineReduce.
> >
> >     From a functional standpoint this works fine, but it doesn't really
> >     perform. If I create a new object in withMapping. I would need to
> create
> >     a new HashSet/Aggregate for each element in ParallelArray.
> >
> > Aren't you effectively doing the same thing in your combineReduce
> examples?
>
> No, only one for each thread:
> T1:
> Aggregate a=combine(1, null); <- Create new Aggregate
>           a=combine(2, a); <-returns a
>           a=combine(3, a); <-returns a
> ....
> T2:
> Aggregate a=combine(8, null); <- Create new Aggregate
>           a=combine(9, a); <-returns a
>           a=combine(10, a); <-returns a
> ....
>

ParallelArray.combine works through a subclass of RecursiveAction, so that
there would be as many calls to combine(x, null) as there are leaves in the
recursive decomposition, more or less -- probably less since Doug does some
fancy things to improve on this. So I think the number of Aggregate
constructions in combineReduce would be only a constant factor (related to
parallelism level) less than in map-reduce. I don't think it would be
directly related to the number of threads available to the ForkJoinPool.

Aggregates, as you present them, are lightweight objects. It's quite
possible that the cost of creating one for each element is dominated by the
parallelism benefits when you have a lot of processors.

With the Sets, it isn't so simple. Instead of Set, I'd want to use a purely
functional data structure designed for efficient merging, such as found in
Chris Okasaki's book, *Purely Functional Data Structures*. But presumably
Doug can find a way to do this even more efficiently internally.

--tim
-------------- next part --------------
An HTML attachment was scrubbed...
URL: /pipermail/attachments/20071219/caa73d5b/attachment.html 


More information about the Concurrency-interest mailing list