[concurrency-interest] ParallelArray Extension

Doug Lea dl at cs.oswego.edu
Wed Dec 19 10:10:30 EST 2007

Kasper: Thanks for trying things out! Reports like this are
extremely helpful! We need to know what people are doing with
these APIs and what problems they encounter.

Tim Peierls wrote:
> These sound like applications for map-reduce:
> 1. w.withMapping(elementsToSingletonSets).reduce(combineSets);
> 2. w.withMapping(elementsToAggregates).reduce(combineAggregates);

In which case, the question amounts to whether it is worth
special support to uniquify elements. Probably so, because
in this case we can internally exploit some parallelism
that is otherwise hard for people to do themselves.
This suggests adding:

  ParallelArray uniqueElements();    // kill duplicates
  ParallelArray uniqueConsecutiveElements(); //  assumes array already sorted
  ParallelArray uniqueElements(ParallelArray a); // merge non-dups
  ParallelArray uniqueConsecutiveElements(ParallelArray a); // assume both sorted

(And possibly some mutatuve (versus constructive) forms.)

The broader issue is that ParallelArray is list/array-like,
but with some map, set, and ordering support. Eventually we also want
full map/set-like, as well as sorted-map/set-like classes. These are
more challenging to implement in ways that give you as good
speed-ups as array/list style: You lose some of the advantages
of data contiguity and fast (implicit) split/merge. Also, map/set
style invites a more relational-database-like API. This is
not TOO different than ParallelArray though. Notice that
pa.withFilter(...).withMapping(...) is basically similar to
SQL select/project. But my guess based on what the PLINQ (C#)
folks are doing is that some people will want further extensions,
lead to all sorts of syntax/tool/etc issues that I'd rather avoid.

Anyway, ParallelMap and friends will probably eventually make it in.
But in the mean time, it is likely that using ParallelArray to
emulate set/map semantics will work reasonably well, especially
if we add a few methods to simplify emulation.


More information about the Concurrency-interest mailing list