[concurrency-interest] Matrix multiply with parallelized inner product

Joe Bowbeer joe.bowbeer at gmail.com
Mon Feb 4 13:16:06 EST 2008

On Feb 4, 2008 9:25 AM, Hanson Char wrote:
> > "The only way I could see this approach being practical is when the number
> > of processors greatly exceeds the number of columns in the result."
> >
> > I'd use RecursiveTask/Action instead if tempted to use nested PA calls.
> If the number of processors greatly exceeds the number of columns,
> would nested PA calls be significantly faster than
> RecursiveTask/Action (in this case of matrix multiplication)?
> It would be nice to have a further code snippet on the wiki
> illustrating the combination of the (non-nested) PA with
> RecursiveTask/Action for the inner product, if doing so would lead to
> a solution that performs reasonably well regardless of the number
> of processors.  Or maybe the "forkJoinMatrixMultiply" method is
> already optimal in general?

I suspect that nesting PA calls would defeat PA's heuristics for
partitioning the computation.

Don't the chunking mechanisms in PA currently assume there is no
nesting?  To make nesting efficient, I think you'd need to design it
in at the top level - but of course I say that about everything...
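
As a rough illustration of designing the decomposition in at the top
level rather than nesting, here is a minimal sketch. It assumes the
ForkJoinPool and RecursiveAction classes as they later appeared in
java.util.concurrent, not the PA API discussed above. The names
MatMulSketch, CellRange and THRESHOLD are hypothetical, and the cutoff
value is an untuned guess. A single recursive task splits the flattened
range of result cells, so the available parallelism does not depend on
the number of columns, and each inner product stays sequential.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

final class MatMulSketch {
    // Hypothetical cutoff: ranges of at most this many result cells
    // are computed sequentially. Not tuned; an assumption for illustration.
    static final int THRESHOLD = 64;

    // One task type covers the whole computation: it owns a half-open
    // range [lo, hi) of result cells, numbered row by row.
    static final class CellRange extends RecursiveAction {
        final double[][] a, b, c;
        final int lo, hi;

        CellRange(double[][] a, double[][] b, double[][] c, int lo, int hi) {
            this.a = a; this.b = b; this.c = c;
            this.lo = lo; this.hi = hi;
        }

        @Override
        protected void compute() {
            if (hi - lo <= THRESHOLD) {
                int cols = b[0].length;
                for (int cell = lo; cell < hi; cell++) {
                    int i = cell / cols;   // result row
                    int j = cell % cols;   // result column
                    double sum = 0.0;
                    // Sequential inner product of row i of a and column j of b.
                    for (int k = 0; k < b.length; k++) {
                        sum += a[i][k] * b[k][j];
                    }
                    c[i][j] = sum;
                }
            } else {
                // Split the range in half and run both halves as subtasks.
                int mid = (lo + hi) >>> 1;
                invokeAll(new CellRange(a, b, c, lo, mid),
                          new CellRange(a, b, c, mid, hi));
            }
        }
    }

    // Assumes a is rows x n and b is n x cols, with n >= 1 and cols >= 1.
    static double[][] multiply(double[][] a, double[][] b) {
        int rows = a.length, cols = b[0].length;
        double[][] c = new double[rows][cols];
        ForkJoinPool.commonPool().invoke(new CellRange(a, b, c, 0, rows * cols));
        return c;
    }
}
```

Calling MatMulSketch.multiply(a, b) returns a new result matrix. Each
task writes a disjoint set of cells, so no locking is needed. Whether
this beats a PA-based version would have to be measured.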

