[concurrency-interest] Implicit parallelism

Michael Barker mikeb01 at gmail.com
Tue Aug 13 16:55:13 EDT 2013

> What java *should* do is perform loop-vectorization like GCC or LLVM do.
> I.e. unroll loops and parallelize - say - 4 iterations with SSE/AVX
> instructions where they're data-independent.

AFAICT, the JVM already does this, for simple loops anyway.  E.g.

for (int i = 0; i < a.length && i < b.length && i < c.length; i++) {
   c[i] = a[i] + b[i];
}

If you print the assembly you will see the loop unrolled and the
appropriate VMOVcc, VADDcc instructions.  I compared the above code to
some hand-vectorised code using AVX, compiled with gcc.  Against the
gcc -O3 build, Java was slower by only around 5%.  Against the build
without the optimisation switch, the Java code was twice as fast.
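
For anyone who wants to look at the generated code themselves, here is
a minimal sketch of a harness around that loop.  The class name, the
int element type, the array size and the iteration count are my own
assumptions for illustration, not details of the test described above.
The repeated calls are only there so the method gets hot enough for
the optimising JIT to compile it.

```java
import java.util.Arrays;

// Hypothetical harness; names and sizes are illustrative assumptions.
final class VectorAdd {

    // Same loop shape as above: the three length checks in the loop
    // condition guarantee that every array index is in bounds.
    static void add(int[] a, int[] b, int[] c) {
        for (int i = 0; i < a.length && i < b.length && i < c.length; i++) {
            c[i] = a[i] + b[i];
        }
    }

    public static void main(String[] args) {
        int n = 4096; // arbitrary size chosen for illustration
        int[] a = new int[n];
        int[] b = new int[n];
        int[] c = new int[n];
        Arrays.fill(a, 1);
        Arrays.fill(b, 2);

        // Call the method many times so the JIT treats it as hot.
        for (int iter = 0; iter < 100_000; iter++) {
            add(a, b, c);
        }

        System.out.println(c[n - 1]); // prints 3
    }
}
```

To dump the assembly on HotSpot, run it with
-XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly.  This needs the
hsdis disassembler plugin to be available to the JVM.  Whether vector
instructions actually show up will depend on the JVM version and the
CPU it runs on.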

