[concurrency-interest] Unexpected Scalability results in Java Fork-Join (Java 8)

Kirk Pepperdine kirk at kodewerk.com
Tue Jul 28 05:04:15 EDT 2015


You might find that HotSpot does a better job of optimizing things in 1.8. I'd be interested in running your benchmark and comparing it to mine.

— Kirk

On Jul 28, 2015, at 9:40 AM, steadria <Steven.Adriaensen at vub.ac.be> wrote:

> Dear all,
> 
> Recently, I was running some scalability experiments using Java Fork-Join. Here, I used the non-default ForkJoinPool constructor `ForkJoinPool(int parallelism)`, passing the desired parallelism level (# workers = P) as an argument.
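> 
> In outline, the code does the following (a simplified sketch; the class name, task count, and dummy workload here are illustrative, the actual attached code differs in detail):
> 
>     import java.util.ArrayList;
>     import java.util.List;
>     import java.util.concurrent.ForkJoinPool;
>     import java.util.concurrent.ForkJoinTask;
>     import java.util.concurrent.RecursiveAction;
> 
>     public class MinimalExampleSketch {
>         static final int N_TASKS = 100_000;  // illustrative task count
>         static volatile long sink;           // keeps the dummy work from being optimized away
> 
>         static void dummyWork() {
>             long s = 0;
>             for (int i = 0; i < 10_000; i++) s += (long) i * i;
>             sink = s;
>         }
> 
>         public static void main(String[] args) {
>             int p = Integer.parseInt(args[0]);       // desired parallelism level P
>             ForkJoinPool pool = new ForkJoinPool(p);
>             pool.invoke(new RecursiveAction() {
>                 @Override protected void compute() {
>                     // the "parallel loop": fork N dummy tasks...
>                     List<ForkJoinTask<Void>> tasks = new ArrayList<>(N_TASKS);
>                     for (int i = 0; i < N_TASKS; i++) {
>                         tasks.add(new RecursiveAction() {
>                             @Override protected void compute() { dummyWork(); }
>                         }.fork());
>                     }
>                     // ...then join them in fork order (oldest first)
>                     for (ForkJoinTask<Void> t : tasks) {
>                         t.join();
>                     }
>                 }
>             });
>             pool.shutdown();
>         }
>     }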
> 
> Specifically, using the piece of code attached, I got these results on a processor with 4 physical and 8 logical cores (using Java 8: jre1.8.0_45):
> 
> T1: 11730
> T2: 2381 (speedup: 4.93)
> T4: 2463 (speedup: 4.76)
> T8: 2418 (speedup: 4.85)
> 
> When using Java 7 (jre1.7.0) instead, I get:
> 
> T1: 11938
> T2: 11843 (speedup: 1.01)
> T4: 5133 (speedup: 2.33)
> T8: 2607 (speedup: 4.58)
> 
> (where TP is the execution time in ms, using parallelism level P)
> 
> While both results surprise me, I can understand the latter: the join causes one worker (the one executing the loop) to block, as it fails to recognize that, while waiting, it could process other pending dummy tasks from its local queue. The former, however, has me puzzled.
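> 
> As an aside: one way to avoid that blocked worker would be to fork the dummy tasks fire-and-forget style and then call ForkJoinTask.helpQuiesce(), which lets the calling worker execute pending tasks until the pool is quiescent. A sketch, replacing the loops in the outline above (not what the benchmark measures, though):
> 
>     for (int i = 0; i < N_TASKS; i++) {
>         new RecursiveAction() {
>             @Override protected void compute() { dummyWork(); }
>         }.fork();                    // fire and forget, no per-task join
>     }
>     ForkJoinTask.helpQuiesce();      // help run pending tasks instead of blocking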
> 
> Running further experiments on a 64-core SMP machine (jdk1.8.0_45), using the JMH benchmarking tool (1 fork, 50 measurement iterations + 50 warmup iterations), I got the results below:
> 
> T1: 23.831
> 
>  23.831 ±(99.9%) 0.116 s/op [Average]
>  (min, avg, max) = (23.449, 23.831, 24.522), stdev = 0.234
>  CI (99.9%): [23.715, 23.947] (assumes normal distribution)
> 
> 
> T2: 2.927 (speedup: 8.14)
> 
>  2.927 ±(99.9%) 0.091 s/op [Average]
>  (min, avg, max) = (2.655, 2.927, 3.405), stdev = 0.184
>  CI (99.9%): [2.836, 3.018] (assumes normal distribution)
> 
> T64: 1.550 (speedup: 15.37)
> 
>  1.550 ±(99.9%) 0.027 s/op [Average]
>  (min, avg, max) = (1.460, 1.550, 1.786), stdev = 0.054
>  CI (99.9%): [1.523, 1.577] (assumes normal distribution)
> 
> 
> My current theory:
> 
> I guess one explanation would be that in Java 8 the worker executing the parallel loop does not go idle, but instead finds other work to perform. Furthermore, I suspect there might be a 'bug' in this mechanism which causes more workers to be active (i.e., consuming resources) than the desired parallelism level (P) passed as a constructor argument, which would explain the super-linear speedup observed.
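> 
> One way to test this theory (a hypothetical probe, not part of the benchmark) would be to sample the pool's thread counts from a separate thread while the parallel loop runs, e.g. by adding this method to the benchmark class:
> 
>     // requires: import java.util.concurrent.*;
>     static void startProbe(ForkJoinPool pool) {
>         ScheduledExecutorService probe = Executors.newSingleThreadScheduledExecutor();
>         probe.scheduleAtFixedRate(() -> System.out.printf(
>                 "poolSize=%d active=%d running=%d%n",
>                 pool.getPoolSize(),            // total workers created so far
>                 pool.getActiveThreadCount(),   // workers executing or stealing tasks
>                 pool.getRunningThreadCount()), // workers not blocked waiting for a join
>             0, 500, TimeUnit.MILLISECONDS);
>     }
> 
> If getPoolSize() reports more than P workers, the pool is likely compensating for blocked joiners by spawning extra threads.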
> 
> I was wondering whether any of you has a better or alternative explanation? Clearly, the use of the Java FJ framework in the attached code is not 100% kosher, but to my knowledge it doesn't violate any of the framework's preconditions either. Note that the scalability results are as expected when the dummy tasks are joined in reverse order (see the sketch below).
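> 
> By "reverse order" I mean joining the most recently forked task first, so that each join targets the task at the top of the joining worker's own deque, which the worker can then pop and execute itself. With the tasks list from the sketch above:
> 
>     for (int i = tasks.size() - 1; i >= 0; i--) {
>         tasks.get(i).join();   // newest first: found at the top of the local deque
>     }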
> 
> I really appreciate any help you can provide,
> 
> Steven Adriaensen
> PhD Student
> Vrije Universiteit Brussel
> Brussels, Belgium
> 
> <MinimalExample.java>
