[concurrency-interest] strange performance issue with ArrayBlockingQueue on Linux

Martin Buchholz martinrb at google.com
Wed Jul 19 23:57:30 EDT 2017


One obvious variable is the version of ABQ you are using.  It was
modified in jdk9, and you should be doing all performance work with the
latest jdk9 binaries, e.g. from http://jdk.java.net/9/

On Wed, Jul 19, 2017 at 8:41 PM, Matthews, Graham <gmatthew at callutheran.edu>
wrote:

> [apologies in advance if this is the wrong place to discuss this issue.
> also apologies for the long setup -- I promise this is about
> ArrayBlockingQueue, Java, and a performance issue with them]
>
>
> My student and I have been benchmarking various ways to run cross data
> source joins. The basic setup is as follows:
>
>
>              J2
>             /  \
>            /    \
>          R3      J1
>                 /  \
>                /    \
>              R2      R1
>
> R1, R2 and R3 are called receive operators, as they receive data from
> the databases. J1 and J2 are inner join operators: J1 joins values
> from R2 with values from R1, and J2 joins the result with values from
> R3.
>
> One way to run these queries is the classic producer/consumer
> approach: each operator is a Java thread, and there are bounded queues
> (ArrayBlockingQueues) between the operators. The above example
> therefore has 5 threads and 4 ArrayBlockingQueues, and I/O is
> interleaved with computation.
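>
> A minimal sketch of that wiring (hypothetical names and an arbitrary
> queue capacity; the real receive operators pull rows from a database
> rather than counting):
>
>     import java.util.concurrent.ArrayBlockingQueue;
>     import java.util.concurrent.BlockingQueue;
>
>     public class QueueWiring {
>         public static void main(String[] args) {
>             // One bounded queue per edge of the operator tree.
>             BlockingQueue<Integer> r1ToJ1 = new ArrayBlockingQueue<>(1024);
>             BlockingQueue<Integer> r2ToJ1 = new ArrayBlockingQueue<>(1024);
>             BlockingQueue<Integer> r3ToJ2 = new ArrayBlockingQueue<>(1024);
>             BlockingQueue<Integer> j1ToJ2 = new ArrayBlockingQueue<>(1024);
>
>             // One thread per operator.  put() blocks when the output
>             // queue is full and take() blocks when an input is empty,
>             // so the queues provide back-pressure between operators.
>             new Thread(() -> {
>                 try {
>                     for (int i = 0; i < 1000; i++) r1ToJ1.put(i);
>                 } catch (InterruptedException e) {
>                     Thread.currentThread().interrupt();
>                 }
>             }, "R1").start();
>             // R2, R3, J1 and J2 are wired the same way against their
>             // own queues (a Join loop is sketched further below).
>         }
>     }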
>
> Another way to run this is to first load R3 into J2 in parallel with
> loading R2 into J1 (the loads build hash maps in J1 and J2), and then
> have a thread pool with N workers (where N is the number of cores),
> where each worker takes a value from R1 and joins it "up the spine" of
> the tree, matching it against the hash map in J1 and then the one in
> J2. This approach (the phased approach) therefore has an I/O phase
> followed by a computation phase.
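>
> A rough sketch of that phased shape (made-up data stands in for the
> database loads; availableProcessors() supplies N):
>
>     import java.util.HashMap;
>     import java.util.Map;
>     import java.util.concurrent.ExecutorService;
>     import java.util.concurrent.Executors;
>     import java.util.concurrent.TimeUnit;
>     import java.util.concurrent.atomic.LongAdder;
>
>     public class Phased {
>         public static void main(String[] args) throws InterruptedException {
>             // I/O phase: build the hash maps held by J1 and J2 (filled
>             // with numbers here; really loaded from R2 and R3).
>             Map<Integer, Integer> j1Map = new HashMap<>();
>             Map<Integer, Integer> j2Map = new HashMap<>();
>             for (int i = 0; i < 1_000_000; i++) {
>                 j1Map.put(i, i);
>                 j2Map.put(i, i);
>             }
>
>             // Computation phase: N workers each take R1 values and
>             // probe "up the spine", J1 first and then J2.
>             int n = Runtime.getRuntime().availableProcessors();
>             ExecutorService pool = Executors.newFixedThreadPool(n);
>             LongAdder joined = new LongAdder();
>             for (int w = 0; w < n; w++) {
>                 final int start = w;
>                 pool.execute(() -> {
>                     for (int v = start; v < 1_000_000; v += n) { // a slice of R1
>                         Integer a = j1Map.get(v);                // probe J1
>                         if (a != null && j2Map.containsKey(a)) { // probe J2
>                             joined.increment();
>                         }
>                     }
>                 });
>             }
>             pool.shutdown();
>             pool.awaitTermination(10, TimeUnit.MINUTES);
>             System.out.println("joined " + joined.sum() + " rows");
>         }
>     }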
>
>
> On OS X El Capitan (10.11.6), the queueing approach and the phased
> approach perform roughly equally (there is more variation in the run
> times for the queueing approach, but performance is within 10% of each
> other). On Linux (Ubuntu 16.04 LTS), the queueing approach is almost
> three times slower than the phased approach!
>
> On top of that, the queueing approach runs over twice as fast on OS X
> as it does on Linux, despite the fact that I am running OS X on old
> hardware (mid-2012 MacBook Pro, 3rd Gen Intel Core i5-3210M) and Linux
> on relatively new hardware (5th Gen Intel Core i7-5500U).
>
> We have boiled the problem down to code that doesn't involve databases
> -- just threads and ArrayBlockingQueues. Attached are 5 Java files
> which "approximate" the queueing approach by having each receive
> thread simply output numbers (rather than get values from a database)
> to an ArrayBlockingQueue, and each Join thread simply read its two
> input queues, add the values together, and then output the sum to its
> output queue.
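>
> The Join threads in the attached files presumably amount to a loop
> like the following (a guess at their shape; -1 stands in for whatever
> end-of-stream convention the real code uses):
>
>     import java.util.concurrent.BlockingQueue;
>
>     // One Join thread: pull a value from each input queue, add them,
>     // and push the sum downstream.
>     class Join implements Runnable {
>         private final BlockingQueue<Integer> left, right, out;
>
>         Join(BlockingQueue<Integer> left, BlockingQueue<Integer> right,
>              BlockingQueue<Integer> out) {
>             this.left = left;
>             this.right = right;
>             this.out = out;
>         }
>
>         public void run() {
>             try {
>                 while (true) {
>                     int l = left.take();  // blocks while the producer is behind
>                     int r = right.take();
>                     if (l < 0 || r < 0) { // propagate end-of-stream and stop
>                         out.put(-1);
>                         return;
>                     }
>                     out.put(l + r);       // blocks while the consumer is behind
>                 }
>             } catch (InterruptedException e) {
>                 Thread.currentThread().interrupt();
>             }
>         }
>     }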
>
> On the same hardware as above, this code runs in 1 minute 30 seconds
> on OS X and 2 minutes 13 seconds on Linux!
>
> The code is very simple -- just a bunch of threads and a bunch of
> ArrayBlockingQueues, so I am at a loss to explain the sizeable
> performance differences noted above.
>
> Again apologies if this is not the right forum to discuss this issue.
>
> graham
>
> --
> Graham MATTHEWS, Ph.D.
> Assistant Professor
> Department of Computer Science |  California Lutheran University
> 60 W. Olsen Rd.  |  Thousand Oaks, CA 91360
> gmatthew at callutheran.edu  | Office location: D19
>
> _______________________________________________
> Concurrency-interest mailing list
> Concurrency-interest at cs.oswego.edu
> http://cs.oswego.edu/mailman/listinfo/concurrency-interest
>
>

