[concurrency-interest] j.u.c/backport performance on Windows

Peter Kovacs peter.kovacs.1.0rc at gmail.com
Mon Apr 2 07:57:48 EDT 2007

Hi Szabolcs,

I am grateful for your time spent reviewing my code.

The code in InputOrderedWorkUnitProcessor comes with two variants of
putting the WorkUnitData on the outputQueue *for testing purposes*
(they do the same thing in a slightly different way):
getNextInputImmediate uses LinkedBlockingQueue.put while
getNextInputWaitForOuputQueue use LinkedBlockingQueue.offer. (I
"manually" change the "workerWaitsForRoomInOutputQueue" variable and
recompile depending on which variant I am going to test with.)

My original backport-based implementation used the "put" method, but I
observed a performance improvement of about 25% with the "offer"
method on Linux. (I am trying to emulate with the "offer" method the
behaviour of an "old" implementation which uses exlusively primitive
language constructs and which "waits" when the output queue is full
[the outputQueue in that old implementation is our own
implementation]. This "old" implementation performs significantly
better on Linux and much better on Windows.) (I think I also tried the
version with the "put" method yesterday and had the same disappointing
result on Windows.)

The purpose of the InputOrderedWorkUnitProcessor class is to produce
results in the order the corresponding input items (one input item per
result) are found in the input source ("inputProducer"). The intended
purpose of the inputProducerLock is to make sure that retrieval of the
next input item (from "inputProducer" by worker threads) and putting
the corresponding ScheduledWorkDataUnit instance on the "outputQueue"
be atomic. Otherwise the result order maybe different from that of the
input order, like in the following scenario:
(1) Worker "A" takes an input item for result "Y";
(2) worker "B" takes an input item for result "Z";
(3) worker "B" puts the ScheduledWorkDataUnit instance for result "Z"
in the outputQueue;
(4) worker "A" puts the ScheduledWorkDataUnit instance for result "Y"
in the outputQueue.
Result "Z" would then be processed by the consumer before "Y", and the
order of the two results would be swapped.

Do I miss here an alternative mechanism for making sure that order is
kept in the output? (I might try devising some kind of timestamp- or
index-based mechanism, whereby the timestamp/index would be evaluated
by the consumer for keeping the right processing order, but my gut
feeling is that moving the burden of order-keeping from the producer
to the consumer would cause more problems than it would solve. Or
maybe some kind of priority queueing?, if there are any.)

>From this perspective, I am not sure how I could maintain the required
semantics whithout the the equivalent of an inputProducerLock. Please,
let me know, what you think.

Thanks a lot

On 4/2/07, Szabolcs Ferenczi <szabolcs.ferenczi at gmail.com> wrote:
> On 01/04/07, Peter Kovacs <peter.kovacs.1.0rc at gmail.com> wrote:
> > In more concrete terms: you can find the problematic backport-based
> > implementation here:
> > http://www.chemaxon.com/shared/pkovacs/concurrent/processors/InputOrderedWorkUnitProcessor.java
> > . "outputQueue" is the variable where I am using LinkedBlockingQueue.
> > And yes, my case falls into the multiple-producer-one-consumer
> > category.
> Now I think I can see your problem. You are using a proper bounded
> buffer implementation (LinkedBlockingQueue) but you short-cut or
> re-implement the synchronization of it. Try to refactor your program
> so that:
> - use the synchronized put method of the LinkedBlockingQueue instead of
>   the polling offer method
> - drop the extra handshake via the lock inputProducerLock
> - make the producer create an item first and then call the put method with
>   the prepared item
> I believe this helps with the performance difference as well.
> Best Regards,
> Szabolcs

More information about the Concurrency-interest mailing list