[concurrency-interest] Reducing contention overhead by suspending worker threads
jseigh_cp00 at xemaps.com
Mon Mar 19 20:23:46 EDT 2007
Peter Kovacs wrote:
>I've got a bounded blocking queue "outputQueue". A number of worker
>threads are feeding this queue and one single thread is consuming its
>elements. I've got two implementations of this setup with slightly
>different details and a performance difference of about 25-30%. The
>faster implementation is pretty messy and consequently difficult to
>compare with the clean(er) but slower implementation.
>The worker threads get sometimes blocked because the consumer cannot
>keep up with the pace of the result production and the outputQueue
>gets full. The only palpable difference between the slow and the fast
>implementation I can find is that at this point (outputQueue gets
>full) the worker threads of the faster implementation both go into
>wait, while one of the workers of the slower implementation goes into
>wait while the other gets blocked by an outer intrinsic lock
>(protecting the input source for the workers).
>Now I speculate as follows: the slower guy is slower because the
>threads blocked on the intrinsic lock must be rescheduled anyway at
>the OS level: the only waiting thread must first be able to put its
>data on the output queue before the other can have a chance to go
>into wait.
>Now if I try to balance the number of worker threads so that the
>bounded queue does not possibly get full, then I can significantly
>reduce the reschedule overhead.
>Does this all make sense? Or am I completely naive with this idea?
>May this kind of speculation hold any substance at all (reschedule
>overhead causing performance loss)?
Try a Semaphore with a ConcurrentLinkedQueue. In an artificially high
contention testcase, it ran about 1.5x faster than a blocking queue with
an unfair semaphore, and about 2x slower with a fair semaphore, so you
might want to avoid the latter.

I have a fast pathed semaphore which, with a ConcurrentLinkedQueue, will
run about 3x faster than a blocking queue for both the unfair and fair
cases, the former being slightly faster.
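For what it's worth, here is a minimal sketch of the kind of thing I
mean: a pair of Semaphores bounding capacity and occupancy, with a
ConcurrentLinkedQueue doing the actual lock-free FIFO storage. The class
and method names are mine, not from any library; treat it as an
illustration, not a drop-in replacement for ArrayBlockingQueue.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Semaphore;

// Sketch only: bounded queue built from two Semaphores plus a
// lock-free ConcurrentLinkedQueue. Names are illustrative.
class SemaphoreBoundedQueue<E> {
    private final Queue<E> queue = new ConcurrentLinkedQueue<>();
    private final Semaphore slots; // permits = remaining capacity
    private final Semaphore items; // permits = elements available

    SemaphoreBoundedQueue(int capacity, boolean fair) {
        slots = new Semaphore(capacity, fair); // fair=true is the slow case
        items = new Semaphore(0, fair);
    }

    void put(E e) throws InterruptedException {
        slots.acquire();  // block while the queue is full
        queue.add(e);     // lock-free enqueue
        items.release();  // wake one waiting consumer, if any
    }

    E take() throws InterruptedException {
        items.acquire();  // block while the queue is empty
        E e = queue.poll(); // non-null: an item permit was held
        slots.release();  // free a slot for a blocked producer
        return e;
    }
}
```

The point of the split is that producers and consumers only contend on
the semaphores' permit counters, not on a single lock guarding the
queue's storage, which is where the blocking-queue overhead tends to
come from.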