[concurrency-interest] Blocking vs. non-blocking

Doug Lea dl at cs.oswego.edu
Fri Jun 13 19:57:34 EDT 2014


On 06/13/2014 07:35 PM, Dennis Sosnoski wrote:
> I'm writing an article where I'm discussing both blocking waits and non-blocking
> callbacks for handling events. As I see it, there are two main reasons for
> preferring non-blocking:
>
> 1. Threads are expensive resources (limited to on the order of 10000 per JVM),
> and tying one up just waiting for an event completion is a waste of this resource
> 2. Thread switching adds substantial overhead to the application
>
> Are there any other good reasons I'm missing?

Also memory locality (core X cache effects).

>
> On the thread switching issue, I tried a simple timing test where I create some
> number of threads and have them take turns incrementing a value, each one
> passing control off to the next after an increment. For a total of 4096*100
> increments here's what I got on my 4-core AMD Linux desktop running Java 7:
>
> Took 44 ms. with 1 threads
> Took 3805 ms. with 2 threads
> Took 6172 ms. with 4 threads
> Took 6185 ms. with 8 threads
> Took 6437 ms. with 16 threads
> Took 6831 ms. with 32 threads
> Took 6756 ms. with 64 threads
> Took 6511 ms. with 128 threads
> Took 6975 ms. with 256 threads
> Took 7264 ms. with 512 threads
> Took 7185 ms. with 1024 threads
> Took 6826 ms. with 2048 threads
> Took 7639 ms. with 4096 threads
>
> So a big drop in performance going from one thread to two, and again from 2 to
> 4, but after that just a slowly increasing trend. That's about 19 microseconds
> per switch with 4096 threads, about half that time for just 2 threads. Do these
> results make sense to others?
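
For readers who want to reproduce this kind of measurement, here is a
minimal sketch of a round-robin handoff test like the one described in
the quoted text. It is not the original test code, which was not posted.
The class name HandoffBenchmark, the method run, and the choice of one
Semaphore per thread as the handoff mechanism are all assumptions made
for illustration; a different blocking primitive would give different
timings.

```java
import java.util.concurrent.Semaphore;

// Hypothetical reconstruction of the handoff test; names are illustrative.
class HandoffBenchmark {
    // Only the thread that currently holds the "turn" touches the counter.
    // The semaphore release/acquire pair orders the accesses.
    static long counter;

    /**
     * Performs totalIncrements increments, taken in turn by nThreads threads,
     * and returns the elapsed time in nanoseconds.
     */
    static long run(int nThreads, int totalIncrements) {
        counter = 0;
        final Semaphore[] turn = new Semaphore[nThreads];
        for (int i = 0; i < nThreads; i++) {
            turn[i] = new Semaphore(0); // no permits: every thread starts blocked
        }
        Thread[] threads = new Thread[nThreads];
        for (int i = 0; i < nThreads; i++) {
            final int me = i;
            final int next = (i + 1) % nThreads;
            // Thread i performs increments i, i + n, i + 2n, ...
            final int share = totalIncrements / nThreads
                    + (i < totalIncrements % nThreads ? 1 : 0);
            threads[i] = new Thread(() -> {
                try {
                    for (int k = 0; k < share; k++) {
                        turn[me].acquire();   // block until it is our turn
                        counter++;
                        turn[next].release(); // pass control to the next thread
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        for (Thread t : threads) {
            t.start();
        }
        long start = System.nanoTime();
        turn[0].release(); // give the first thread its turn
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return System.nanoTime() - start;
    }
}
```

Calling HandoffBenchmark.run(n, 4096 * 100) for n = 1, 2, 4, ... 4096 and
dividing the result by 1,000,000 gives figures in the same form as the
table above. Absolute numbers will depend on the hardware, the OS
scheduler, and the JVM, and a single un-warmed run like this is only a
rough indication.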

Your best case of approximately 20 thousand clock cycles is not an
unexpected result on a single-socket multicore with all cores turned
on (i.e., no power management, fusing, or clock-step effects)
and only a few bouncing cachelines.
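
To make the arithmetic behind these figures explicit, here is a small
sketch that converts the quoted timings into time per handoff and an
approximate cycle count. The class and method names are hypothetical,
and the 2.5 GHz clock rate in the usage note is an assumption: the
original message does not state the clock speed of the test machine.

```java
// Hypothetical helper for converting the quoted timings; names are illustrative.
class SwitchCost {
    /** Average microseconds per handoff, given total elapsed ms and handoff count. */
    static double microsPerHandoff(long elapsedMs, long handoffs) {
        return elapsedMs * 1000.0 / handoffs;
    }

    /** Approximate clock cycles for a duration in microseconds at an assumed clock rate. */
    static double cycles(double micros, double assumedClockGHz) {
        // 1 microsecond = 1000 ns, and a core at g GHz runs g cycles per ns.
        return micros * 1000.0 * assumedClockGHz;
    }
}
```

With 4096 * 100 = 409600 increments, microsPerHandoff(7639, 409600) is
about 18.65 microseconds for 4096 threads, and
microsPerHandoff(3805, 409600) is about 9.29 microseconds for 2 threads.
At an assumed 2.5 GHz, 9.29 microseconds is roughly 23 thousand cycles,
which is in line with the "approximately 20 thousand" estimate above.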

We've seen cases of over 1 million cycles to unblock a thread
in some other cases. (Which can be challenging for us to deal
with in JDK8 Stream.parallel(). I'll post something on this sometime.)
Maybe Aleksey can someday arrange to collect believable
systematic measurements across a few platforms.

-Doug



