[concurrency-interest] Blocking vs. non-blocking
arcadiy at ivanov.biz
Wed Jun 25 21:35:21 EDT 2014
Based on what I've read in the benchmark code supplied, I have a
sneaking suspicion that the problem is the benchmark itself, since I'm
not sure it's measuring "thread switching performance". :)
I would *highly* recommend talking to Aleksey about using JMH for all
your benchmarking needs and would, personally, dramatically simplify the
benchmark by using exclusively LockSupport.park/unpark if your intention
is to measure thread parking/unparking time.
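A minimal sketch of a park/unpark-only measurement could look like the following. The class name ParkPingPong, the measure method and the turn flag are invented here for illustration and are not taken from the benchmark under discussion. The hand-rolled timing loop is only a stand-in: a real measurement would go through JMH for warm-up, forking and statistics.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.LockSupport;

// Hypothetical example: two threads hand control back and forth using
// nothing but LockSupport.park/unpark.
final class ParkPingPong {
    // Whose turn it is: 0 = calling thread, 1 = partner thread.
    private static final AtomicInteger turn = new AtomicInteger(0);

    /** Returns the average nanoseconds per round trip (two handoffs). */
    static double measure(final int roundTrips) {
        turn.set(0);
        final Thread caller = Thread.currentThread();
        Thread partner = new Thread(() -> {
            for (int i = 0; i < roundTrips; i++) {
                // park() may return spuriously, so always re-check the flag.
                while (turn.get() != 1) {
                    LockSupport.park();
                }
                turn.set(0);
                LockSupport.unpark(caller);
            }
        });
        partner.start();

        long start = System.nanoTime();
        for (int i = 0; i < roundTrips; i++) {
            turn.set(1);
            LockSupport.unpark(partner);
            while (turn.get() != 0) {
                LockSupport.park();
            }
        }
        long elapsed = System.nanoTime() - start;

        try {
            partner.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return (double) elapsed / roundTrips;
    }

    public static void main(String[] args) {
        measure(100_000); // crude warm-up only; JMH does this properly
        System.out.printf("avg round trip: %.0f ns%n", measure(100_000));
    }
}
```

Each round trip contains two handoffs, so the per-handoff figure is roughly half the reported value. Whether a handoff actually involves an OS-level thread switch depends on how quickly the other side reaches park(), so the number is best read as an upper-level indication rather than a pure scheduler cost.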
Be aware that System.nanoTime performs weirdly under contention
(http://shipilev.net/blog/2014/nanotrusting-nanotime/). (I would not
recommend measuring anything faster than your average human heartbeat
with System.currentTimeMillis(); its resolution is 1 ms at best.) I.e.
you may not be able to measure what you want to measure using a Java
benchmark to begin with, and may need lower-level native OS/CPU metrics
tools.
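Before trusting timer-based numbers, it can help to check how the timer itself behaves on the machine in question. The sketch below estimates the cost of one System.nanoTime() call and the smallest step it reports. The class and method names are invented for illustration, the loops are single-threaded (so they say nothing about behavior under contention), and the results are rough indications only.

```java
// Hypothetical helper for a quick sanity check of System.nanoTime().
final class NanoTimeCheck {
    /** Rough average cost of one System.nanoTime() call, in nanoseconds. */
    static double latencyNanos(int calls) {
        long sink = 0;
        long start = System.nanoTime();
        for (int i = 0; i < calls; i++) {
            sink += System.nanoTime();
        }
        long end = System.nanoTime();
        // Use the accumulated value so the loop is not optimized away.
        if (sink == 42) {
            System.out.println("unlikely");
        }
        return (double) (end - start) / calls;
    }

    /** Smallest observed difference between two distinct readings. */
    static long granularityNanos(int samples) {
        long min = Long.MAX_VALUE;
        for (int i = 0; i < samples; i++) {
            long t0 = System.nanoTime();
            long t1;
            // Spin until the reported value changes.
            do {
                t1 = System.nanoTime();
            } while (t1 == t0);
            min = Math.min(min, t1 - t0);
        }
        return min;
    }

    public static void main(String[] args) {
        latencyNanos(1_000_000); // crude warm-up
        System.out.printf("latency:     ~%.1f ns per call%n",
                latencyNanos(1_000_000));
        System.out.printf("granularity: ~%d ns%n", granularityNanos(10_000));
    }
}
```

If the event being measured is not much longer than the reported latency or granularity, the timer overhead will dominate the result.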
Also, I haven't looked at or, frankly, used synchronized in quite a
while, but I think that, assuming lock elision doesn't kick in, it ends
up as libc mutexes of some sort (pthread? libthread? win32 mutex?). I.e.
you may be measuring the lock performance of libc on the specific OSes,
not thread switching performance.
Hope this helps,
On 2014-06-25 21:05, Dennis Sosnoski wrote:
> On 06/14/2014 02:32 PM, Arcadiy Ivanov wrote:
>> If memory serves me right, Mr Shipilev mentioned in one of his
>> presentations in Oracle Spb DC re FJP optimization challenges (in
>> Russian, sorry, https://www.youtube.com/watch?v=t0dGLFtRR9c#t=3096)
>> that thread scheduling overhead of "sane OSes" (aka Linux) is approx
>> 50 us on average, while 'certain not-quite-sane OS named starting
>> with "W"' is much more than that.
>> Loaded Linux kernel can produce latencies in *tens of seconds*
>> page 13) without RT patches, and tens of us with RT ones. YMMV
>> dramatically depending on kernel, kernel version, scheduler,
>> architecture and load.
> I actually found that Windows 7 did much better at thread switching
> performance than my Linux system with same-era kernel when running on
> my laptop system (Windows 7 Home Premium, Linux 3.4.63, Toshiba
> Satellite P750D with AMD A8-3520M APU). You can see my timing results
> here: http://www.sosnoski.com/thread-linux-windows.png The data block
> size relates to a block of per-thread data run through on every thread
> switch to show caching effects. Threads are executed in strict
> rotation, each notifying the next to run. The actual code is at:
> So now I'm wondering if recent Windows versions actually have lower
> thread switching overhead in general, or if there are perhaps some
> OS-specific optimizations for the particular hardware (the Windows
> installation came with the laptop; I added Linux myself, generic
> OpenSUSE without any optimizations). Anyone have any thoughts on this?
> - Dennis