[concurrency-interest] Blocking vs. non-blocking

Arcadiy Ivanov arcadiy at ivanov.biz
Wed Jun 25 21:35:21 EDT 2014


Based on what I've read in the benchmark code supplied, I have a 
sneaking suspicion that the problem is the benchmark itself, since I'm 
not sure it's measuring "thread switching performance". :)

I would *highly* recommend talking to Aleksey about using JMH for all
your benchmarking needs. Personally, I would also dramatically simplify
the benchmark by using LockSupport.park/unpark exclusively, if your
intention is to measure thread parking/unparking time.
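
A minimal sketch of what such a stripped-down measurement could look
like, using only the JDK rather than JMH. The class name ParkPingPong,
the run method and the iteration counts are hypothetical names chosen
for illustration; this is not the benchmark under discussion. Two
threads hand a turn flag back and forth and park while waiting. The
warm-up here is crude, so treat the numbers as rough:

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.LockSupport;

public class ParkPingPong {
    // Whose turn it is: 0 = calling thread, 1 = partner thread.
    private static final AtomicInteger turn = new AtomicInteger(0);

    /** Runs the given number of round trips and returns the elapsed nanoseconds. */
    static long run(final int roundTrips) {
        turn.set(0);
        final Thread caller = Thread.currentThread();
        Thread partner = new Thread(() -> {
            for (int i = 0; i < roundTrips; i++) {
                // park() may return spuriously, so always re-check the flag.
                while (turn.get() != 1) {
                    LockSupport.park();
                }
                turn.set(0);
                LockSupport.unpark(caller);
            }
        });
        partner.start();

        long start = System.nanoTime();
        for (int i = 0; i < roundTrips; i++) {
            turn.set(1);
            LockSupport.unpark(partner);
            while (turn.get() != 0) {
                LockSupport.park();
            }
        }
        long elapsed = System.nanoTime() - start;

        try {
            partner.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
        }
        return elapsed;
    }

    public static void main(String[] args) {
        run(10_000); // crude warm-up; JMH would handle this properly
        int n = 100_000;
        long ns = run(n);
        System.out.printf("%d round trips, %.1f ns per round trip%n",
                n, (double) ns / n);
    }
}
```

Each round trip contains two hand-offs, and a thread that finds the
flag already flipped never actually parks, so the result is an upper
bound on park/unpark cost at best, not a pure context-switch time.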

Be aware that System.nanoTime performs weirdly under contention
(http://shipilev.net/blog/2014/nanotrusting-nanotime/). (I would not
recommend measuring anything faster than your average human heartbeat
with System.currentTimeMillis(): its resolution is 1 ms at best.) I.e.,
you may not be able to measure what you want to measure with a Java
benchmark to begin with, and may need lower-level native OS/CPU metrics
tools.
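
For a rough feel of what the timer itself costs on a given machine,
here is a small single-threaded probe. The class and method names
(NanoTimeProbe, latencyNs, granularityNs) are hypothetical, and the
linked article does this properly with JMH. Being single-threaded, the
sketch does not reproduce the contention effects mentioned above; it
only estimates the per-call cost and the smallest observable step:

```java
public class NanoTimeProbe {
    /** Average cost of one System.nanoTime() call, in nanoseconds. */
    static double latencyNs(int calls) {
        long sink = 0;
        long start = System.nanoTime();
        for (int i = 0; i < calls; i++) {
            sink += System.nanoTime();
        }
        long elapsed = System.nanoTime() - start;
        // Use the accumulated value so the JIT cannot drop the loop body.
        if (sink == 42) {
            System.out.print("");
        }
        return (double) elapsed / calls;
    }

    /** Smallest nonzero difference seen between consecutive readings. */
    static long granularityNs(int samples) {
        long min = Long.MAX_VALUE;
        for (int i = 0; i < samples; i++) {
            long t0 = System.nanoTime();
            long t1;
            do {
                t1 = System.nanoTime();
            } while (t1 == t0); // spin until the clock visibly advances
            min = Math.min(min, t1 - t0);
        }
        return min;
    }

    public static void main(String[] args) {
        latencyNs(1_000_000); // crude warm-up
        System.out.printf("latency     ~ %.1f ns per call%n", latencyNs(1_000_000));
        System.out.printf("granularity ~ %d ns%n", granularityNs(100_000));
    }
}
```

If the interval being timed is not much larger than these two numbers,
the timer is a significant part of the measurement.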

Also, I haven't looked at or, frankly, used synchronized in quite a
while, but I think that, assuming lock elision doesn't kick in, it
results in libc mutexes of some sort (pthread? libthread? win32 mutex?).
I.e., you may be measuring the lock performance of the libc on each
specific OS, not thread switching performance.

Hope this helps,

- Arcadiy

On 2014-06-25 21:05, Dennis Sosnoski wrote:
> On 06/14/2014 02:32 PM, Arcadiy Ivanov wrote:
>> If memory serves me right, Mr Shipilev mentioned in one of his 
>> presentations in Oracle Spb DC re FJP optimization challenges (in 
>> Russian, sorry, https://www.youtube.com/watch?v=t0dGLFtRR9c#t=3096) 
>> that thread scheduling overhead of "sane OSes" (aka Linux) is approx 
>> 50 us on average, while 'certain not-quite-sane OS named starting 
>> with "W"' is much more than that.
>> Loaded Linux kernel can produce latencies in *tens of seconds* 
>> (http://www.versalogic.com/downloads/whitepapers/real-time_linux_benchmark.pdf, 
>> page 13) without RT patches, and tens of us with RT ones. YMMV 
>> dramatically depending on kernel, kernel version, scheduler, 
>> architecture and load.
>>
>
> I actually found that Windows 7 did much better at thread switching 
> performance than my Linux system with same-era kernel when running on 
> my laptop system (Windows 7 Home Premium, Linux 3.4.63, Toshiba 
> Satellite P750D with AMD A8-3520M APU). You can see my timing results 
> here: http://www.sosnoski.com/thread-linux-windows.png The data block 
> size relates to a block of per-thread data run through on every thread 
> switch to show caching effects. Threads are executed in strict 
> rotation, each notifying the next to run. The actual code is at: 
> https://github.com/dsosnoski/concur3/blob/master/src/com/sosnoski/concur/article3/ThreadSwitch.java
>
> So now I'm wondering if recent Windows versions actually have lower 
> thread switching overhead in general, or if there are perhaps some 
> OS-specific optimizations for the particular hardware (the Windows 
> installation came with the laptop; I added Linux myself, generic 
> OpenSUSE without any optimizations). Anyone have any thoughts on this?
>
> Thanks,
>
>   - Dennis
>
