[concurrency-interest] Some interesting (confusing?) benchmark results
hans.boehm at hp.com
Tue Feb 12 17:00:29 EST 2013
I wouldn’t be surprised if a memory-bandwidth-limited benchmark ran fastest with fewer threads than cores.
From: concurrency-interest-bounces at cs.oswego.edu [mailto:concurrency-interest-bounces at cs.oswego.edu] On Behalf Of Nathan Reynolds
Sent: Tuesday, February 12, 2013 1:48 PM
To: concurrency-interest at cs.oswego.edu
Subject: Re: [concurrency-interest] Some interesting (confusing?) benchmark results
> best performance when only loading 60-70% of the cores
What do you mean by performance? Do you mean you achieve the highest throughput? Do you mean you achieve the lowest response times? Do you mean something else?
The early implementations of hyper-threading on Intel processors sometimes ran into trouble depending upon the workload. Enabling hyper-threading actually hurt performance and throughput. A lot of people quickly learned to disable hyper-threading. They are so entrenched in that decision that it is hard to help them see that hyper-threading is actually beneficial now.
The Linux thread scheduler is smart enough to put 1 thread on each physical core first and only then double up on physical cores. So, I am not surprised that loading 60-70% of the cores yields the best performance on the above-mentioned processors. This creates only a few more threads than physical cores, which in a way disables hyper-threading.
Later implementations of hyper-threading improved considerably. I am not aware of any workloads that perform worse with hyper-threading enabled. With a modern processor (i.e. Westmere or newer), it would be interesting if you ran your workload with hyper-threading enabled and disabled and then found the optimal thread count for each configuration. If the hyper-threading-disabled configuration performs better, that would definitely be an interesting workload and result.
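The suggested experiment can be sketched as a simple sweep (a hypothetical harness with a made-up CPU-bound task, not the original poster's benchmark): run the same fixed amount of work at each pool size and look for the knee in the timings, once with hyper-threading enabled and once with it disabled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ThreadCountSweep {
    // Stand-in CPU-bound work; substitute the real workload here.
    static long busyWork(int iterations) {
        long acc = 0;
        for (int i = 0; i < iterations; i++) acc += (long) Math.sqrt(i);
        return acc;
    }

    public static void main(String[] args) throws Exception {
        int tasks = 64; // fixed total work, split into independent tasks
        int logical = Runtime.getRuntime().availableProcessors();
        for (int threads = 1; threads <= logical; threads++) {
            ExecutorService exec = Executors.newFixedThreadPool(threads);
            long start = System.nanoTime();
            List<Future<Long>> futures = new ArrayList<>();
            for (int t = 0; t < tasks; t++)
                futures.add(exec.submit((Callable<Long>) () -> busyWork(2_000_000)));
            for (Future<Long> f : futures) f.get(); // wait for all work
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            exec.shutdown();
            System.out.println(threads + " threads: " + elapsedMs + " ms");
        }
    }
}
```

Running the whole sweep twice (HT enabled vs disabled in the BIOS) and comparing the best point in each run would answer the question above.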
Nathan Reynolds<http://psr.us.oracle.com/wiki/index.php/User:Nathan_Reynolds> | Architect | 602.333.9091
Oracle PSR Engineering<http://psr.us.oracle.com/> | Server Technology
On 2/12/2013 2:18 PM, √iktor Ҡlang wrote:
On Tue, Feb 12, 2013 at 8:28 PM, Kirk Pepperdine <kirk at kodewerk.com<mailto:kirk at kodewerk.com>> wrote:
> Do you agree that thread pool sizing depends on type of work? (IO bound vs CPU bound, bursty vs steady etc etc)
> Do you agree that a JVM Thread is not a unit of parallelism?
> Do you agree that having more JVM Threads than hardware threads is bad for CPU-bound workloads?
No, even with CPU-bound workloads I have found that the hardware/OS is much better at managing many workloads across many threads than I am. So a few more threads are OK, but many more threads go bad fast.
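The "few more threads" rule of thumb might be sketched like this (a heuristic illustration with an assumed slack factor, not code from the thread):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class SlackPool {
    // Size a CPU-bound pool at the hardware thread count times a small
    // slack factor; e.g. 1.25 gives "a few more" threads than hardware
    // threads, while large factors oversubscribe the CPU.
    static ExecutorService cpuBoundPool(double slack) {
        int hwThreads = Runtime.getRuntime().availableProcessors();
        int size = Math.max(1, (int) Math.ceil(hwThreads * slack));
        return Executors.newFixedThreadPool(size);
    }

    public static void main(String[] args) {
        ExecutorService pool = cpuBoundPool(1.25);
        System.out.println("created CPU-bound pool");
        pool.shutdown();
    }
}
```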
That's an interesting observation. Have any more data on that? (really interested)
As I said earlier, for CPU-bound workloads we've seen the best performance when loading only 60-70% of the cores (other threads exist on the machine, of course).
Director of Engineering
Typesafe<http://www.typesafe.com/> - The software stack for applications that scale
Concurrency-interest mailing list
Concurrency-interest at cs.oswego.edu<mailto:Concurrency-interest at cs.oswego.edu>