[concurrency-interest] Thread Allocation

Ariel Weisberg ariel at weisberg.ws
Wed Feb 13 13:43:57 EST 2013


On Linux most people are not directly issuing disk writes. They are
modifying pages in the page cache and then Linux is flushing them
asynchronously in the background. Depending on the IO scheduler in play
you will get different scheduling and coalescing behaviors and the
actual disk and/or disk controller will also repeat that process. If
virtualization is in play you get yet another layer of IO indirection
and I hear the noop scheduler is the way to go in guest kernels.

Writes pretty much never block unless you are bursting beyond the
capacity of the page cache to absorb writes or you are being
backpressured because the disk can't keep up. Writes also won't block
on the disk unless there is a write barrier because disks and
controllers also have write caches.
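In Java terms, the explicit barrier is FileChannel.force: the write itself lands in the page cache and returns quickly, and force(true) is what blocks until the data is actually down. A minimal sketch (the file name and sizes are just for the demo):

```java
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.*;

public class WriteBarrier {
    // Writes land in the page cache and return quickly; force(true) is the
    // explicit barrier that blocks until data (and metadata) reach the device.
    static long writeAndSync(Path file) throws Exception {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.WRITE)) {
            ch.write(ByteBuffer.wrap(new byte[4096])); // buffered by the kernel, fast
            ch.force(true);                            // the barrier: waits for the disk
        }
        return Files.size(file);
    }

    public static void main(String[] args) throws Exception {
        Path tmp = Files.createTempFile("barrier", ".bin");
        System.out.println(writeAndSync(tmp) + " bytes durable");
        Files.deleteIfExists(tmp);
    }
}
```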

Writes can block if the page you are modifying is not in the page
cache. If you always overwrite entire pages or append to a file you
usually won't have a problem.

Databases like to use O_DIRECT to bypass the page cache and manage this
stuff themselves because they understand that not all disk pages
contain the same kind of data.

I have found that using multiple threads to do blocking reads with NIO
works fine, and I have no problem getting hundreds of thousands of
reads when the data is in the page cache. With an SSD I have no problem
getting the advertised number of random IOPs even when the dataset is
5x larger than memory.

You can run hdparm -I to find out the queue depth of your disk and use
that to size the thread pool doing blocking random reads/writes. For
throughput the exact number doesn't matter much as long as there is a
reasonable amount of parallelism.
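A sketch of that pattern: a fixed pool sized to the queue depth, each worker doing positional (thread-safe) page-sized reads through one shared FileChannel. The queue depth of 32 and the temp file are assumptions for the demo; substitute the value hdparm reports for your disk.

```java
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.*;
import java.util.concurrent.*;

public class RandomReadPool {
    // Issues `tasks` page-sized positional reads from a pool of `queueDepth`
    // threads and returns the total number of bytes read.
    static long randomReads(Path file, int queueDepth, int tasks) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(queueDepth);
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            CompletionService<Integer> done = new ExecutorCompletionService<>(pool);
            long size = ch.size();
            for (int i = 0; i < tasks; i++) {
                final long pos = ThreadLocalRandom.current().nextLong(size - 4096);
                done.submit(() -> {
                    ByteBuffer buf = ByteBuffer.allocate(4096); // one page per read
                    return ch.read(buf, pos);  // positional read, safe to share the channel
                });
            }
            long total = 0;
            for (int i = 0; i < tasks; i++) total += done.take().get();
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Queue depth of 32 is an assumption; check yours with `hdparm -I /dev/sdX`.
        Path demo = Files.createTempFile("random-read-demo", ".bin");
        Files.write(demo, new byte[1 << 20]); // 1 MiB demo file
        System.out.println("bytes read: " + randomReads(demo, 32, 100));
        Files.deleteIfExists(demo);
    }
}
```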

If the workload is mostly sequential it is better to break it up into
serial chunks and do your own scheduling. Pull/push 100 megabytes at
time and always have exactly one outstanding task pulling/pushing the
next 100 megabytes to keep the disk(s) busy. If you don't schedule
yourself throughput drops as the execution of the reads is interleaved
at too fine a granularity.
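One way to sketch that self-scheduling: a single-thread executor holds the one outstanding chunk task, and the loop submits the next chunk before waiting on the previous one, so the disk always has exactly one large sequential read in flight. The chunk size and demo file sizes here are placeholders.

```java
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.*;
import java.util.concurrent.*;

public class SequentialChunks {
    // Reads the file in large serial chunks; the single-thread executor
    // guarantees exactly one outstanding read, so the disk sees pure
    // sequential IO instead of finely interleaved requests.
    static long readSequentially(Path file, int chunkSize) throws Exception {
        ExecutorService io = Executors.newSingleThreadExecutor();
        long total = 0;
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = ch.size();
            Future<Integer> outstanding = null;
            for (long pos = 0; pos < size; pos += chunkSize) {
                final long p = pos;
                final int len = (int) Math.min(chunkSize, size - pos);
                Future<Integer> next = io.submit(() -> {
                    ByteBuffer buf = ByteBuffer.allocate(len);
                    int n = 0;
                    while (n < len) n += ch.read(buf, p + n); // fill the whole chunk
                    return n;
                });
                if (outstanding != null) total += outstanding.get(); // wait for previous
                outstanding = next; // the next chunk is already queued
            }
            if (outstanding != null) total += outstanding.get();
        } finally {
            io.shutdownNow();
        }
        return total;
    }

    public static void main(String[] args) throws Exception {
        Path tmp = Files.createTempFile("seq-demo", ".bin");
        Files.write(tmp, new byte[256 * 1024]); // small demo file; use 100 MB chunks for real disks
        System.out.println(readSequentially(tmp, 64 * 1024) + " bytes");
        Files.deleteIfExists(tmp);
    }
}
```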

You also need to take into account pre-fetching at the kernel and disk
level; with an SSD this will kill you. If data you don't want is
prefetched, it will pollute the page cache with irrelevant pages, and
on an SSD it will kill the number of small random IOs you can do.



On Wed, Feb 13, 2013, at 09:50 AM, Nathan Reynolds wrote:

I have heard that building your system to deal with asynchronous I/O
will perform much better.  With synchronous I/O, the thread blocks and
waits.  This incurs 2 context switches as well as ties up a thread and
all of its resources.  With asynchronous I/O, the thread submits the
I/O request and continues to do other processing.  When the I/O
completes, a thread picks up the result and continues processing.
Asynchronous I/O allows 1 thread to submit enough I/O requests that the
underlying storage system can optimize how the requests are stored.
For example, a bunch of random writes using synchronous I/O will cause
the disk head to seek wildly since each write has to go to a different
location.  A bunch of random writes using asynchronous I/O will give
the underlying system a chance to sort the writes and have the disk
head make a single pass over the disk.  Disk performance will greatly
improve.

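In Java this submit-and-continue pattern maps onto AsynchronousFileChannel: the loop below queues a batch of writes without blocking between submissions, which gives the OS and controller the whole batch to sort, and only then waits once for completion. The counts and demo file are illustrative assumptions.

```java
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.file.*;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Future;

public class AsyncWrites {
    // Submits a batch of writes without blocking between them, so the
    // underlying system sees them all at once; then waits at a single point.
    static long scatterWrites(Path file, int count, int size) throws Exception {
        try (AsynchronousFileChannel ch = AsynchronousFileChannel.open(
                file, StandardOpenOption.WRITE, StandardOpenOption.CREATE)) {
            List<Future<Integer>> pending = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                ByteBuffer buf = ByteBuffer.allocate(size);
                long pos = (long) i * size; // demo positions; a real workload scatters these
                pending.add(ch.write(buf, pos)); // submit and continue, no blocking here
            }
            long total = 0;
            for (Future<Integer> f : pending) total += f.get(); // single wait point
            return total;
        }
    }

    public static void main(String[] args) throws Exception {
        Path tmp = Files.createTempFile("async-demo", ".bin");
        System.out.println(scatterWrites(tmp, 16, 4096) + " bytes written");
        Files.deleteIfExists(tmp);
    }
}
```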
[1]Nathan Reynolds | Architect | 602.333.9091
Oracle | [2]PSR Engineering | Server Technology
On 2/13/2013 2:34 AM, Chris Vest wrote:

If you can tell beforehand which tasks (that you submit to the thread
pools) are going to be IO bound and which are going to be CPU bound,
then you can have two separate thread pools: a big one for the IO bound
tasks and a small one for the CPU bound ones.
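A minimal sketch of that split, assuming a fixed 8x-cores size for the IO pool (an illustrative guess, not a tuned number):

```java
import java.util.concurrent.*;

public class SplitPools {
    static final int CORES = Runtime.getRuntime().availableProcessors();
    // Big pool for IO-bound work (threads spend most of their time parked in
    // syscalls), small pool for CPU-bound work (about one thread per core).
    static final ExecutorService ioPool  = Executors.newFixedThreadPool(CORES * 8);
    static final ExecutorService cpuPool = Executors.newFixedThreadPool(CORES);

    // Stand-in for a CPU-bound task: sums 0..n-1.
    static int busySum(int n) { int s = 0; for (int i = 0; i < n; i++) s += i; return s; }

    public static void main(String[] args) throws Exception {
        Future<String>  io  = ioPool.submit(() -> { Thread.sleep(10); return "io done"; });
        Future<Integer> cpu = cpuPool.submit(() -> busySum(1_000));
        System.out.println(io.get() + ", cpu sum = " + cpu.get());
        ioPool.shutdown();
        cpuPool.shutdown();
    }
}
```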

Otherwise I'd say just set a high upper bound (upwards of hundreds, but
it depends on the expected distribution) and let the OS manage things;
see how that works, and if it's performant enough, then you're done.

Note that I have no idea what kind of performance is expected of your
SIEM system.


On 13/02/2013, at 09.48, "Pete Haidinyak" <[3]javamann at cox.net> wrote:

I have a question on how to allocate Threads. I am creating a SIEM
which is a bunch of independent Java Services. The most likely use case
is this will run on one 2U box. The box will have two quad core Xeon
processors and 32G of RAM. Some of the Services will be I/O bound but
some will be CPU bound.
  In one of the latest discussions it was mentioned that you should
allocate a Thread for each core (plus or minus a couple) for the best
throughput. I have the ability to tune the Thread Pools after startup
based on the number and types of Services running on the box.

My question is what would be the best way to allocate Threads when you
have multiple processes competing for resources?




Concurrency-interest mailing list
[4]Concurrency-interest at cs.oswego.edu
[5]http://cs.oswego.edu/mailman/listinfo/concurrency-interest




1. http://psr.us.oracle.com/wiki/index.php/User:Nathan_Reynolds
2. http://psr.us.oracle.com/
3. mailto:javamann at cox.net
4. mailto:Concurrency-interest at cs.oswego.edu
5. http://cs.oswego.edu/mailman/listinfo/concurrency-interest