[concurrency-interest] Should old ForkJoinWorkerThread die if starting a new thread fails?

Alex Otenko oleksandr.otenko at gmail.com
Tue Jun 6 15:44:59 EDT 2017


What is the failure model here?

1. Provisioned a limit of N threads
2. Consumed N threads
3. Creation of thread N+1 fails

What is the limit of N threads based on? Should the thread pools be sized instead to not exceed N? Is the target software creating threads outside pools?

I think there may be a few good engineering practices to take into account first. OOME on thread creation means someone did not heed those practices: did not size the environment, did not size pools, spawned non-pooled threads.
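
As a minimal sketch of what "sizing the pool" could look like in code, assuming the budget of N threads is known up front: a worker thread factory that refuses to create more than a fixed number of live workers by returning null, which the ForkJoinWorkerThreadFactory documentation allows as a way to reject a request. The names CappedFactory, SumTask and CappedPoolSketch are made up for illustration, and the accounting is approximate (a slot may not be returned if a created thread never gets to run).

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinWorkerThread;
import java.util.concurrent.RecursiveTask;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical factory: hands out at most maxThreads live workers.
final class CappedFactory implements ForkJoinPool.ForkJoinWorkerThreadFactory {
    private final int maxThreads;
    private final AtomicInteger live = new AtomicInteger();

    CappedFactory(int maxThreads) { this.maxThreads = maxThreads; }

    @Override
    public ForkJoinWorkerThread newThread(ForkJoinPool pool) {
        // Reserve a slot first; give it back if that exceeds the cap.
        if (live.incrementAndGet() > maxThreads) {
            live.decrementAndGet();
            return null; // reject the request instead of asking the OS
        }
        return new ForkJoinWorkerThread(pool) {
            @Override
            protected void onTermination(Throwable exception) {
                live.decrementAndGet(); // release the slot when the worker exits
            }
        };
    }
}

// Simple divide-and-conquer sum of lo..hi (inclusive), just to generate work.
final class SumTask extends RecursiveTask<Long> {
    private final int lo, hi;

    SumTask(int lo, int hi) { this.lo = lo; this.hi = hi; }

    @Override
    protected Long compute() {
        if (hi - lo < 100) {
            long s = 0;
            for (int i = lo; i <= hi; i++) s += i;
            return s;
        }
        int mid = (lo + hi) >>> 1;
        SumTask left = new SumTask(lo, mid);
        left.fork();
        long right = new SumTask(mid + 1, hi).compute();
        return right + left.join();
    }
}

final class CappedPoolSketch {
    // Runs the sum in a pool whose factory allows at most 'cap' workers.
    static long sumInCappedPool(int cap, int parallelism, int n) {
        ForkJoinPool pool = new ForkJoinPool(parallelism, new CappedFactory(cap), null, false);
        try {
            return pool.invoke(new SumTask(1, n));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sumInCappedPool(2, 4, 1000));
    }
}
```

This only bounds one pool. Threads created outside pools still count against the same OS limit, which is why the question about non-pooled threads matters.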

Alex

> On 6 Jun 2017, at 16:42, Nathan and Ila Reynolds <nathanila at gmail.com> wrote:
> 
> It seems that we have semantic overload here.  There are many factors which could prevent a new thread from being created.  One such factor is that there is no address space in the process to create the thread's stack.  Another such factor is that the process has too many threads.  It would be great if different exceptions could be thrown based on the actual condition.  This would make it easier to diagnose the problem.  It would also allow for the code to catch OutOfThreadHandlesException and simply run with the existing threads in the pool.
> 
> I realize that this is going to be tricky since each OS has its own set of thread creation problems.  Mapping the disparate sets of problems into similar meaningful exceptions is going to take a lot of thought.  Perhaps someone can collect the various reasons why thread creation could fail for each OS, and then a group can figure out how to map them to exceptions.
> 
> -Nathan
> 
> On 6/6/2017 9:32 AM, Jarkko Miettinen wrote:
>> Hi,
>> 
>> This does seem like something that would've been discussed before here, but I could not find anything in the archives or a bug report.
>> 
>> In any case, currently if starting a new thread in ForkJoinPool#createWorker fails with an exception (OutOfMemoryError being the most common), the thread that tries to start that new thread dies too. In specific situations this can lead to all threads in the ForkJoinPool dying out, which seems strictly worse than continuing to run the existing threads without spawning new ones.
>> 
>> I think OutOfMemoryError is generally considered something that should not be recovered from. But we might make a different choice here, as Thread#start can throw an OOM if it runs into process limits that prevent starting new threads (why, oh why). This also happens in a very tightly controlled situation, and we might want to just continue working on the tasks. At least if Thread#start has not been overridden.
>> 
>> As the code in ForkJoinPool is a bit dense, I am not quite sure what the exact required conditions are. I just know that there should be both tasks in the pool and still room for additional threads in the pool.
>> 
>> The problem will then manifest in stack traces such as this (Oracle JDK 1.8.0_92):
>> 
>> Exception in thread "ForkJoinPool-3983-worker-33" java.lang.OutOfMemoryError: unable to create new native thread
>>        at java.lang.Thread.start0(Native Method)
>>        at java.lang.Thread.start(Thread.java:714)
>>        at java.util.concurrent.ForkJoinPool.createWorker(ForkJoinPool.java:1486)
>>        at java.util.concurrent.ForkJoinPool.tryAddWorker(ForkJoinPool.java:1517)
>>        at java.util.concurrent.ForkJoinPool.signalWork(ForkJoinPool.java:1634)
>>        at java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1733)
>>        at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1691)
>>        at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
>> 
>> From the little I looked at the latest jsr166 version in CVS, the situation seems to be the same, even if the methods have changed quite a bit.
>> 
>> My question is: Is there any way to prevent this, and would such prevention be beneficial in some or all cases?
>> 
>> At least naively it would seem that if Thread#start fails with OOM, we could just return false and let the existing thread continue. But this probably is not something that's always wanted and can mask other, more serious OOMs.
>> 
>> -Jarkko
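
The "return false and let the existing thread continue" idea above can be sketched roughly as follows. This is not ForkJoinPool code: StartOrContinue and tryStart are hypothetical names, and the message check is an assumption based on the HotSpot text in the stack trace above ("unable to create new native thread"). That text is an implementation detail and may differ between JVMs and versions. Any other OutOfMemoryError is rethrown so that a genuine heap exhaustion is not masked.

```java
// Hypothetical helper, for illustration only.
final class StartOrContinue {
    // Returns true if the thread started, false if the JVM reported that it
    // could not create a native thread. Other OutOfMemoryErrors propagate.
    static boolean tryStart(Thread t) {
        try {
            t.start();
            return true;
        } catch (OutOfMemoryError e) {
            String msg = e.getMessage();
            // Assumption: native thread exhaustion is recognizable by message.
            if (msg != null && msg.contains("native thread")) {
                return false; // caller keeps working with the threads it has
            }
            throw e; // something else ran out; do not hide it
        }
    }

    public static void main(String[] args) {
        // Simulate the failure by overriding start(), since actually
        // exhausting OS threads is not practical in a small example.
        Thread failing = new Thread() {
            @Override
            public void start() {
                throw new OutOfMemoryError("unable to create new native thread");
            }
        };
        System.out.println(tryStart(failing));
        System.out.println(tryStart(new Thread(() -> { })));
    }
}
```

Matching on the message string is fragile, which is more or less the argument for distinct exception types made earlier in the thread.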
>> 
>> _______________________________________________
>> Concurrency-interest mailing list
>> Concurrency-interest at cs.oswego.edu
>> http://cs.oswego.edu/mailman/listinfo/concurrency-interest
> 
> -- 
> -Nathan