[concurrency-interest] The need for a default ForkJoinPool

Gregg Wonderly gregg at cytetech.com
Tue Aug 17 16:47:22 EDT 2010


David Holmes wrote:
> If threads explicitly attached and detached from libraries then a
> Threadlocal might be appropriate, but in general I think it is a very bad
> mechanism because the right pool is a function of the library not of the
> thread using it!

Work dispatched directly from a FJTask, or otherwise with explicit knowledge of 
the pool it is running in, has no problem staying inside of the environment it 
was launched into.

The original question was about how to generalize code to not have explicit 
knowledge, but rather implicit knowledge of what context (FJPool) it might use.

A Thread of execution represents a code path which ultimately can end up running 
through unknown software libraries.  Container architectures such as JEE, JME, 
Applet and others have ended up wrapping all the context into some instance of a 
class/interface which provides the global view.  That view helps compartmentalize 
the environment of execution so that there can be very clean separation between 
"code" and "environment".

Having to "pass" a FJPool around presupposes that this is the only method of 
concurrency control that will be applicable.  It also means that you have to know 
ahead of time that one is needed, and that, ultimately, is not possible.

So, establishing an out-of-band mechanism for selecting a pool would provide 
much easier deployment control.  DI can be a useful mechanism, and I'm all for 
using that technique where possible.  But DI comes in multiple flavors, and 
one's own choice may not mesh with, or interact well with, the mechanisms of another.

So, I'm still interested in whether FJ could provide some level of out-of-band 
knowledge.  I'm thinking about something like the following in 
FJPool.

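public static FJPool getThreadsPool() {
	// allPools is a hypothetical registry of the active pools
	for( FJPool pool : allPools ) {
		if( pool.contains( Thread.currentThread() ) ) {
			return pool;
		}
	}
	return null;	// the current thread does not belong to any pool
}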

This would look through active instances to see if the current thread is 
dispatched from that pool.  If it is, then return the pool.  This would then 
allow a pretty low level in the software structure to find the environment it 
might use for dispatching work, without having to pass around such details 
through countless APIs.
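
As a rough sketch of that lookup against the java.util.concurrent version of 
the API: a pool's workers are ForkJoinWorkerThread instances, and 
ForkJoinWorkerThread.getPool() reports the owning pool, so no registry of pools 
needs to be scanned.  The class name PoolLookup and the method name currentPool 
are assumptions made for illustration, not anything FJ provides under those names.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinWorkerThread;

// Hypothetical helper class; the name is illustrative only.
final class PoolLookup {
    private PoolLookup() {}

    /**
     * Returns the pool whose worker thread is running the caller,
     * or null when the caller is not a ForkJoinPool worker.
     */
    static ForkJoinPool currentPool() {
        Thread t = Thread.currentThread();
        if (t instanceof ForkJoinWorkerThread) {
            return ((ForkJoinWorkerThread) t).getPool();
        }
        return null;
    }

    public static void main(String[] args) {
        ForkJoinPool pool = new ForkJoinPool(2);
        Callable<ForkJoinPool> lookup = PoolLookup::currentPool;

        // Run the lookup on one of the pool's workers.
        ForkJoinPool seen = pool.submit(lookup).join();
        System.out.println("found own pool from a worker: " + (seen == pool));

        // The main thread is not a worker, so there is nothing to find.
        System.out.println("no pool from main thread: " + (currentPool() == null));
        pool.shutdown();
    }
}
```

Code deep in a library could call currentPool() and fall back to some default 
when it returns null, which keeps the pool out of every intermediate signature.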

> I think libraries must allow for a default pool while also supporting an
> explicit pool.
> 
> I think what differentiates this case from other global "thread pools" is
> that the scope of usage is potentially much broader and that the pool will
> not necessarily grow as demand increases.

Scaling is always a challenging detail.  But, providing some explicit 
controlling effects, by disallowing a low level flow of execution to "raise" the 
performance of its work, might be a very desirable thing.

For example, it might make sense to add Permission checks to FJPool creation to 
limit thread pool creation.  With the method above, a thread of execution could 
then discover what environment it should submit additional work to.  This would 
help against DoS activities by mobile code applications, allowing them only 
marginal opportunity to abuse the machine with thread creation.

> I also think, and perhaps it is just my misunderstanding here, that having
> multiple simultaneous users of a FJPool is going to be counter-productive. I
> can get good speedup doing a parallel sort of one huge array, but if I try
> to sort two huge arrays at the same time then I'm just stepping on my own
> toes. If a pool processes one task then all stealing aids that task; with
> multiple tasks you really have no idea how long each one will take.
> 
> Which would perform better: one pool of N thread processing 2 arrays, or 2
> pools of N/2 threads processing 1 array each?

I don't live in a world of "fixed work loads".  Everything I do involves network 
oriented distributed systems with inter-machine, remotely dispatched activities.  
I need ways to confine such applications to limited thread resources more than 
I need to provide an ultimate sorting engine.

Not everything done in such systems has short-running threads of execution 
without delays/pauses.  So, I still have to manage some threading explicitly.  But 
for certain applications and work, FJPool is a big help, and I, like Kasper, am 
trying to figure out a good "container view" for my applications which would 
make passing of the FJPool a lot less necessary.
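
One way such a "container view" might look, as a sketch only: the container 
starts the application inside a bounded pool, and library code forks into 
whatever pool it finds itself running in.  ContainerSum, SumTask, FALLBACK and 
the 1024-element threshold are all hypothetical; ForkJoinTask.inForkJoinPool(), 
invoke(), fork() and join() are the java.util.concurrent calls.

```java
import java.util.Arrays;
import java.util.concurrent.Callable;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;
import java.util.concurrent.RecursiveTask;

// Hypothetical library class: sum(long[]) takes no pool parameter.
final class ContainerSum {
    // Assumed fallback for callers that are not running inside any pool.
    private static final ForkJoinPool FALLBACK = new ForkJoinPool(1);

    static final class SumTask extends RecursiveTask<Long> {
        private final long[] data;
        private final int lo, hi;

        SumTask(long[] data, int lo, int hi) {
            this.data = data;
            this.lo = lo;
            this.hi = hi;
        }

        @Override
        protected Long compute() {
            if (hi - lo <= 1024) {              // small enough: sum directly
                long s = 0;
                for (int i = lo; i < hi; i++) s += data[i];
                return s;
            }
            int mid = (lo + hi) >>> 1;
            SumTask left = new SumTask(data, lo, mid);
            left.fork();                        // queued in the current worker's pool
            long right = new SumTask(data, mid, hi).compute();
            return right + left.join();
        }
    }

    /** Uses the caller's pool when there is one, otherwise the fallback. */
    static long sum(long[] data) {
        SumTask task = new SumTask(data, 0, data.length);
        return ForkJoinTask.inForkJoinPool() ? task.invoke() : FALLBACK.invoke(task);
    }

    public static void main(String[] args) {
        long[] data = new long[100_000];
        Arrays.fill(data, 1L);

        // The "container" confines the application to two threads.
        ForkJoinPool containerPool = new ForkJoinPool(2);
        Callable<Long> app = () -> sum(data);
        long viaContainer = containerPool.submit(app).join();

        long viaFallback = sum(data);           // called from the main thread
        System.out.println(viaContainer + " " + viaFallback);
        containerPool.shutdown();
    }
}
```

The application never names a pool; the thread limit is decided entirely by 
whoever launched it.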

Gregg Wonderly

> David Holmes
> 
>> -----Original Message-----
>> From: concurrency-interest-bounces at cs.oswego.edu
>> [mailto:concurrency-interest-bounces at cs.oswego.edu]On Behalf Of Gregg
>> Wonderly
>> Sent: Tuesday, 17 August 2010 3:32 AM
>> To: Kasper Nielsen
>> Cc: concurrency-interest at cs.oswego.edu
>> Subject: Re: [concurrency-interest] The need for a default ForkJoinPool
>>
>>
>> It seems like there should be a default to use which falls out of the
>> context of the executing thread too.  Using a ThreadLocal, as here, helps
>> to keep previous "knowledge" of the pool intact for subsequent uses
>> without all the burden of maintaining all the context necessary for every
>> point of execution to know what the previous did.
>>
>> Gregg Wonderly
>>
>> Kasper Nielsen wrote:
>>> I'm just throwing this into the discussion; it might be a really bad idea.
>>> Haven't thought it through completely.
>>>
>>> This is primarily thought to be a way for applications servers such as
>>> Tomcat/Weblogic/Websphere to have better control over which threadpools
>>> should be used in case we decide on a single system-wide threadpool.
>>>
>>> Again, I really want to just expose a simple API such as
>>> parallelSort(int[] array) to users. Now, for example, when I'm using
>>> Weblogic I often use its built-in functionality for prioritizing
>>> requests. So requests from user A use high-priority Threadpool 1, and
>>> requests from user B use low-priority Threadpool 2. If both users
>>> require calls to parallelSort they effectively have the same priority
>>> because they both use the single shared threadpool.
>>>
>>> Something like this would allow complete control to containers.
>>>
>>> public class ForkJoins {
>>>
>>>     private final static ThreadLocal<SoftReference<ForkJoinPool>>
>>>         FJP_THREAD_DEFAULT = new ThreadLocal<SoftReference<ForkJoinPool>>();
>>>     // not lazy, for simplicity
>>>     private final static ForkJoinPool DEFAULT = new ForkJoinPool();
>>>
>>>     public static ForkJoinPool get() {
>>>         SoftReference<ForkJoinPool> sr = FJP_THREAD_DEFAULT.get();
>>>         if (sr != null) {
>>>             ForkJoinPool fjp = sr.get();
>>>             return fjp != null ? fjp : DEFAULT;
>>>         }
>>>         return DEFAULT;
>>>     }
>>>
>>>     public static void setThreadDefault(ForkJoinPool fjp) {
>>>         //Security checks
>>>         FJP_THREAD_DEFAULT.set(new SoftReference<ForkJoinPool>(fjp));
>>>     }
>>> }
>>>
>>> Cheers
>>>   Kasper
>>> _______________________________________________
>>> Concurrency-interest mailing list
>>> Concurrency-interest at cs.oswego.edu
>>> http://cs.oswego.edu/mailman/listinfo/concurrency-interest
>>>
>>>