[concurrency-interest] Soliciting input about removeAllThreadLocals

Eric D Crahen crahen@cse.Buffalo.EDU
Fri, 6 Dec 2002 08:34:53 -0500 (EST)


On Fri, 6 Dec 2002, Doug Lea wrote:

> The question is, which is worse,
>   (1) unexpected reuse,
> or
>   (2) unexpected clearing.

> Specifically, is (2) such a serious problem that removeAllThreadLocals
> shouldn't be supported, forcing people to live with (1), or to not use
> ThreadExecutors when this problem arises?

I just grepped through the JDK (.java) sources, these were some of the
things that caught my eye. Some 3rd party libraries probably do similar
things.

The ORB (and other misc CORBA classes) supplied with JDK 1.4 use a bunch
ThreadLocals to keep track of various information (stuff ranging from
caching, synchronizing shutdown, debugging, to initializing things on
client & server ends).

The logging API uses ThreadLocals to keep track of assigned thread ids.
The logging classes would end up assigning multiple id's.

Anything that uses this kind of caching idiom,

Object o = null;

if((o == aThreadLocal.get()) == null) {

  o = // something;
  aThreadLocal.set(o);

}

things would have the potential break when an executor thread whose
threadlocals have been cleared runs through the code. After a quick look,
the CORBA classes appear to do this sort of thing.

One 3rd party package that comes to mind (only because I've got the
source sitting on this machine right now) that I know uses this sort of
thing is JBoss. It uses this to cache security attributes and transaction
times.

In that scenario, the best case when an Executor thread whose ThreadLocals
were wiped out runs through is that the secutiry descriptor has to be
fetched again. So no caching is done, and you do alot of extra work
figuring out credentials. It doesn't neccessarily break, but its bad for
performance.

In the transaction classes, ThreadLocals are used to track timeouts. If
Executor threads are running the transactions it looks like this would
have the potential to cause some problems there too.

Anything uses ThreadLocals to initialize or keep track of data, like this,

  static int counter = 0;

  // ...

  Integer id = null;
  if((id = aThreadLocal.get()) == null) {

    synchronized {
      id = new Integer(counter++);
    }

    aThreadLocal.set(id);

  }


would break completely since the information kept in the ThreadLocal
would be forever lost. The LogRecord in 1.4 is an example of a class that
does something like this. It looks like the CORBA classes in 1.4 also do
this sort of thing.

If this method were added, one of the big consequences is that
the ThreadLocal caching idiom would no longer work for threads. So that
would affect performance by increasing overhead. Assuming
that if the value isn't in the ThreadLocal its recalculated or looked up
in some synchronized collection.

Another is that anything using ThreadLocals to keep initialize or track of
data would be broken.

If a ThreadLocal can lose its value at any time because of another Thread,
is a ThreadLocal really still useful?

Not using ThreadExecutors would be one way to avoid this, but I think it
would be a shame for the EG to have gone through all this trouble creating
a flexible general purpose thread package if executor threads can't rely
on ThreadLocals (and can't use the current ORB, or logging API or possibly
not be used in implementing things that could benefit from using TLS). It
would be very limiting when it comes to what kinds of tasks an Executor
can run.

A user would have to know alot of details about the classes a task is
using. That might not always be obvious. I could write a task
that uses the Logging API, or does some work with making some CORBA
requests, and my code would either suddenly not operate properly or it
could take a hit performance wise. Unless I have the source for whatever
library I'm using I wouldn't know the cause of the problem, and I wouldn't
know for sure if using that library in a task I intend to run with an
Executor is truely safe to use with an Executor or not.

I think the problem is that this solution is attempting to find a way
to abstract resetting the ThreadLocal variables that it by making some
assumptions about the implementations of all tasks. How those ThreadLocals
are used are always going to be an implementation detail of a task, hidden
away from the Executor by one or more levels of abstraction. It seems no
matter how one decides to implement something like  clearAllThreadLocals()
there is always going to be something it breaks because your Executor is
just too far removed from the knowledge about the ThreadLocals.

Perhaps, there is a better way to structure a task, which does have access
to this knowledge, so that these problems can be avoided. That seems like
a place for a solution to me. But I haven't got any suggestions right now
:)


- Eric
http://www.code-foo.com/