[concurrency-interest] Context-switching VS Continuation (long reply)

Tim Peierls tim at peierls.net
Thu Jun 9 11:19:23 EDT 2005


I like this approach a lot, and I've been working off and on to build
something that sounds very similar. I was prompted to attempt this after
reading, on Doug Lea's recommendation (how else does anything happen?), the
Larus and Parkes paper on cohort scheduling:

  http://research.microsoft.com/users/larus/Papers/usenix02_cohort.pdf

As with your approach, #threads == #cpus, and there is no context-switching
(as long as you stick to the rules).

In my formulation, a Stage is an extension of AbstractExecutorService
that overrides the submit methods to covariantly return AwaitableFuture<T>,
and Continuation is a special Callable whose additional invoke() method
arranges for the Continuation to be scheduled once all of the
AwaitableFutures in a given list have completed.
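In today's terms, those two types might be sketched like this. This is my
reconstruction, not the actual framework: CompletableFuture stands in for
the hypothetical AwaitableFuture, and this toy Stage wraps an
ExecutorService instead of extending AbstractExecutorService:

```java
import java.util.concurrent.*;

// Toy model (names and shapes are guesses): a Stage whose submit
// returns a richer future than plain Future<T>. CompletableFuture
// plays the role of AwaitableFuture because it supports completion
// callbacks without blocking a thread.
class Stage {
    private final ExecutorService exec;

    Stage(ExecutorService exec) { this.exec = exec; }

    // Covariant-style submit: callers get a future they can chain on.
    <T> CompletableFuture<T> submit(Callable<T> task) {
        CompletableFuture<T> f = new CompletableFuture<>();
        exec.execute(() -> {
            try { f.complete(task.call()); }
            catch (Exception e) { f.completeExceptionally(e); }
        });
        return f;
    }

    void shutdown() { exec.shutdown(); }
}

class StageDemo {
    static int demo() throws Exception {
        Stage stage = new Stage(Executors.newSingleThreadExecutor());
        CompletableFuture<Integer> f = stage.submit(() -> 6 * 7);
        int v = f.get();   // blocks here only for the demo's sake
        stage.shutdown();
        return v;          // 42
    }
}
```

In the real framework the caller would of course never block on get();
it would chain a Continuation instead.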

You can either extend Stage or, as in this (uncompiled!) sketch, wrap a
Stage instance with the stage logic. This example assumes a pipeline where
URL first resolves to Image, then Image is rendered into ImageBuf.

   class ImageResolver {
       public AwaitableFuture<Image> resolve(final URL url) {
           return stage.submit(new Callable<Image>() {
               public Image call() throws Exception { ... }
           });
       }
       private final Stage stage = ...;
   }

   class ImageRenderer {
       public AwaitableFuture<ImageBuf> renderToBuffer(final URL url) {
           return stage.submit(new Callable<ImageBuf>() {
               public ImageBuf call() throws Exception {
                    final AwaitableFuture<Image> futureImage =
                       imageResolver.resolve(url);
                   return new Continuation<ImageBuf>(stage, futureImage) {
                       public ImageBuf call() throws Exception {
                           return reallyRender(futureImage.get());
                       }
                   }.invoke();
               }
           });
       }
       private ImageBuf reallyRender(Image image) {...}
       private final Stage stage = ...;
       private final ImageResolver imageResolver = ...;
   }

   // some caller
   AwaitableFuture<ImageBuf> futureBuf =
       imageRenderer.renderToBuffer(url);
   ...
   // in a continuation:
       Graphics g = ...;
       ImageBuf imageBuf = futureBuf.get();
       display(g, imageBuf);

There is some trickiness and performance optimization involving that
Continuation.invoke() method. If the AwaitableFutures are ready by the time
invoke() is called, the continuation just runs immediately and invoke()
returns the result value normally. If not, the invoke() method returns
null, and the Stage machinery returns an AwaitableFuture that will be ready
when the Continuation is eventually run.
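A minimal sketch of that fast-path/slow-path split, with CompletableFuture
again standing in for AwaitableFuture (the class name and signatures are
my guesses, not the actual API):

```java
import java.util.concurrent.*;

// Sketch of the invoke() behavior described above. If every input is
// already done, the body runs inline and invoke() returns the value;
// otherwise invoke() returns null and the body is scheduled to run,
// completing a result-future, once the last input finishes.
abstract class Continuation<T> implements Callable<T> {
    private final CompletableFuture<T> result = new CompletableFuture<>();
    private final CompletableFuture<?>[] inputs;
    private final Executor stage;

    Continuation(Executor stage, CompletableFuture<?>... inputs) {
        this.stage = stage;
        this.inputs = inputs;
    }

    CompletableFuture<T> future() { return result; }

    public T invoke() throws Exception {
        for (CompletableFuture<?> f : inputs) {
            if (!f.isDone()) {
                // Slow path: run on the stage when all inputs are ready.
                CompletableFuture.allOf(inputs).thenRunAsync(() -> {
                    try { result.complete(call()); }
                    catch (Exception e) { result.completeExceptionally(e); }
                }, stage);
                return null; // the Stage machinery hands out 'result' instead
            }
        }
        T v = call();        // fast path: no scheduling, no context switch
        result.complete(v);
        return v;
    }
}

class InvokeDemo {
    static String demo() throws Exception {
        CompletableFuture<String> in = CompletableFuture.completedFuture("img");
        Continuation<String> c = new Continuation<String>(Runnable::run, in) {
            public String call() throws Exception { return "rendered:" + in.get(); }
        };
        return c.invoke(); // input already done, so runs inline
    }
}
```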

As Larus and Parkes describe, each thread has a per-thread stack (an
ArrayDeque) for local invocations and a per-thread queue for sending work
to other threads. The stack is drained first, then the queue.
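That drain order could be sketched like so (a toy single-threaded model;
the names are mine, not Larus and Parkes's):

```java
import java.util.*;
import java.util.concurrent.ConcurrentLinkedQueue;

// Sketch of the per-thread dispatch order described above: local
// invocations go on a stack (an ArrayDeque used LIFO), work from other
// threads arrives on a queue, and each scheduling step drains the
// stack before touching the queue.
class WorkerLoop {
    private final ArrayDeque<Runnable> stack = new ArrayDeque<>();
    private final ConcurrentLinkedQueue<Runnable> inbox = new ConcurrentLinkedQueue<>();

    void pushLocal(Runnable r) { stack.push(r); }  // same-thread work
    void send(Runnable r)      { inbox.add(r); }   // cross-thread work

    boolean runOne() {
        Runnable r = stack.poll();       // stack first...
        if (r == null) r = inbox.poll(); // ...then the queue
        if (r == null) return false;
        r.run();
        return true;
    }
}

class LoopDemo {
    static List<String> demo() {
        WorkerLoop w = new WorkerLoop();
        List<String> order = new ArrayList<>();
        w.send(() -> order.add("remote"));
        w.pushLocal(() -> order.add("local1"));
        w.pushLocal(() -> order.add("local2"));
        while (w.runOne()) { }
        return order; // stack is LIFO and drained first: local2, local1, remote
    }
}
```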

Stages can be "partitioned" so that method invocations are dispatched to
specific threads based on some key, as in Larus and Parkes. This can be
used, for example, to eliminate the need for locking partitioned data
structures.
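A rough sketch of such a partitioned stage, under my assumption that each
partition maps to its own single-threaded executor, so per-key state is
confined to one thread and needs no lock:

```java
import java.util.concurrent.*;

// Sketch (my own, not the framework's API): each key always hashes to
// the same single-threaded lane, so state owned by that key is only
// ever touched by one thread.
class PartitionedStage {
    private final ExecutorService[] lanes;

    PartitionedStage(int partitions) {
        lanes = new ExecutorService[partitions];
        for (int i = 0; i < partitions; i++)
            lanes[i] = Executors.newSingleThreadExecutor();
    }

    <T> Future<T> submit(Object key, Callable<T> task) {
        int lane = Math.floorMod(key.hashCode(), lanes.length);
        return lanes[lane].submit(task); // same key -> same thread, always
    }

    void shutdown() { for (ExecutorService e : lanes) e.shutdown(); }
}

class PartitionDemo {
    static int demo() throws Exception {
        PartitionedStage stage = new PartitionedStage(4);
        int[] counter = new int[1];  // unsynchronized: one lane owns it
        Future<?> last = null;
        for (int i = 0; i < 1000; i++)
            last = stage.submit("user42", () -> ++counter[0]);
        last.get();                  // tasks on one lane run in order
        stage.shutdown();
        return counter[0];
    }
}
```

All 1000 increments land on the same lane, so the plain int is safe and
the demo deterministically counts to 1000.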

Just for comparison, here's what the plain ol' context-switching version
might look like:

   class ImageResolver {
       public Future<Image> resolve(final URL url) {
           return exec.submit(new Callable<Image>() {
               public Image call() throws Exception { ... }
           });
       }
       private final ExecutorService exec = ...;
   }

   class ImageRenderer {
       public Future<ImageBuf> renderToBuffer(final URL url) {
           return exec.submit(new Callable<ImageBuf>() {
               public ImageBuf call() throws Exception {
                   Future<Image> futureImage =
                       imageResolver.resolve(url);
                   return reallyRender(futureImage.get());
               }
           });
       }
       private ImageBuf reallyRender(Image image) {...}
       private final ExecutorService exec = ...;
       private final ImageResolver imageResolver = ...;
   }

   // some caller
   Future<ImageBuf> futureBuf =
       imageRenderer.renderToBuffer(url);
   ...
   Graphics g = ...;
   ImageBuf imageBuf = futureBuf.get();
   display(g, imageBuf);


Dawid Kurzyniec wrote:
> Somehow I doubt that it could outperform normal context switching. After
> all, the "continuation switching" is a lot like context switching
> implemented at the user level. Why would it be faster than switching
> performed close to the hardware, where the amount of state is smaller
> and also you can take advantage of kernel-mode, fine-tuned assembly
> language, etc?  Also, one of the performance hits related to context
> switching is caused by cache flushing. You will not avoid it anyway.

You could be right about that, but see if the Larus and Parkes paper makes
you feel differently. I don't have enough experience with this framework
yet to say anything conclusive.

--tim


Jean Morissette wrote:
> Hi all, I would be happy to have your comments on this...
> 
> First, we are building a SEDA-based framework, http://jcyclone.sf.net,
> which is designed to support massive degrees of concurrency.  Our
> framework is a little bit like a graph of executors (Stages) linked by
> queues.
> 
> Actually, we are working on an innovative (or weird) continuation-based
> scheduler that eliminates context switches.  The idea is to instrument
> user-provided classes (event handlers) in such a way that the thread
> context (call stack, etc.) is saved in a continuation object when a
> thread encounters a wait(), and can be restored later.  When notify() is
> called on the monitor, a waiting continuation is put in the ready queue
> of the scheduler.
> 
> So, by creating as many threads as CPUs, there are no context switches
> at all.  However, the cost of saving the thread context can be
> significant if the stack is very deep.  To overcome this, our scheduler
> uses statistics to schedule threads in such a way that "continuation
> switches" occur very rarely.
> 
> We believe that this design will outperform the conventional
> "thread-pool / OS scheduler" design because we can use high-level
> information (stage stats) that the OS scheduler doesn't have.  I'm very
> interested in your opinions about this continuation-style design.
> 
> Thanks, -Jean
