[concurrency-interest] Awaiting a set of tasks on an ExecutorService

Shevek shevek at anarres.org
Fri Aug 18 16:54:00 EDT 2017

I can't use CountDownLatch because of the streamy nature of the source: 
I have no idea up front how many tasks there are going to be. I read an 
incoming stream, it turns out there are about 50 million tasks in it, and 
this is going to go up by the usual orders of magnitude.

I could use a List of CountDownLatches each with ... but then I rapidly 
get into "Someone cleverer than me MUST have solved this before" territory.

I can do my own tricks with an AtomicLong and wait()/notify(), but then 
I have to make sure the master thread calls get() on the relevant Future 
each time a job finishes; otherwise there isn't a happens-before 
relationship with the master thread. So I'd need the job to put its own 
Future onto a Deque, or something...?
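The hand-rolled version might look something like this (a sketch only; 
`TaskTracker` is an illustrative name, not a j.u.c. class). Since both 
methods synchronize on the same monitor, the master thread gets the 
happens-before edge for free and never has to touch a Future:

```java
// Sketch of a hand-rolled completion tracker (hypothetical class).
// synchronized provides the happens-before edge, so the master thread
// never needs to call get() on individual Futures.
final class TaskTracker {
    private long pending = 0;

    synchronized void taskSubmitted() {
        pending++;
    }

    // Each task must call this in a finally block.
    synchronized void taskDone() {
        if (--pending == 0)
            notifyAll();
    }

    // Blocks until every submitted task has reported completion.
    // Caveat: the master must finish submitting before calling this,
    // or a transient pending == 0 will release it early.
    synchronized void awaitAll() throws InterruptedException {
        while (pending > 0)
            wait();
    }
}
```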

The tasks all return Void, but it's nice to collect the first few 
exceptions (currently there are about 750,000 total exceptions thrown).
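Capping the collected exceptions is easy enough to bolt on. A sketch 
(`ErrorCollector` and the limit of 100 are just illustrative; the 
counter reserves a slot first so at most that many are ever retained):

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: retain only the first MAX_ERRORS exceptions, count the rest.
final class ErrorCollector {
    private static final int MAX_ERRORS = 100;
    private final Queue<Throwable> errors = new ConcurrentLinkedQueue<>();
    private final AtomicInteger total = new AtomicInteger();

    void record(Throwable t) {
        // Reserve a slot first so at most MAX_ERRORS are ever kept.
        if (total.incrementAndGet() <= MAX_ERRORS)
            errors.add(t);
    }

    // Wrap each task so its failure is recorded instead of lost.
    Runnable decorate(Runnable task) {
        return () -> {
            try { task.run(); } catch (Throwable t) { record(t); }
        };
    }

    // Roll everything up into one exception with suppressed causes.
    RuntimeException summarize() {
        RuntimeException e = new RuntimeException(total.get() + " task(s) failed");
        errors.forEach(e::addSuppressed);
        return e;
    }
}
```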


On 08/18/2017 01:48 PM, Josh Humphries wrote:
> I think the easiest thing would be to decorate each task to call 
> "latch.countDown()" on a CountDownLatch that is initialized with the 
> total number of tasks. After they are all submitted, the code that wants 
> to wait for them to finish would simply await the latch. This decouples 
> completion of all tasks from the actual ExecutorService that is running 
> them, so you can share the same ExecutorService for multiple, even 
> overlapping, sets of tasks.
> ----
> *Josh Humphries*
> jhump at bluegosling.com <mailto:jhump at bluegosling.com>
> On Fri, Aug 18, 2017 at 4:42 PM, Shevek <shevek at anarres.org 
> <mailto:shevek at anarres.org>> wrote:
>     Hi,
>     I need to build a complicated object on a multiprocessor system. The
>     target object uses synchronization to guarantee thread-safe access.
>     However, the individual tasks which build elements of the target
>     object are expensive, so I farm them out to an ExecutorService. The
>     question is:
>     What is the recommended way to submit a stream of tasks to an
>     ExecutorService, and then, at some point, wait until they're all done?
>     * I don't know the list of tasks up front. I read them iteratively
>     from a set of files or events.
>     * A big collection of Futures isn't realistic; there are just too
>     many tasks.
>     * Reaping completed futures on submission means that I end up with
>     exceptions in weird places; I really need to gather max 100
>     suppressed exceptions, and discard the rest.
>     * ForkJoinPool has invoke() but I think for a million-task job, I
>     still end up with a huge list of futures. This also assumes I ignore
>     the note about synchronization.
>     * CompletionService allows me to wait for _any_ of the submitted
>     tasks, but not for _all_ of them.
>     * Bonus points for sharing a single ExecutorService and having
>     "sets" of tasks which can be independently awaited. This starts to
>     sound very like FJP, except for the blocking/I/O and the stream-ness
>     of my task set.
>     Mostly, dear list, I'm expecting a one-liner from some API class
>     that I've missed because this is NOT an original request, I just
>     can't find a standard solution.
>     Thank you.
>     S.
>     _______________________________________________
>     Concurrency-interest mailing list
>     Concurrency-interest at cs.oswego.edu
>     <mailto:Concurrency-interest at cs.oswego.edu>
>     http://cs.oswego.edu/mailman/listinfo/concurrency-interest
>     <http://cs.oswego.edu/mailman/listinfo/concurrency-interest>
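For what it's worth, java.util.concurrent.Phaser comes closest to the 
one-liner I was hoping for: registration is dynamic, so the task count 
need not be known up front, and separate Phasers give independently 
awaitable "sets" of tasks over one shared ExecutorService. A sketch 
(caveat: Phaser caps registered parties at 65535, so tens of millions 
of simultaneously outstanding tasks would need tiered Phasers or 
submission backpressure):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Phaser;

public class PhaserDemo {
    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        Phaser phaser = new Phaser(1); // party 1 is the master thread

        for (int i = 0; i < 1000; i++) { // simulate a stream of unknown length
            phaser.register();           // one party per task, added dynamically
            pool.submit(() -> {
                try {
                    // ... expensive work ...
                } finally {
                    phaser.arriveAndDeregister();
                }
            });
        }

        // Master arrives too; returns once every registered task has arrived.
        phaser.arriveAndAwaitAdvance();
        pool.shutdown();
    }
}
```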

More information about the Concurrency-interest mailing list