[concurrency-interest] CompletableFuture in Java 8

Josh Humphries jh at squareup.com
Fri Dec 5 11:07:32 EST 2014


On Fri, Dec 5, 2014 at 8:41 AM, √iktor Ҡlang <viktor.klang at gmail.com> wrote:

> Hey Josh,
>
> On Fri, Dec 5, 2014 at 5:30 AM, Josh Humphries <jh at squareup.com> wrote:
> >
> > Hey, Viktor. I think I've touched on some of this already. But, since
> you said you're very much interested, I'll elaborate on my thinking.
> >
>
> Thanks for taking the time and spending the effort to elaborate, Josh, I
> really appreciate it!
>
> Rereading my reply I notice that we have strayed a bit from the initial
> discussion, but it is an interesting topic, so I'll share my thoughts on
> it.
>
> TL;DR: I think we both agree but have different cutoff points. :)
>

Indeed. I think that sums up the difference pretty well :)


> > Every decision is a trade-off. Mixing concerns can bloat the API and
> increase the cognitive burden of using it, but it can also provide greater
> functionality or make certain patterns of use easier. While the two
> concerns we're discussing may seem similar, they are very different (at
> least to me) regarding what they are actually providing to the developer,
> so the trade-offs are different.
>
> Agreed. My stance is to err on the side of the Single Responsibility
> Principle: it is easier to add API later, if defensible, than to deprecate
> and remove (removal has never happened in the JDK AFAICT).
>
> >
> > Concern #1: Exposing methods to imperatively complete the future vs.
> having the future's value be provided implicitly (by the running of some
> unit of logic). We're not really talking about mixing the two here. My
> objection was that CompletionStage#toCompletableFuture leaks the imperative
> style in a way that is simply inappropriate.
>
> I think both Doug(?) and I agree here; the problem is that there's no
> protected scope for interface methods, so CompletableFuture would have to
> wrap every CompletionStage that isn't a CompletableFuture, leading to a lot
> of allocations for the worst case. Doug would be able to share more about
> that.
>

But CompletableFuture is a class, not an interface. So if the
CompletionStage is not a CompletableFuture, you're still back to having to
wrap the CompletionStage. You've just moved the responsibility out of
CompletableFuture and into every other implementation of CompletionStage,
which seems like an unusual choice.
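
To make that concrete: every CompletionStage implementation that isn't a
CompletableFuture ends up writing roughly the following (a minimal sketch,
not the JDK's code; the #whenComplete bridge is the part the interface
forces on you):

    // Sketch of what a custom CompletionStage<T> implementation must
    // provide; `this` is the stage being adapted.
    @Override
    public CompletableFuture<T> toCompletableFuture() {
        CompletableFuture<T> result = new CompletableFuture<>();
        this.whenComplete((value, failure) -> {
            if (failure != null) result.completeExceptionally(failure);
            else result.complete(value);
        });
        return result;
    }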

>
> > So my objection here is about poor encapsulation/abstraction. If the API
> had returned FutureTask, that too would have been bad. (I also griped about
> the lack of a FutureTask-like implementation of CompletionStage, but that
> is really a nit; not a major complaint.)
>
> Personally I don't mind here; it's beyond trivial to "submit" a
> CompletableFuture. But YMMV.
>
> And with CompletableFuture you have a choice of exposing it as a
> CompletableFuture, a CompletionStage, or a Future, depending on what
> capabilities you want to expose, which does sound quite flexible?
>

Sure. I had prefaced my whole original rant with the fact that these are
nits and acknowledged that everything that is needed is there. In fact,
I've written everything I need on top of the existing APIs (but found it
annoying that my implementation of CompletionStage has a method that
requires wrapping the stage in a CompletableFuture).

So my points are really about the aesthetics of the API (which, admittedly,
are often subjective).
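
And to be fair, "submitting" one really is trivial (a sketch; computeValue
and executor here are placeholders, not anything from the thread):

    // Factory form, using the method that ships with the class:
    CompletableFuture<String> cf =
        CompletableFuture.supplyAsync(() -> computeValue(), executor);

    // Imperative form, completing the future by hand:
    CompletableFuture<String> promise = new CompletableFuture<>();
    executor.execute(() -> {
        try {
            promise.complete(computeValue());
        } catch (Throwable t) {
            promise.completeExceptionally(t);
        }
    });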

> >
> > As far as inter-op with legacy APIs, a #toFuture() method would have
> been much better for a few reasons:
>
> I think again that the toCompletableFuture, as far as I can see, was
> primarily needed for CompletableFuture.
>
> >
> > Future is an interface, so a *view* could be returned instead of having
> to create a new stateful object that must be kept in sync with the
> original.
> >
> > Future provides inter-op but doesn't leak complete*/obtrude* methods
> (the heart of my objections)
> > It could have been trivially implemented as a default method that just
> returns a CompletableFuture that is set from a #whenComplete stage, just as
> you've described.
> > (I'm pretty certain we agree on #1. At least most of it.)
>
> I would have much preferred to have a static method on Future called
> "fromCompletionStage" so that CompletionStages do not need to know about
> "the world". :-)
>
That, too, is completely reasonable :)
I think the discoverability point has already been brought up. But that's
what @see tags in Javadoc are for, right?
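
For completeness, a sketch of what that static method could look like (the
name fromCompletionStage is your proposal, not anything in the JDK; it's the
same #whenComplete bridge as above, except the caller only ever sees the
Future interface, so complete*/obtrude* never leak):

    public static <T> Future<T> fromCompletionStage(CompletionStage<T> stage) {
        CompletableFuture<T> future = new CompletableFuture<>();
        stage.whenComplete((value, failure) -> {
            if (failure != null) future.completeExceptionally(failure);
            else future.complete(value);
        });
        return future;  // typed as Future, so no completion methods exposed
    }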


> >
> > You've already clearly expressed the opinion that blocking code is never
> appropriate. I think that's a reasonable assertion for many contexts, just
> not the JRE. Avoiding blocking altogether in core Java APIs is neither
> realistic nor (again, IMO) desirable.
>
> Would you mind expanding on "just not the JRE"?
> My view is that java.util.concurrent is about tools to facilitate
> concurrent programming mainly targeted towards advanced users and library
> writers.
> Perhaps it is here we have different views?
>
I'll back-pedal a little and qualify "not in the JRE" with "not in certain
parts of the JRE".

Many of the APIs in that package really *are* for advanced users and
library writers, just like you said. But ExecutorService,
ScheduledExecutorService, and Future are really basic and are the kind of
building blocks that even developers writing business logic will need/want
to use. There are plenty of cases where it makes sense to provide
purpose-built libraries that wrap them. But there are plenty that don't
deserve that treatment, too. So, for the latter, I think these APIs in
particular need to be approachable and flexible.

You've already mentioned that you'd like to see ExecutorService retired. It
definitely has plenty of sharp corners. But I'll reserve judgment on
whether it should be retired until I see its replacement (sorry, FJP,
you're not it.)

> >
> > There is a spectrum. On one end (let's call it "simple"), you want a
> programming model that makes it easier to write correct code and that is
> easy to read, write, understand, and troubleshoot
> >
> > (at the extreme: all synchronous, all blocking flows -- very simple to
> understand but will often have poor performance and is incapable of taking
> advantage of today's multi-core computers). On the other end
> ("performance"), you want a programming model that enables taking maximum
> advantage of hardware, provides greater efficiency, and facilitates better
> performance (greater throughput, lower latency).
>
> I think you may be conflating "simple" and "easy":
> http://www.infoq.com/presentations/Simple-Made-Easy
>
> To me, personally, it is mostly about performance, because that's what I
> need. But for my users, it is important that one can reason about how the
> code will behave.
>
> I'll argue that async monadic-style programming is -simpler- than the
> blocking equivalent. Yes, it may sound extremely weird at first, but
> hear me out:
>
I agree that this style makes certain things simpler -- *and* easier :)

That's why I like CompletionStage and wrote an adapter for users that want
to take advantage of this style with our (still on Java 7) frameworks.

(It's also why we use ListenableFuture exclusively, never plain ol'
j.u.c.Future.)


> Let's take these two sections of code:
>
> def addSync(f1: j.u.c.Future[Int], f2: j.u.c.Future[Int]): Int = f1.get()
> + f2.get()
>
> Questions:
> 1) When is it safe to call `addSync`?
> 2) How do I, as the caller of `addSync` know when it is safe to call
> `addSync`?
> 3) Will my program be able to run to completion if I call `addSync`?
> 4) How much code do I need to change if `addSync` causes performance or
> liveness problems?
>
> def addAsync(f1: AsyncFuture[Int], f2: AsyncFuture[Int])(implicit e:
> Executor): AsyncFuture[Int] = f1 zip f2 map {_ + _}
>
> Questions:
> 1) When is it safe to call `addAsync`?
> 2) How do I, as the caller of `addAsync` know when it is safe to call
> `addAsync`?
> 3) Will my program be able to run to completion if I call `addAsync`?
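
(For readers following along in Java 8 terms, addAsync maps directly onto
CompletionStage; a rough equivalent, with names chosen just for
illustration:

    static CompletionStage<Integer> addAsync(CompletionStage<Integer> f1,
                                             CompletionStage<Integer> f2,
                                             Executor executor) {
        // Combine the two results without any thread ever blocking.
        return f1.thenCombineAsync(f2, Integer::sum, executor);
    }

The questions above get much easier answers for this version.)
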
>
> In my experience (as a contributor to Akka for 5 years, and the co-author
> of Futures & Promises for Scala) with adding blocking APIs (Akka Futures
> had blocking APIs -on- the Future itself; Scala has them externally on a
> utility called Await, which fortunately employs managed blocking to try to
> reduce the risk of liveness problems at the expense of performance), I can
> safely say that most people will fall back on what they know if that is
> easier (less effort) than learning something new. There's nothing -wrong-
> with that, it's just human nature! However, knowing that, we must take it
> into consideration and make it easier (less of an effort) to learn new
> things, especially if it leads to better programs (maintainability,
> performance, etc.).
>
> When the blocking methods were built into the Future itself (in Akka
> originally), it was one of the biggest sources of problems reported
> (related to Futures).
> When the blocking methods were externalized (in scala.concurrent), blocking
> remained one of the biggest sources of problems reported (related to
> Futures).
>
> Again, this is just my experience on the topic, so YMMV!
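
(A side note for readers unfamiliar with managed blocking: the JDK's hook
for it is ForkJoinPool.ManagedBlocker, which lets the pool compensate with
a spare thread while one is parked. A minimal sketch, assuming you're stuck
blocking on a plain j.u.c.Future from inside a ForkJoinPool; the method
name awaitManaged is just for illustration:

    static <T> T awaitManaged(Future<T> f)
            throws InterruptedException, ExecutionException {
        ForkJoinPool.managedBlock(new ForkJoinPool.ManagedBlocker() {
            @Override public boolean block() throws InterruptedException {
                try {
                    if (!f.isDone()) f.get();  // park until completion
                } catch (ExecutionException e) {
                    // outcome is re-read below; here we only wait
                }
                return true;
            }
            @Override public boolean isReleasable() { return f.isDone(); }
        });
        return f.get();  // already done: returns the value or rethrows
    }

This trades performance for liveness, exactly as described above.)
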
>
> >
> >
> > If we're being pragmatic, economics is the real decider for where on the
> spectrum the right balance lies. On the simple side, there's an
> advantage to spending less in engineer-hours: most developers are more
> productive writing simple synchronous code, and that style of code is much
> easier to debug. But this can incur greater capital costs since it may
> require more hardware to do the same job. On the performance side, it's the
> converse: get more out of the money spent on compute resources, but
> potentially spend more in engineering effort. (There are obviously other
> constraints, too, like whether a particular piece of software has any value
> at all if it can't meet certain performance requirements.)
>
> I understand and can sympathize with this view. But I think that it is
> more complex than that: it is essentially trading quick short-term gain
> for long-term loss. (Debugging deadlocks and concurrency issues more often
> than not costs more, in developer time and in disruption to production
> systems, than was gained in initially writing the code.)
>
> "It is easier to serve people desserts than greens, the question is what
> is healthier." :)
>
>
I agree with most of this. I'm aware of the liveness pitfalls with
blocking, but the frameworks our app developers use and the places where we
lean on ListenableFuture don't really encounter these scenarios. There are
other potential hazards for sure. But, for us, it's been worth it. We don't
ever have app developers running into concurrency/liveness problems that
arise from blocking.

Though *I* occasionally have fun debugging that kind of stuff, in the
bowels of frameworks :)

It's why I try to use non-blocking concurrency techniques wherever
possible. (But I work at a lower level than I expect my users / app
developers to work.) For practical reasons, there are plenty of places
where simple blocking techniques (like synchronized blocks) were used
because they sufficed. (And I get to go debug and refactor them if we get
to a point where they no longer do.)

> >
> >
> > My experience is that most organizations find their maxima close to the
> middle, but nearer to the simple side. So there is an economic advantage
> for them to focus a little more on developer productivity than on software
> efficiency and performance.
>
> For the user, I worry more about correctness and liveness than
> performance. The performance concern is mine as a library writer (as the
> users can't opt out of bad library performance).
>
>
Agree completely. Except I haven't encountered the same level of problems
with deadlock arising from blocking code, so maybe that's why we're split on
that point. Perhaps there are other aspects of architectures I've worked
with that have limited my exposure to it, or maybe my time just hasn't come
yet :)

>
> >
> > I want my users to be as productive as possible, even if it means they
> write blocking code. (And, let's face it, some of them will commit
> atrocities far worse than just using a blocking API.)
>
> I understand this line of reasoning but it always has to be qualified.
> For example, the allure of RoR was that of high initial productivity, but
> what was sacrificed was performance, maintainability, and scalability.
> So we need to not only consider short term "gains" but also long term
> "losses".
>
Absolutely. But I feel I need to be careful about limiting the "easy"
choices just because they lead to technical debt, because taking on that
debt is often totally appropriate (as long as you understand the trade-offs
and acknowledge when you're accumulating it).

RoR is a great example. Many companies today probably wouldn't exist if it
weren't for RoR. Many have had to undergo major architectural change to
later achieve those three things (performance, scalability, and
maintainability). But for plenty (including Square), it was absolutely the
right decision in order to get a product out quickly and at low cost. That
velocity, even if it means future technical pain, can be critical for a
young business in order to get early feedback and validate their business
model as well as start generating revenue.