[concurrency-interest] spinLoopHint() JEP draft discussion

Hans Boehm boehm at acm.org
Tue Oct 13 21:38:56 EDT 2015


It seems to me that the trick here is to be explicit as to what is
intended.  Presumably this is intended to discourage speculative execution
across a spinLoopHint().  It is not intended to, for example, put the
processor into some sort of sleep state for a while, though that might also
make sense under slightly different circumstances.

I would emphasize that this is expected not to increase latency.  It might
happen to reduce power consumption, but a power-reducing,
latency-increasing implementation is not expected.
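
As a concrete reference point for the intent described above, here is a minimal sketch of a spin-wait loop that issues the hint on every iteration. It is not taken from the draft JEP. It assumes JDK 9 or later, where the hint is available as Thread.onSpinWait(); the thread itself uses the draft name spinLoopHint(). The class and method names (SpinWaitSketch, awaitReady, signal) are hypothetical, chosen only for illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class SpinWaitSketch {
    // Flag the waiter spins on; AtomicBoolean provides cross-thread visibility.
    private final AtomicBoolean ready = new AtomicBoolean(false);

    /** Busy-waits until signal() has been called; returns the number of spins. */
    public long awaitReady() {
        long spins = 0;
        while (!ready.get()) {
            // Pure hint: the JVM may emit something like PAUSE on x86,
            // or nothing at all. It never blocks or yields the thread.
            Thread.onSpinWait();
            spins++;
        }
        return spins;
    }

    public void signal() {
        ready.set(true);
    }

    public static void main(String[] args) throws InterruptedException {
        SpinWaitSketch s = new SpinWaitSketch();
        Thread waiter = new Thread(() ->
            System.out.println("spun " + s.awaitReady() + " times"));
        waiter.start();
        Thread.sleep(10);   // let the waiter spin briefly
        s.signal();
        waiter.join();
    }
}
```

The loop stays correct if the hint is a no-op, which matches the expectation that an implementation may reduce power use but should not add latency.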

On Sat, Oct 10, 2015 at 8:41 AM, Gil Tene <gil at azul.com> wrote:

>
> On Oct 8, 2015, at 10:50 AM, Hans Boehm <boehm at acm.org> wrote:
>
> My question about spinLoopHint() would be whether it can be defined in a
> way that it makes it useful across architectures.  I vaguely remember
> seeing claims that even the x86 instructions are not implemented
> consistently enough to be easily usable in portable code.
>
>
> The PAUSE instruction on x86 has been around and used consistently since
> Pentium 4s. And pretty much anything spinning (including the JVM's own C++
> spinning code) uses it across all x86 architectures. (It encodes in a way
> that makes it a NOP for pre-Pentium 4 x86, so it's harmless at worst).
>
>   I have no idea (though I probably should) about ARM equivalents or the
> like.
>
>
> It does not seem to be common practice to use a pure spin loop hinting
> instruction on ARM in spin loops. On ARMv8 (64 bit) spinning uses WFE/SEVL
> instructions, which do more than hint. They actually watch a specific
> memory location for change. See discussion in several e-mails on the thread
> with the same subject on OpenJDK core-libs-dev archives about that.
>
> It also seems to me that unbounded spin loops are almost always a bad idea.
>
>
> The hidden OS guy in me always feels that way. But in today's many-core
> world it is hard to argue with the many practical uses of dedicated and
> unbounded user-mode spinning. From kernel bypass networking stacks to
> messaging stacks to trading applications, it is VERY common to find a
> server continually spinning on a handful of cores these days. And it
> provides metric benefits to the applications that do so. These include many
> applications written in (and doing their spinning logic in) Java.
>
> (If you've been spinning for 10 seconds, you should be sleeping instead.
>
>
> Not if what you care about is the reaction time to the next message. Many
> applications care about latency (sometimes down to the sub-usec levels)
> even when messages only come in at 100/sec. And unbounded spinning improves
> latency across the board (not just the long tails, but even the medium) for
> such use cases.
>
> You might even be inadvertently scheduled against the thread you're
> waiting for.
>
>
> That's what is always dangerous about user-mode spinning (even the bounded
> kind). But there are many practical ways to prevent this from happening (or
> prevent it "enough") on modern many-core machines. Just keeping your active
> thread counts well below your vcore count is a pretty simple way to start
> for this, and with a modern 2 socket x86 server having anywhere from 24 to
> 72 vcores these days, that's a pretty practical thing to do. The true
> latency sensitive folks out there will do a lot to control which cores they
> spin on, and who might interfere with those cores (e.g. see this detailed
> Strange Loop presentation by Mark Price from LMAX:
> https://www.youtube.com/watch?v=-6nrhSdu--s (discussion of core-affinity
> controls starts around 16:00 in the video). LMAX do a lot of spinning in
> Java…).
>
>   Since you're waiting anyway, you might as well keep track of how long
> you've been spinning.)  But the idea here would be that this is the
> low-level primitive you use if you haven't been spinning for very long?
>
>
> A spinLoopHint() is useful for both short spinning (spinning for a while
> before giving up and blocking) and indefinite spinning, and both cases
> will benefit from it.
>
>   The alternative is to pass in some indication of how long you've been
> spinning, and have this yield, or sleep, after a sufficiently long time.
>
>
> I don't see much urgency for adding convenience wrappers, as this logic is
> doable without adding any Java SE APIs. In fact, it is common to see this in
> code that performs some sort of indefinite spinning logic.
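
The spin-then-yield-then-sleep logic mentioned here can be written in user code roughly as follows. This is a sketch only, assuming JDK 9 or later for Thread.onSpinWait() (the thread's spinLoopHint()). The class name BackoffSpinSketch, the method awaitWithBackoff, and all three thresholds are hypothetical and untuned.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.LockSupport;
import java.util.function.BooleanSupplier;

public class BackoffSpinSketch {
    // Hypothetical thresholds, chosen only for illustration.
    static final int SPIN_LIMIT = 1_000;    // iterations of pure spinning
    static final int YIELD_LIMIT = 1_100;   // then yield up to this count
    static final long PARK_NANOS = 50_000L; // then park in short slices

    /**
     * Waits until the condition holds, escalating from spinning with the
     * hint, to Thread.yield(), to timed parking. Returns the number of
     * failed checks before the condition became true.
     */
    public static long awaitWithBackoff(BooleanSupplier condition) {
        long iterations = 0;
        while (!condition.getAsBoolean()) {
            if (iterations < SPIN_LIMIT) {
                Thread.onSpinWait();               // cheap, stays on the core
            } else if (iterations < YIELD_LIMIT) {
                Thread.yield();                    // lets other runnable threads in
            } else {
                LockSupport.parkNanos(PARK_NANOS); // may return early; loop rechecks
            }
            iterations++;
        }
        return iterations;
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicBoolean done = new AtomicBoolean(false);
        Thread waiter = new Thread(() ->
            System.out.println("iterations: " + awaitWithBackoff(done::get)));
        waiter.start();
        Thread.sleep(20);
        done.set(true);
        waiter.join();
    }
}
```

An indefinite spinner would simply never leave the first branch; the escalation policy lives entirely in the caller, which is why no wrapper API is strictly required.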
>
> spinLoopHint() is needed because it provides a currently missing feature.
> Without it there is (currently) no way for Java spinning logic to make use
> of important hardware capabilities that improve execution metrics (latency,
> power consumption, and overall program throughput). Those capabilities are
> in near-universal use outside of Java for good reason, and Java just lacks
> a way to indicate the need or intent in a practical way (a JNI call or a
> yield() is not practical due to the dramatic relative cost difference)...
>
>
> Hans
>
> On Tue, Oct 6, 2015 at 6:41 PM, Gil Tene <gil at azulsystems.com> wrote:
>
>> When comparing spinLoopHint() to Thread.yield(), we're talking about
>> different orders of magnitude, and different motivations.
>>
>> On the motivation side: A major reason for using spinLoopHint() is to
>> improve the reaction time of a spinning thread (from the time the event it
>> is spinning for actually occurs until it actually reacts to it). Power
>> savings is another benefit. Thread.yield() doesn't help with either.
>>
>> On the orders of magnitude side: Thread.yield involves making a system
>> call. This makes it literally 10x+ longer to react than spinning without
>> it, and certainly pulls in the opposite direction of spinLoopHint().
>>
>>
>> On Oct 6, 2015, at 1:15 PM, Nathan Reynolds <nathan.reynolds at oracle.com>
>> wrote:
>>
>> I am not fully up to speed on this topic.  However, why not call
>> Thread.yield()?  If there are no other threads waiting to get on the
>> processor, then Thread.yield() does nothing.  The current thread keeps
>> executing.  If there are threads waiting to get on the processor, then
>> current thread goes to the end of the run queue and another thread gets on
>> the processor (i.e. a context switch).  The thread will run again after the
>> other threads ahead of it either block, call yield() or use up their time
>> slice.  The only time Thread.yield() will do anything is if *all* of the
>> processors are busy (i.e. 100% CPU utilization for the machine).  You could
>> run 1000s of threads in tight Thread.yield() loops and all of the threads
>> will take a turn to go around the loop one time and then go to the end of
>> the run queue.
>>
>> I've tested this on Windows and Linux (Intel 64-bit processors).
>>
>> Some people are very afraid of context switches.  They think that context
>> switches are expensive.  This was true of very old Linux kernels.  Nowadays,
>> it costs 100s of nanoseconds to do a context switch.  Of course, the
>> cache may need to be reloaded with the data relevant for the running thread.
>>
>> -Nathan
>>
>> On 10/6/2015 11:56 AM, Gil Tene wrote:
>>
>> A variant of synchronic for j.u.c would certainly be cool to have.
>> Especially if it supports a hint that makes it actually spin forever rather
>> than block (this may be what expect_urgent means, or maybe a dedicated spin
>> level is needed). An implementation could use spinLoopHint() under the
>> hood, or other things where appropriate (e.g. if MWAIT was usefully
>> available in user mode in some future, and had a way to limit the wait
>> time).
>>
>> However, an abstraction like synchronic is a bit higher level than
>> spinLoopHint(). One of the main drivers for spinLoopHint() is direct-use
>> cases by programs and libraries outside of the core JDK. E.g. spinning
>> indefinitely (or for limited periods) on dedicated vcores is a common
>> practice in high performance messaging and communications stacks, and is not
>> unreasonable on today's many-core systems. E.g. seeing 4-8 threads "pinned"
>> with spinning loops is commonplace in trading applications, in kernel
>> bypass network stacks, and in low latency messaging. And the conditions for
>> spins are often more complicated than those expressible by synchronic (e.g.
>> watching multiple addresses in a mux'ed spin). I'm sure a higher level
>> abstraction for a spin wait can be enriched enough to come close, but there
>> are many current use cases that aren't covered by any currently proposed
>> abstraction.
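
To illustrate the "mux'ed spin" case mentioned above, here is a sketch of one thread spinning on several locations at once. It assumes JDK 9 or later for Thread.onSpinWait() (the thread's spinLoopHint()), and it assumes each source is a counter that a producer bumps when it publishes. MuxSpinSketch and awaitAny are hypothetical names, not part of any proposed API.

```java
import java.util.concurrent.atomic.AtomicLong;

public class MuxSpinSketch {
    /**
     * Spins until any source counter differs from the value last seen for
     * it, records the new value in lastSeen, and returns the source index.
     * Lower indices are checked first on every pass.
     */
    public static int awaitAny(AtomicLong[] sources, long[] lastSeen) {
        while (true) {
            for (int i = 0; i < sources.length; i++) {
                long now = sources[i].get();
                if (now != lastSeen[i]) {
                    lastSeen[i] = now;
                    return i;
                }
            }
            // One hint per full pass over all watched locations.
            Thread.onSpinWait();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicLong[] sources = { new AtomicLong(), new AtomicLong() };
        long[] lastSeen = new long[sources.length];
        Thread waiter = new Thread(() ->
            System.out.println("source " + awaitAny(sources, lastSeen) + " changed"));
        waiter.start();
        Thread.sleep(10);
        sources[1].incrementAndGet();   // publish on the second source
        waiter.join();
    }
}
```

A wait primitive tied to a single address cannot express this loop directly, which is the argument for exposing the raw hint alongside any higher-level abstraction.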
>>
>> So, I like the idea of an abstraction that would allow uncomplicated
>> spin-wait use, but I also think that direct access to spinLoopHint() is
>> very much needed. They don't contradict each other.
>>
>> — Gil.
>>
>> On Oct 6, 2015, at 9:49 AM, Hans Boehm <boehm at acm.org> wrote:
>>
>> If you haven't seen it, you may also be interested in
>>
>> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/p0126r0.pdf
>>
>> which seems to be a very different perspective on roughly the same space.
>>
>> On Tue, Oct 6, 2015 at 8:11 AM, Gil Tene <gil at azulsystems.com> wrote:
>>
>>> I posted a draft JEP about adding spinLoopHint() for discussion on
>>> core-libs-dev and hotspot-dev. May be of interest to this group. The main
>>> focus is supporting outside-of-the-JDK spinning needs (for which there are
>>> multiple eager users), but it could/may be useful under the hood in j.u.c.
>>>
>>>
>>> http://mail.openjdk.java.net/pipermail/core-libs-dev/2015-October/035613.html
>>>
>>> See draft JEP, tests, and links to prototype JDKs to play with here:
>>> https://github.com/giltene/GilExamples/tree/master/SpinHintTest
>>>
>>> — Gil.
>>>
>>> _______________________________________________
>>> Concurrency-interest mailing list
>>> Concurrency-interest at cs.oswego.edu
>>> http://cs.oswego.edu/mailman/listinfo/concurrency-interest
>>>
>>>
>>
>>
>>
>
>