[concurrency-interest] spinLoopHint() JEP draft discussion

Oleksandr Otenko oleksandr.otenko at oracle.com
Mon Oct 12 08:04:55 EDT 2015

On 10/10/2015 16:54, Gil Tene wrote:
>> On Oct 9, 2015, at 3:33 PM, Oleksandr Otenko 
>> <oleksandr.otenko at oracle.com <mailto:oleksandr.otenko at oracle.com>> wrote:
>> Variable X transitions from value A to value B over time t.
> For applications that care about latency, the question would be 
> phrased as "Variable X transitions from value A to value B *at* time t."

ok, I'll rephrase my statement: "Variable X transitions from value A to 
value B over/during time dt." :-)

It doesn't matter what the absolute value of t is. But if you observe 
value is not B, you are going to wait up to dt units of time - owing to 
the nature of the transition from A to B. Then in this world there is 
also some overhead from observing it is now B. Saying "make that 
overhead as small as possible" is not accurate. Saying "make that 
overhead less than 100 nanoseconds" is too strict - why would you care 
whether it is 100 nanoseconds, if dt is 10 milliseconds.

Granted, there will be cases where you'd justify the "100 nanosecond" 
overhead even if dt is "10 ms", hence my remark that it really depends 
on what the cost function is, but the main consumer of concurrency 
primitives will want to relax the overhead to be some function of dt - 
since the average wait time is already a function of dt.

>> What is the /expected/ reaction time of a spinning thread?
>> The answer is - it really depends on your cost model.
>> If you are waiting for X to become B, you may be waiting for up to t 
>> units of time. What difference would it make in your cost model, if 
>> instead it waited for N% of t more? When N% of t becomes larger than 
>> time to switch context, you yield.
> Imagine a person working at the supermarket checkout counter that says 
> "I've been standing here for the past 30 minutes and no customer has 
> come to my line, so what difference would it make if I step away for 5 
> minutes for a smoke?".

If we continue this analogy long enough, they do leave the checkout (I 
doubt it is for a smoke - more like to stack shelves), and the peers 
press the button to summon them back when getting congested.

You may be able to optimize the levels of adrenaline in the customer's 
bloodstream, if they see the cashier race to the checkout (instead of 
leisurely walk).

> How long you've been waiting for X to become B has nothing to do with 
> the reaction time you are allowed to have when it does.

A less important point - let's define reaction time.

At the moment I am looking at it like so: we can't measure the time 
between events in two different threads, so we have a third timeline 
observing events in both threads. But there is no "reaction time" on it:

|   |   |
X   A   |
|   |   XA+
|   |   |
|   B   |
|   |   B+
|   |   |
B-  |   |
|   |   |
Y   |   |
|   |   Y+
|   |   |

Suppose for simplicity that X "started the wait to make progress" and A 
"started transition" are observed simultaneously at the point XA+. 
Suppose B "finished transition" is observed at B+. Suppose Y "responded 
to transition to B" is observed at point Y+. Suppose Y+ also tells the 
observer the time dy between B- "noticed B" and Y. Here "concurrency 
overhead" is /perceived/ as (Y+)-dy-(B+). It is independent of XA+ and 
Y+, but whether it makes sense to reduce it really depends on the 
magnitude of Y-X, an estimate of which is (Y+)-(XA+), and on the cost 
function or SLA.

You might say that "concurrency overhead" is the "reaction time", but it 
really is /two/ or more "reaction times", even if you make the thread 
transitioning A to B the observer thread, instead of having a separate 
observer thread.

A more important point - reducing the overheads makes sense when they 
constitute an important part of the overall time.

Maybe you are promoting the "the wait time is so expensive" case. Adding 
support for that is a good thing. But most cases would want some back 
off according to cumulative wait time.


>> But this is a selfish model (my wait is more important than letting 
>> the others use the CPU).
> Selfishness is in the eye of the beholder. From a reaction time point 
> of view, yielding would be the selfish thing (like going out for a 
> smoke in the middle of your shift). Applications that are measured by 
> their reaction time behavior (which is true for most applications) can 
> usually justify the computer resources they own/rent/use. They not 
> there to "share with others", they are there to do a job and do it well.
> And while spinning can certainly be used for doing stupid and wasteful 
> things for no good reason, the same can be said about linked lists. 
> Applications that spin cpus instead of blocking often have great 
> reasons for doing so.
>> Alex
>> On 07/10/2015 02:41, Gil Tene wrote:
>>> When comparing spinLoopHint() to Thread.yield(), we're talking about 
>>> different orders of magnitude, and different motivations.
>>> On the motivation side: A major reason for using spinLoopHint() is 
>>> to improve the reaction time of a spinning thread (from the time the 
>>> event it is spinning for actually occurs until it actually reacts to 
>>> it). Power savings is a another benefit. Thread.yield() doesn't help 
>>> with either.
>>> On the orders of magnitude side: Thread.yield involves making a 
>>> system call. This makes it literally 10x+ longer to react than 
>>> spinning without it, and certainly pulls in the opposite direction 
>>> of spinLoopHint().
>>>> On Oct 6, 2015, at 1:15 PM, Nathan Reynolds 
>>>> <nathan.reynolds at oracle.com <mailto:nathan.reynolds at oracle.com>> wrote:
>>>> I am not fully up to speed on this topic. However, why not call 
>>>> Thread.yield()?  If there are no other threads waiting to get on 
>>>> the processor, then Thread.yield() does nothing.  The current 
>>>> thread keeps executing.  If there are threads waiting to get on the 
>>>> processor, then current thread goes to the end of the run queue and 
>>>> another thread gets on the processor (i.e. a context switch).  The 
>>>> thread will run again after the other threads ahead of it either 
>>>> block, call yield() or use up their time slice. The only time 
>>>> Thread.yield() will do anything is if *all* of the processors are 
>>>> busy (i.e. 100% CPU utilization for the machine).  You could run 
>>>> 1000s of threads in tight Thread.yield() loops and all of the 
>>>> threads will take a turn to go around the loop one time and then go 
>>>> to the end of the run queue.
>>>> I've tested this on Windows and Linux (Intel 64-bit processors).
>>>> Some people are very afraid of context switches.  They think that 
>>>> context switches are expensive.  This was true of very old Linux 
>>>> kernels.  Now a days, it costs 100s of nanoseconds to do a context 
>>>> switch.  Of course, the cache may need to be reloaded with the data 
>>>> relevant for the running thread.
>>>> -Nathan
>>>> On 10/6/2015 11:56 AM, Gil Tene wrote:
>>>>> A variant of synchronic for j.u.c would certainly be cool to have. 
>>>>> Especially if it supports a hint that makes it actually spin 
>>>>> forever rather than block (this may be what expect_urgent means, 
>>>>> or maybe a dedicated spin level is needed). An implementation 
>>>>> could use spinLoopHint() under the hood, or other things where 
>>>>> appropriate (e.g. if MWAIT was usefully available in user mode in 
>>>>> some future, and had a way to limit the wait time).
>>>>> However, an abstraction like synchronic is a bit higher level than 
>>>>> spinLoopHint(). One of the main drivers for spinLoopHint() is 
>>>>> direct-use cases by programs and libraries outside of the core 
>>>>> JDK. E.g. spinning indefinitely (or for limited periods) on 
>>>>> dedicated vcores is a common practice in high performance 
>>>>> messaging and communications stacks, as is not unreasonable on 
>>>>> today's many-core systems. E.g. seeing 4-8 threads "pinned" with 
>>>>> spinning loops is common place in trading applications, in kernel 
>>>>> bypass network stacks, and in low latency messaging. And the 
>>>>> conditions for spins are often more complicated than those 
>>>>> expressible by synchronic (e.g. watching multiple addresses in a 
>>>>> mux'ed spin). I'm sure a higher level abstraction for a spin wait 
>>>>> can be enriched enough to come close, but there are many current 
>>>>> use cases that aren't covered by any currently proposed abstraction.
>>>>> So, I like the idea of an abstraction that would allow 
>>>>> uncomplicated spin-wait use, but I also think that direct access 
>>>>> to spinLoopHint() is very much needed. They don't contradict each 
>>>>> other.
>>>>> — Gil.
>>>>>> On Oct 6, 2015, at 9:49 AM, Hans Boehm <boehm at acm.org> wrote:
>>>>>> If you haven't seen it, you may also be interested in
>>>>>> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/p0126r0.pdf
>>>>>> which seems to be a very different perspective on roughly the 
>>>>>> same space.
>>>>>> On Tue, Oct 6, 2015 at 8:11 AM, Gil Tene <gil at azulsystems.com> wrote:
>>>>>>     I posted a draft JEP about adding spinLoopHint() for
>>>>>>     discussion on core-libs-dev and hotspot-dev. May be of
>>>>>>     interest to this group. The main focus is supporting
>>>>>>     outside-of-the-JDK spinning needs (for which there are
>>>>>>     multiple eager users), but it could/may be useful under the
>>>>>>     hood in j.u.c.
>>>>>>     http://mail.openjdk.java.net/pipermail/core-libs-dev/2015-October/035613.html
>>>>>>     See draft JEP, tests, and links to prototype JDKs to play
>>>>>>     with here:
>>>>>>     https://github.com/giltene/GilExamples/tree/master/SpinHintTest
>>>>>>     — Gil.
>>>>>>     _______________________________________________
>>>>>>     Concurrency-interest mailing list
>>>>>>     Concurrency-interest at cs.oswego.edu
>>>>>>     <mailto:Concurrency-interest at cs.oswego.edu>
>>>>>>     http://cs.oswego.edu/mailman/listinfo/concurrency-interest
>>>>> _______________________________________________
>>>>> Concurrency-interest mailing list
>>>>> Concurrency-interest at cs.oswego.edu
>>>>> http://cs.oswego.edu/mailman/listinfo/concurrency-interest
>>>> _______________________________________________
>>>> Concurrency-interest mailing list
>>>> Concurrency-interest at cs.oswego.edu 
>>>> <mailto:Concurrency-interest at cs.oswego.edu>
>>>> http://cs.oswego.edu/mailman/listinfo/concurrency-interest
>>> _______________________________________________
>>> Concurrency-interest mailing list
>>> Concurrency-interest at cs.oswego.edu
>>> http://cs.oswego.edu/mailman/listinfo/concurrency-interest

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://cs.oswego.edu/pipermail/concurrency-interest/attachments/20151012/0b47193a/attachment-0001.html>

More information about the Concurrency-interest mailing list