[concurrency-interest] Monitoring Tool

William Louth (JINSPIRED.COM) william.louth at jinspired.com
Wed Aug 4 00:07:23 EDT 2010



I wanted to add that I have designed OpenCore's Probes (Metering) and 
Metrics (Monitoring) APIs in the hope that the JVM could directly 
support meter integration (low-level, thread-specific JVM counters) as 
well as provide an actual implementation of each API directly within 
the runtime via an SPI.

My vision is for activity-based metering to be present in all future 
runtimes (whatever the language or environment) and to be used to create 
new dynamic libraries (and services) that are much more aware of the 
performance, cost (monetary or unit-based), and capacity (on-demand) 
requirements and characteristics of a software's execution, whether that 
relates to in-process task scheduling, out-of-process metered service 
interactions, or distributed job queuing & routing.

I have for a number of years tried to get various JVM vendors on board 
with a possible standardization, but that's a sad & depressing tale for 
another day.

William

On 04/08/2010 02:05, William Louth (JINSPIRED.COM) wrote:
>  Hi Gregg,
>
> It would be great if you could publish real-world data that actually 
> quantifies your concerns and opinion.
>
> Metered code can't become a hotspot if the metering runtime is 
> strategy-based, configured correctly (in terms of meter, threshold, 
> warm-up, & statistic), and communication between the instrumentation & 
> measurement code is bi-directional, allowing evaluation 
> short-circuiting to execute in under 1 nanosecond. This is the case 
> for OpenCore under extreme micro-benchmarking tests.
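The short-circuiting described above can be sketched roughly in plain Java. This is a minimal illustration under assumed names (`MeterStrategy`, `Probe`, `timed` are hypothetical, not OpenCore's actual API); the point is that a disabled strategy skips both the clock read and the recording, so the metered path pays almost nothing:

```java
import java.util.concurrent.atomic.LongAdder;

public class MeteringSketch {
    // Strategy evaluated first; a cheap predicate here is what makes
    // short-circuiting nearly free for probes that are switched off.
    interface MeterStrategy {
        boolean enabled(String probeName);
    }

    static final class Probe {
        private final LongAdder count = new LongAdder();
        private final LongAdder totalNanos = new LongAdder();

        void record(long elapsedNanos) {
            count.increment();
            totalNanos.add(elapsedNanos);
        }
        long count() { return count.sum(); }
    }

    // Returns elapsed nanos when metered, -1 when the strategy short-circuits.
    static long timed(MeterStrategy strategy, String name, Probe probe, Runnable task) {
        if (!strategy.enabled(name)) {  // short-circuit: no nanoTime(), no recording
            task.run();
            return -1;
        }
        long start = System.nanoTime();
        task.run();
        long elapsed = System.nanoTime() - start;
        probe.record(elapsed);
        return elapsed;
    }

    public static void main(String[] args) {
        Probe probe = new Probe();
        MeterStrategy onlyHot = n -> n.startsWith("hot.");
        timed(onlyHot, "cold.path", probe, () -> {});
        timed(onlyHot, "hot.path", probe, () -> {});
        System.out.println(probe.count()); // prints 1: only the enabled probe recorded
    }
}
```

In a real metering runtime the strategy decision would itself be cached per call site, so the steady-state cost of a disabled probe is a single predictable branch.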
>
> That said, OpenCore does front thread locals with fast & small caches 
> that play on the thread-context & CPU coupling to some degree.
>
> You might be surprised to find that some of our customers who avail 
> of the open interface to our metering data, within the runtime and 
> their scheduling code, actually make performance gains.
>
> In addition, you should try to look at the big picture, which goes 
> beyond a single process execution lifecycle. The metering data 
> produced by OpenCore leads to significant gains across releases, 
> something that would never happen if one were thinking short-term and 
> (thread) local.
>
> For our customers, having production-quality data is the biggest 
> contributor to increases in the quality of the code and its execution. 
> I can't recall a case in which the metering hotspot strategy produced 
> a red herring after a few iterations of a fully warmed-up runtime.
>
> I would also like to point out that in two recent proofs of concept by 
> major FX trading platforms, it was found that OpenCore's strategy-based 
> metering runtime was the only realistic solution on the market for 
> production monitoring of applications with transaction time intervals 
> in the microsecond range (approx. 300). Hopefully we can eventually 
> get these reports published.
>
> There will always be exceptions to the rule, but for those I doubt 
> there is any solution available today (we come as close as is possible 
> when used & configured appropriately). Please do not get me started on 
> call-stack sampling.
>
> William
>
> On 03/08/2010 23:57, Gregg Wonderly wrote:
>> I know this seems pretty harmless, but for extremely light CPU-use 
>> applications, even the use of the read/write lock on a thread-local 
>> value can create a marked change in the contention points of an 
>> application.  I know that it can be useful information, but there are 
>> places where these changes invert the real-world places of interest 
>> and can cause any metered code to suddenly become the hot spot.
>>
>> Gregg Wonderly
>>
>> Ben Manes wrote:
>>> Most likely they're using an inverted read/write-lock approach where 
>>> the count is kept thread-local and aggregated by a monitoring 
>>> thread. I haven't looked at their implementation, but that's a 
>>> standard idiom for avoiding contention when capturing statistical 
>>> information.
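The idiom Ben describes can be sketched as below. This is a generic illustration, not OpenCore's implementation: each thread increments its own private cell (so increments never contend), and a monitoring thread sums the cells, accepting slightly stale totals:

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class ThreadLocalCounters {
    // Single-writer cell: only the owning thread writes, so no CAS is needed.
    // volatile lets the aggregating thread see reasonably fresh values.
    private static final class Cell {
        volatile long value;
    }

    private final List<Cell> cells = new CopyOnWriteArrayList<>();
    private final ThreadLocal<Cell> local = ThreadLocal.withInitial(() -> {
        Cell c = new Cell();
        cells.add(c);   // register the cell so the monitor can find it
        return c;
    });

    void increment() {
        local.get().value++;  // uncontended: one writer per cell
    }

    // Called from a monitoring thread; a sum over all cells.
    long aggregate() {
        long sum = 0;
        for (Cell c : cells) sum += c.value;
        return sum;
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadLocalCounters counters = new ThreadLocalCounters();
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < 1000; j++) counters.increment();
            });
            workers[i].start();
        }
        for (Thread t : workers) t.join();
        System.out.println(counters.aggregate()); // prints 4000
    }
}
```

The JDK later shipped a production-quality variant of this design as `java.util.concurrent.atomic.LongAdder`, which stripes the count across cells and sums them on read.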
>>>
>>> ------------------------------------------------------------------------ 
>>>
>>> *From:* Gregg Wonderly <gregg at cytetech.com>
>>> *To:* William Louth (JINSPIRED.COM) <william.louth at jinspired.com>
>>> *Cc:* concurrency-interest at cs.oswego.edu; gregg.wonderly at pobox.com
>>> *Sent:* Thu, July 29, 2010 12:02:04 PM
>>> *Subject:* Re: [concurrency-interest] Monitoring Tool
>>>
>>> Again, one of the big issues for me is that, typically, such 
>>> metering requires atomic counting, which is itself a form of 
>>> contention.  This contention can get injected into previously 
>>> uncontended blocks and redistribute the timing through particular 
>>> parts of the code, which then don't reveal the actual behavior in 
>>> normal execution.
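The contention Gregg describes can be shown with a minimal sketch (names are illustrative): a shared `AtomicLong` added to a previously uncontended path forces every thread through the same CAS'd cache line, which is measurement-induced contention even though each increment "succeeds":

```java
import java.util.concurrent.atomic.AtomicLong;

public class SharedCounterDemo {
    // A single shared counter: every metered thread now writes one memory word.
    static final AtomicLong metered = new AtomicLong();

    static void hotPath() {
        // ... previously uncontended work would go here ...
        metered.incrementAndGet(); // all threads serialize on this cache line
    }

    public static void main(String[] args) throws InterruptedException {
        Thread[] ts = new Thread[8];
        for (int i = 0; i < ts.length; i++) {
            ts[i] = new Thread(() -> {
                for (int j = 0; j < 100_000; j++) hotPath();
            });
            ts[i].start();
        }
        for (Thread t : ts) t.join();
        // The count is exact, but it was bought with cross-core cache traffic
        // that the unmetered code never had.
        System.out.println(metered.get()); // prints 800000
    }
}
```

The thread-local-cells idiom mentioned earlier in the thread is the usual answer: it trades exactness-at-any-instant for increments that never touch shared state.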
>>>
>>> Where can I see more information about how issues like this are 
>>> dealt with?
>>>
>>> Gregg Wonderly
>>>
>>> William Louth (JINSPIRED.COM <http://JINSPIRED.COM>) wrote:
>>> > It can do this and more, but maybe I should have factored in the 
>>> > likelihood that you would have only skimmed over the links I 
>>> > emailed - time is money. Sorry for not directly pointing this out, 
>>> > but as the architect of both the Probes and Metrics Open APIs I 
>>> > thought this would be rather obvious.
>>> >
>>> > OpenCore supports many meters, some of which are contention-related: 
>>> > http://opencore.jinspired.com/?page_id=981#p:core A lot of these 
>>> > are not time-related.
>>> >
>>> > OpenCore's Probes metering can be automatically mapped to Metrics 
>>> > to correlate across multiple threads with other measurements that 
>>> > are not metering-based:
>>> > http://opencore.jinspired.com/?page_id=377
>>> >
>>> > OpenCore supports the reporting of metering at the thread and 
>>> > probe levels. You indicate transaction points to the runtime via 
>>> > config and then see which particular probes (activities) and 
>>> > meters (resources: monitors) contributed the most. See the 
>>> > transaction probes provider under Guides:
>>> > http://opencore.jinspired.com/?page_id=772
>>> >
>>> > On 26/07/2010 23:56, Gregg Wonderly wrote:
>>> >> This doesn't really point at concurrency issues, only at the 
>>> >> execution time of compute-bound execution, and perhaps some 
>>> >> simple linear scaling of complexity.
>>> >>
>>> >> When you have 1000s of instructions and multiple code paths, 
>>> >> which include synchronization and random latency injection, I am 
>>> >> not sure that you can see how threads are "waiting in line" until 
>>> >> you see the quadratic change in execution time that is typically 
>>> >> visible when high contention occurs.
>>> >>
>>> >> Maybe you can point out where this tool provides the ability to 
>>> >> monitor lock contention, count stalled threads, and otherwise 
>>> >> see the real contention that develops over time as load increases 
>>> >> on a highly contended code segment?
>>> >>
>>> >> If you could "time" the execution interval through all code 
>>> >> paths, and then look at the "percentage" of threads and time 
>>> >> through each path, and then see that the "largest" latency block 
>>> >> was in a very common code path, you could perhaps then say this 
>>> >> was the area to look at.  All of that analysis might not really 
>>> >> be possible, though, because of the potentially exponential 
>>> >> complexity of code path coverage.
>>> >>
>>> >> Gregg Wonderly
>>> >>
>>> >> William Louth (JINSPIRED.COM) wrote:
>>> >> JINSPIRED's OpenCore (http://opencore.jinspired.com) metering & 
>>> >> metrics runtime has built-in meters for thread contention 
>>> >> metering (activity-based) @see blocking.time and blocking.count.
>>> >>>
>>> >> You can easily extend it with your own custom counters or 
>>> >> resource measures mapped to meters. And you don't always need to 
>>> >> measure time: 
>>> >> http://williamlouth.wordpress.com/2010/06/11/no-latency-application-performance-analysis-when-wall-clock-time-is-simply-too-slow/ 
>>>
>>> >>>
>>> >> Queuing can also be aggregated at various namespace levels: 
>>> >> http://williamlouth.wordpress.com/2010/05/20/metered-software-service-queues/ 
>>>
>>> >>>
>>> >> More related articles: 
>>> >> http://williamlouth.wordpress.com/category/profiling/
>>> >>>
>>> >>> On 26/07/2010 21:13, Gregg Wonderly wrote:
>>> >>>> Per-thread latency measurements with 1, 2, 10 and 100 threads 
>>> >>>> will often tell you a lot about how contention is affecting the 
>>> >>>> execution time.  When you get to 100, a thread dump will often 
>>> >>>> reveal where everyone is standing in line...
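Gregg's suggestion can be sketched as a small harness (illustrative names, plain Java) that runs the same contended task at several thread counts and reports the mean per-call latency; under heavy contention, the per-call latency at 100 threads typically climbs well beyond the single-thread figure:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

public class ContentionProbe {
    static final Object lock = new Object();

    static void contendedTask() {
        synchronized (lock) {
            // stand-in for a critical section
        }
    }

    // Mean per-call latency in nanoseconds at the given thread count.
    static double meanLatencyNanos(int threads, int callsPerThread)
            throws InterruptedException {
        CountDownLatch start = new CountDownLatch(1);
        long[] totals = new long[threads];
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final int id = i;
            ts[i] = new Thread(() -> {
                try { start.await(); } catch (InterruptedException e) { return; }
                long t0 = System.nanoTime();
                for (int j = 0; j < callsPerThread; j++) contendedTask();
                totals[id] = System.nanoTime() - t0;
            });
            ts[i].start();
        }
        start.countDown();  // release all threads at once to maximize contention
        for (Thread t : ts) t.join();
        long sum = 0;
        for (long v : totals) sum += v;
        return (double) sum / ((long) threads * callsPerThread);
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> report = new ArrayList<>();
        for (int n : new int[] {1, 2, 10, 100}) {
            report.add(n + " threads: " + meanLatencyNanos(n, 10_000) + " ns/call");
        }
        report.forEach(System.out::println);
    }
}
```

Absolute numbers from a harness like this are noisy; it is the trend across thread counts, plus a thread dump at the high end, that points at the queueing.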
>>> >>>>
>>> >>>> Gregg Wonderly
>>> >>>>
>>> >>>> David Holmes wrote:
>>> >>>>> Kendall,
>>> >>>>>
>>> >>>>> In my opinion a monitoring tool looking at the lock 
>>> >>>>> acquisition time, or CAS attempts, won't give you much insight 
>>> >>>>> into whether to use a blocking or non-blocking approach. You 
>>> >>>>> need to measure the performance of your application logic as a 
>>> >>>>> whole, utilising the two different approaches. After all, how 
>>> >>>>> can you compare locking times with the number of CAS attempts 
>>> >>>>> in general?
>>> >>>>>
>>> >>>>> David Holmes
>>> >>>>>
>>> >>>>>    -----Original Message-----
>>> >>>>>    *From:* concurrency-interest-bounces at cs.oswego.edu 
>>> <mailto:concurrency-interest-bounces at cs.oswego.edu>
>>> >>>>>    [mailto:concurrency-interest-bounces at cs.oswego.edu 
>>> <mailto:concurrency-interest-bounces at cs.oswego.edu>]*On Behalf Of
>>> >>>>>    *Kendall Moore
>>> >>>>>    *Sent:* Sunday, 25 July 2010 4:14 PM
>>> >>>>>    *To:* concurrency-interest at cs.oswego.edu 
>>> <mailto:concurrency-interest at cs.oswego.edu>
>>> >>>>>    *Subject:* [concurrency-interest] Monitoring Tool
>>> >>>>>
>>> >>>>>    Greetings all,
>>> >>>>>
>>> >>>>>    Is there a common consensus on which monitoring tools are
>>> >>>>>    best to use when writing parallel apps?  To be more
>>> >>>>>    specific, I would like to be able to know how many times a
>>> >>>>>    given thread has to try to CAS before succeeding.  Also,
>>> >>>>>    the ability to see how long a thread waits to acquire a
>>> >>>>>    lock would be useful as well.  The end goal, in my
>>> >>>>>    particular case, would be to compare these in order to
>>> >>>>>    determine whether a non-blocking approach would be more
>>> >>>>>    effective in a given situation than a lock-based approach.
>>> >>>>>    Any help would be much appreciated!
>>> >>>>>
>>> >>>>>    --    Kendall Moore
>>> >>>>>
>>> >>>>>
>>> >>>>> 
>>> ------------------------------------------------------------------------ 
>>>
>>> >>>>>
>>> >>>>> _______________________________________________
>>> >>>>> Concurrency-interest mailing list
>>> >>>>> Concurrency-interest at cs.oswego.edu 
>>> <mailto:Concurrency-interest at cs.oswego.edu>
>>>> >>>> http://cs.oswego.edu/mailman/listinfo/concurrency-interest
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>
>>> >>>
>>> >>
>>> >>
>>> >>
>>> >
>>> >
>>>
>>>
>>>
>>>
>>
>>
>>
>
>

