[concurrency-interest] The very best CAS loop

Peter Levart peter.levart at gmail.com
Wed Sep 28 15:45:50 EDT 2016


Hi Andrew,


On 09/28/2016 07:55 PM, Andrew Haley wrote:
> One important thing to realize: even when we have false sharing and
> the benchmark runs much more slowly, there is still not a significant
> percentage of weak CAS fails.  The benchmark is slower because the
> cache line is pinging back-and-forth between L1 caches so the L1 cache
> misses, not because of a high weak CAS failure rate.  I suspect this
> is typical: once the line has been fetched by LL and is in exclusive
> state, there is only a tiny window before SC.
>
> The few weak CAS failures I am seeing correspond with delays of
> extremely long duration, sometimes as long as a few milliseconds.
> The only really likely explanation for that is that the thread is
> descheduled or otherwise interrupted by the OS kernel and this
> obviously would cause a weak CAS to fail.

I see. So false sharing can provoke a spurious LL/SC failure only when the 
cache line is stolen between the LL and the SC? That makes sense, and it 
explains why the strong and weak CAS results are comparable even on 
AArch64: weak CAS rarely fails spuriously unless cache-line contention is 
extremely high. Do you believe other HW platforms that implement weak 
CAS have similar spurious-failure characteristics?
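
For reference, here is a minimal sketch of the kind of retry loop under 
discussion. It assumes Java 9's AtomicInteger.weakCompareAndSetVolatile; 
the class name WeakCasLoop, its two counters and the run() helper are 
hypothetical, and this is not the code from GetAndUpdateBench3. A failed 
weak CAS is counted as "spurious-looking" when a re-read shows the value 
unchanged, which is only an approximation (an A-B-A change would look the 
same).

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.IntUnaryOperator;

final class WeakCasLoop {
    // Hypothetical counters, for illustration only.
    static final LongAdder contendedFailures = new LongAdder();
    static final LongAdder spuriousLookingFailures = new LongAdder();

    // getAndUpdate built on weak CAS; re-applies fn only if the value changed.
    static int getAndUpdate(AtomicInteger a, IntUnaryOperator fn) {
        int prev = a.get();
        int next = fn.applyAsInt(prev);
        while (!a.weakCompareAndSetVolatile(prev, next)) {
            int cur = a.get();
            if (cur != prev) {
                // Another thread won the race: recompute from the new value.
                contendedFailures.increment();
                prev = cur;
                next = fn.applyAsInt(prev);
            } else {
                // Value unchanged, so the failure looks spurious: retry as-is.
                spuriousLookingFailures.increment();
            }
        }
        return prev;
    }

    // Increments a shared counter from several threads; returns the final value.
    static int run(int threads, int perThread) {
        AtomicInteger shared = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < perThread; n++) {
                    getAndUpdate(shared, x -> x + 1);
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return shared.get();
    }

    public static void main(String[] args) {
        int value = run(4, 100_000);
        System.out.println("value=" + value
                + " contended=" + contendedFailures.sum()
                + " spuriousLooking=" + spuriousLookingFailures.sum());
    }
}
```

On x86, where weak CAS is typically implemented with the same instruction 
as strong CAS, I would expect the spurious-looking count to stay at or 
near zero; on an LL/SC machine it is the number of interest.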

Thanks for helping me understand the mechanics of the new weak CAS operations.

Regards, Peter

> I've used the AuxCounters feature to record when weak CAS fails.
> cr.openjdk.java.net:~aph/test/GetAndUpdateBench3
>
> P.S.: shade, please print out the number of instances of case1 and
> case2, not just their average time.  Thanks.
>
> Andrew.
>
>
> Benchmark                                     (updateFnCpu)  Mode  Cnt        Score         Error  Units
> GetAndUpdateBench3.shade                                  1  avgt   20      135.842 ±      12.581  ns/op
> GetAndUpdateBench3.shade:case1                            1  avgt   20      135.646 ±      12.689  ns/op
> GetAndUpdateBench3.shade:case2                            1  avgt   20    14738.834 ±   15183.891  ns/op
> GetAndUpdateBench3.shade:getAndUpdate1_shade              1  avgt   20      140.626 ±      22.241  ns/op
> GetAndUpdateBench3.shade:getAndUpdate2_shade              1  avgt   20      131.058 ±      12.259  ns/op
> GetAndUpdateBench3.shade:total                            1  avgt   20      113.007 ±       5.217  ns/op
> GetAndUpdateBench3.shade                                 10  avgt   20      146.594 ±       0.575  ns/op
> GetAndUpdateBench3.shade:case1                           10  avgt   20      146.465 ±       0.586  ns/op
> GetAndUpdateBench3.shade:case2                           10  avgt   20  1248338.412 ± 1145549.327  ns/op
> GetAndUpdateBench3.shade:getAndUpdate1_shade             10  avgt   20      146.543 ±       0.662  ns/op
> GetAndUpdateBench3.shade:getAndUpdate2_shade             10  avgt   20      146.644 ±       0.562  ns/op
> GetAndUpdateBench3.shade:total                           10  avgt   20      145.162 ±       2.839  ns/op
> GetAndUpdateBench3.shade                                 20  avgt   20      173.579 ±       0.901  ns/op
> GetAndUpdateBench3.shade:case1                           20  avgt   20      173.500 ±       0.842  ns/op
> GetAndUpdateBench3.shade:case2                           20  avgt   20   916952.445 ± 1147957.020  ns/op
> GetAndUpdateBench3.shade:getAndUpdate1_shade             20  avgt   20      173.498 ±       1.133  ns/op
> GetAndUpdateBench3.shade:getAndUpdate2_shade             20  avgt   20      173.659 ±       0.728  ns/op
> GetAndUpdateBench3.shade:total                           20  avgt   20      172.134 ±       2.710  ns/op
> GetAndUpdateBench3.shade                                 50  avgt   20      300.589 ±      23.422  ns/op
> GetAndUpdateBench3.shade:case1                           50  avgt   20      300.537 ±      23.457  ns/op
> GetAndUpdateBench3.shade:case2                           50  avgt   20  6912133.566 ± 4212701.775  ns/op
> GetAndUpdateBench3.shade:getAndUpdate1_shade             50  avgt   20      300.559 ±      23.396  ns/op
> GetAndUpdateBench3.shade:getAndUpdate2_shade             50  avgt   20      300.618 ±      23.452  ns/op
> GetAndUpdateBench3.shade:total                           50  avgt   20      292.863 ±      15.469  ns/op
> GetAndUpdateBench3.shade                                100  avgt   20      453.153 ±       7.168  ns/op
> GetAndUpdateBench3.shade:case1                          100  avgt   20      452.858 ±       6.940  ns/op
> GetAndUpdateBench3.shade:case2                          100  avgt   20  1877949.365 ± 2351908.185  ns/op
> GetAndUpdateBench3.shade:getAndUpdate1_shade            100  avgt   20      452.429 ±       6.266  ns/op
> GetAndUpdateBench3.shade:getAndUpdate2_shade            100  avgt   20      453.877 ±       8.503  ns/op
> GetAndUpdateBench3.shade:total                          100  avgt   20      446.519 ±       3.611  ns/op
