On Wed, 8 Feb 2023 14:15:22 -0800, Paul E. McKenney wrote:
> On Wed, Feb 08, 2023 at 07:26:58PM +0900, Akira Yokosawa wrote:
>> On Wed, 8 Feb 2023 17:47:31 +0900, Akira Yokosawa wrote:
>>> Hi,
>>>
>>> On Tue, 7 Feb 2023 19:41:02 -0800, Paul E. McKenney wrote:
>>>> On Wed, Feb 08, 2023 at 12:07:20AM -0300, Leonardo Brás wrote:
>>>>> Hello Paul,
>>>>>
>>>>> I have been reading the book, until I stumbled on Quick Quiz 3.7,
>>>>> Table E.1: Performance of Synchronization Mechanisms
>>>>> on 16-CPU 2.8 GHz Intel X5550 (Nehalem) System
>>>>>
>>>>> <Copying from source, since the PDF is a little tricky>
>>>>>
>>>>> The first part looks like:
>>>>>
>>>>> Clock period & 0.4 & 1.0 \\
>>>>> Same-CPU CAS & 12.2 & 33.8 \\
>>>>> Same-CPU lock & 25.6 & 71.2 \\
>>>>> Blind CAS & 12.9 & 35.8 \\
>>>>> CAS & 7.0 & 19.4 \\
>>>>>
>>>>> In this case, what would be the last lines "Blind CAS" and "CAS" referring to ?
>>>>>
>>>>> (For a second I thought it could be "In-Core Blind CAS" and "In-Core CAS" like
>>>>> in Table 3.1, but that would not make sense: This "CAS" is faster than the
>>>>> previous "Same-CPU CAS". )
>>>>
>>>> I was surprised myself, but those measurements are quite real. My best
>>>> guess is that the two threads in the core are able to overlap their
>>>> accesses, while the single CPU must do everything sequentially.
>>>
>>> Paul, do you remember how you obtained the data set?
>>> There are several data sets under CodeSamples/cpu/data/, but I don't
>>> see the one corresponds to the table.
>>>
>>> The code for collecting these data was added in CodeSamples/cpu/
>>> by commit 81989d7483e2 ("cpu: Reproduce the old cache-to-cache
>>> latency measurement code") in 2020. And the next commit 2fc05ca07edc
>>> ("api-pthreads.h: Use clock_gettime() and check sched_setaffinity()")
>>> improved the stability of reproduced code.
>>>
>>> This table was first added in commit 38fd945ff401 ("Fill out CPU
>>> chapter, including adding Nehalem data.") in 2009.
>>> The data have never been updated since.
>>>
>>> I'm kind of suspecting the "7.0 us" which surprised you at the time
>> I mean, "7.0 ns"
>>
>> Thanks, Akira
>>
>>> might have been an outlier due to some disturbance discussed in
>>> Appendix A.3 "What Time Is It?".
>>>
>>> I'm not sure, just guessing...
>
> My surprise caused me to beat on it, and it was persistent.

I see.
So the "outlier" was the microarchitecture of that X5550 (Nehalem),
I guess.

I'd love to reproduce the behavior if at all possible.

>
> But I cannot find the raw data, either, so maybe I should delete that
> table. Though I really do like the fact that it is surprising, based
> on a hope that it convinces readers to expect the unexpected.

That episode would be a good Quick Quiz if the Answer to QQz could
have a nested QQz inside it. Unfortunately that is not possible...

Thanks, Akira

>
> Thanx, Paul
>
>>> Thanks, Akira
>>>
>>>>
>>>> Strange, but whatever the reason, true! ;-)
>>>>
>>>> Thanx, Paul