Re: Question about Table E.1

On Wed, 8 Feb 2023 14:15:22 -0800, Paul E. McKenney wrote:
> On Wed, Feb 08, 2023 at 07:26:58PM +0900, Akira Yokosawa wrote:
>> On Wed, 8 Feb 2023 17:47:31 +0900, Akira Yokosawa wrote:
>>> Hi,
>>>
>>> On Tue, 7 Feb 2023 19:41:02 -0800, Paul E. McKenney wrote:
>>>> On Wed, Feb 08, 2023 at 12:07:20AM -0300, Leonardo Brás wrote:
>>>>> Hello Paul,
>>>>>
>>>>> I have been reading the book, until I stumbled on Quick Quiz 3.7,
>>>>> Table E.1: Performance of Synchronization Mechanisms
>>>>> on 16-CPU 2.8 GHz Intel X5550 (Nehalem) System
>>>>>
>>>>> <Copying from source, since the PDF is a little tricky>
>>>>>
>>>>> The first part looks like:
>>>>>
>>>>>         Clock period            &           0.4 &           1.0 \\
>>>>>         Same-CPU CAS            &          12.2 &          33.8 \\
>>>>>         Same-CPU lock           &          25.6 &          71.2 \\
>>>>>         Blind CAS               &          12.9 &          35.8 \\
>>>>>         CAS                     &           7.0 &          19.4 \\
>>>>>  
>>>>> In this case, what do the last two lines, "Blind CAS" and "CAS", refer to?
>>>>>
>>>>> (For a second I thought it could be "In-Core Blind CAS" and "In-Core CAS" like
>>>>> in Table 3.1, but that would not make sense: This "CAS" is faster than the
>>>>> previous "Same-CPU CAS".)
>>>>
>>>> I was surprised myself, but those measurements are quite real.  My best
>>>> guess is that the two threads in the core are able to overlap their
>>>> accesses, while the single CPU must do everything sequentially.
>>>
>>> Paul, do you remember how you obtained the data set?
>>> There are several data sets under CodeSamples/cpu/data/, but I don't
>>> see one that corresponds to the table.
>>>
>>> The code for collecting these data was added in CodeSamples/cpu/
>>> by commit 81989d7483e2 ("cpu: Reproduce the old cache-to-cache
>>> latency measurement code") in 2020. And the next commit 2fc05ca07edc
>>> ("api-pthreads.h: Use clock_gettime() and check sched_setaffinity()")
>>> improved the stability of reproduced code.
>>>
>>> This table was first added in commit 38fd945ff401 ("Fill out CPU
>>> chapter, including adding Nehalem data.") in 2009.
>>> The data have never been updated since.
>>>
>>> I'm kind of suspecting the "7.0 us" which surprised you at the time
>> I mean,                      "7.0 ns"
>>
>>         Thanks, Akira
>>
>>> might have been an outlier due to some disturbance discussed in
>>> Appendix A.3 "What Time Is It?".
>>>
>>> I'm not sure, just guessing...
> 
> My surprise caused me to beat on it, and it was persistent.

I see.  So the "outlier" was the microarchitecture of that
X5550 (Nehalem), I guess.
I'd love to reproduce the behavior if at all possible.
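
For anyone who wants to try, here is a minimal sketch of how the
"Same-CPU CAS" row might be approximated.  It is not the code in
CodeSamples/cpu/, and the names pin_to_cpu(), same_cpu_cas_ns(), and
cas_var are made up for illustration.  It only assumes GCC or Clang
(for the __atomic builtins) and Linux (for sched_setaffinity() and
clock_gettime()), and it does not subtract loop overhead.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <time.h>

/* Hypothetical name: the variable that the CAS loop hammers on. */
static long cas_var;

/* Hypothetical helper: pin the calling thread to one CPU.
 * Returns 0 on success, -1 on failure (as sched_setaffinity() does). */
int pin_to_cpu(int cpu)
{
	cpu_set_t mask;

	CPU_ZERO(&mask);
	CPU_SET(cpu, &mask);
	return sched_setaffinity(0, sizeof(mask), &mask);
}

/* Hypothetical helper: average cost in nanoseconds of one CAS issued
 * repeatedly from a single thread.  Loop overhead is included. */
double same_cpu_cas_ns(long iters)
{
	struct timespec t0, t1;
	long i;

	assert(iters > 0);
	__atomic_store_n(&cas_var, 0, __ATOMIC_SEQ_CST);

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (i = 0; i < iters; i++) {
		long expected = i;

		/* Always succeeds: cas_var goes from i to i + 1. */
		__atomic_compare_exchange_n(&cas_var, &expected, i + 1, 0,
					    __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
	}
	clock_gettime(CLOCK_MONOTONIC, &t1);

	return ((t1.tv_sec - t0.tv_sec) * 1e9 +
		(t1.tv_nsec - t0.tv_nsec)) / (double)iters;
}
```

Calling pin_to_cpu(0) and then printing same_cpu_cas_ns() for a large
iteration count from main() should give a number to set against the
"Same-CPU CAS" row.  The "CAS" row would additionally need a second
thread pinned to the sibling hardware thread of the same core, which
this sketch does not attempt.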

> 
> But I cannot find the raw data, either, so maybe I should delete that
> table.  Though I really do like the fact that it is surprising, based
> on a hope that it convinces readers to expect the unexpected.

That episode would be a good Quick Quiz if the Answer to QQz
could have a nested QQz inside it.
Unfortunately that is not possible...

        Thanks, Akira

> 
> 							Thanx, Paul
> 
>>>         Thanks, Akira
>>>
>>>>
>>>> Strange, but whatever the reason, true!  ;-)
>>>>
>>>> 							Thanx, Paul



