Re: [RFC PATCH] count_lim_sig: Add pair of smp_wmb() and smp_rmb()

On 2018/10/18 08:15:19 -0700, Paul E. McKenney wrote:
> On Thu, Oct 18, 2018 at 10:03:56PM +0900, Akira Yokosawa wrote:
>> On 2018/10/17 17:37:39 -0700, Paul E. McKenney wrote:
>>> On Thu, Oct 18, 2018 at 07:21:38AM +0900, Akira Yokosawa wrote:
>>>> On 2018/10/17 08:10:52 -0700, Paul E. McKenney wrote:
>>>>> On Tue, Oct 16, 2018 at 08:04:00AM +0900, Akira Yokosawa wrote:
>>>>>> From 7b01fc0f19cfa010536d7eb53e4d0cda1e6b801f Mon Sep 17 00:00:00 2001
>>>>>> From: Akira Yokosawa <akiyks@xxxxxxxxx>
>>>>>> Date: Mon, 15 Oct 2018 23:46:52 +0900
>>>>>> Subject: RFC [PATCH] count_lim_sig: Add pair of smp_wmb() and smp_rmb()
>>>>>>
>>>>>> This message-passing pattern requires smp_wmb()--smp_rmb() pairing.
>>>>>>
>>>>>> Signed-off-by: Akira Yokosawa <akiyks@xxxxxxxxx>
>>>>>> ---
>>>>>> Hi Paul,
>>>>>>
>>>>>> I'm not sure this addition of memory barriers is actually required,
>>>>>> but it does look like so.
>>>>>>
>>>>>> And I'm aware that you have avoided using weaker memory barriers in
>>>>>> CodeSamples.
>>>>>>
>>>>>> Thoughts?
>>>>>
>>>>> Hello, Akira,
>>>>>
>>>>> I might be missing something, but it looks to me like this ordering is
>>>>> covered by heavyweight ordering in the signal handler entry/exit and
>>>>> the gblcnt_mutex.  So what sequence of events leads to the failure
>>>>> scenario that you are seeing?
>>>>
>>>> So the fastpaths in add_count() and sub_count() are not protected by
>>>> gblcnt_mutex.  The slowpath in flush_local_count() waits for theft to
>>>> transition from REQ to READY, clears counter and countermax, and finally
>>>> assigns IDLE to theft.
>>>>
>>>> So, the fastpaths can see (theft == IDLE) but see a non-zero value of
>>>> counter or countermax, can't they?
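For reference, the suspected race has the shape of a message-passing
pattern.  Below is a minimal sketch in C11, not the code from
count_lim_sig.c.  It assumes hypothetical stand-ins for one thread's
theft, counter, and countermax, and uses atomic_thread_fence() with
release and acquire ordering in the roles the RFC patch gives to
smp_wmb() and smp_rmb() (the C11 fences are somewhat stronger than
those primitives).  The names slowpath_reset(), fastpath_snapshot(),
and mp_demo(), as well as the numeric state values, are illustrative
assumptions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Assumed state encoding; only the names REQ, READY, IDLE come from the thread. */
enum { THEFT_IDLE, THEFT_REQ, THEFT_READY };

/* Hypothetical stand-ins for one thread's per-thread variables. */
static atomic_int theft = THEFT_READY;
static atomic_ulong counter = 5;
static atomic_ulong countermax = 10;

/* Writer side: clear the payload first, then publish IDLE. */
static void slowpath_reset(void)
{
	atomic_store_explicit(&counter, 0, memory_order_relaxed);
	atomic_store_explicit(&countermax, 0, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);	/* role of smp_wmb() */
	atomic_store_explicit(&theft, THEFT_IDLE, memory_order_relaxed);
}

/* Reader side: check the flag first, then read the payload. */
static bool fastpath_snapshot(unsigned long *c, unsigned long *cmax)
{
	if (atomic_load_explicit(&theft, memory_order_relaxed) != THEFT_IDLE)
		return false;
	atomic_thread_fence(memory_order_acquire);	/* role of smp_rmb() */
	*c = atomic_load_explicit(&counter, memory_order_relaxed);
	*cmax = atomic_load_explicit(&countermax, memory_order_relaxed);
	return true;
}

/* Single-threaded walk-through of the intended outcome. */
static void mp_demo(void)
{
	unsigned long c = 99, cmax = 99;

	assert(!fastpath_snapshot(&c, &cmax));	/* theft is still READY */
	slowpath_reset();
	assert(fastpath_snapshot(&c, &cmax));	/* theft is now IDLE */
	assert(c == 0 && cmax == 0);		/* cleared values are visible */
}
```

With the two fences paired this way, a reader running concurrently that
observes IDLE is also guaranteed to observe the cleared counter and
countermax.  Whether the real code needs them is exactly the question
above.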
>>>
>>> Maybe, maybe not.  Please lay out a sequence of events showing a problem,
>>> as in load by load, store by store, line by line.  Intuition isn't as
>>> helpful as one might like for this kind of stuff.  ;-)
>>
>> Gotcha!
>>
>> I haven't exhausted the timing variations, but now I see that when
>> flush_local_count() sees (*theft@[t] == THEFT_READY), its counterpart
>> add_count() or sub_count() has already exited the fastpath (marked by
>> counting == 1).
>>
>> So the race I imagined has never existed.
> 
> I know that feeling!!!
> 
>> Thanks for your nice suggestion!
> 
> Well, there might well be another race.  My main concern is whether or not
> signal-handler entry/exit really provides full ordering on all platforms.
> 
> Thoughts?

Is your concern related to the lack of a memory barrier at the entry of
flush_local_count_sig() in Listing 5.17?
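
To make the question concrete, here is a rough sketch of a handler that
does not rely on signal delivery for ordering and instead issues an
explicit full fence at entry and before exit.  This is not Listing 5.17
itself.  The names flush_sig_sketch() and handler_demo(), the THEFT_ACK
state, and the use of C11 atomic_thread_fence() in place of smp_mb()
are all assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdatomic.h>

/* Assumed state encoding; THEFT_ACK is an assumption, not named in this thread. */
enum { THEFT_IDLE, THEFT_REQ, THEFT_ACK, THEFT_READY };

static atomic_int theft = THEFT_IDLE;
static atomic_int counting;	/* nonzero while a fastpath is in progress */

static void flush_sig_sketch(int unused)
{
	(void)unused;
	/* Explicit full barrier at entry, instead of trusting signal delivery. */
	atomic_thread_fence(memory_order_seq_cst);
	if (atomic_load_explicit(&theft, memory_order_relaxed) != THEFT_REQ)
		return;
	atomic_store_explicit(&theft, THEFT_ACK, memory_order_relaxed);
	if (!atomic_load_explicit(&counting, memory_order_relaxed))
		atomic_store_explicit(&theft, THEFT_READY, memory_order_relaxed);
	/* Full barrier before returning to the interrupted code. */
	atomic_thread_fence(memory_order_seq_cst);
}

/* Deliver the signal to ourselves; raise() returns after the handler ran. */
static void handler_demo(void)
{
	signal(SIGUSR1, flush_sig_sketch);
	atomic_store(&theft, THEFT_REQ);
	raise(SIGUSR1);
	assert(atomic_load(&theft) == THEFT_READY);	/* no fastpath was active */
}
```

If signal-handler entry/exit already implies full ordering on every
platform of interest, the entry fence is redundant but harmless.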

        Akira  

> 
> 							Thanx, Paul
> 
>>>> One theory for why this cannot happen is that all the per-thread
>>>> variables of a thread reside in a single cache line, so if the fastpaths
>>>> see the updated value of theft, they are guaranteed to see the latest
>>>> values of both counter and countermax.
>>>
>>> Good point, but we need to avoid that sort of assumption unless we
>>> placed the variables into a struct and told the compiler to align it
>>> appropriately.  And even then, hardware architectures normally don't
>>> make this sort of guarantee.  There is too much that can go wrong, from
>>> ECC errors to interrupts at just the wrong time, and much else besides.
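As an illustration of "placed the variables into a struct and told the
compiler to align it", a minimal C11 sketch follows.  The struct name
thread_count, the helper fits_in_one_line(), and the 64-byte line size
are assumptions.  As noted above, this only controls layout and gives
no ordering guarantee.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64	/* assumption: common, but not universal */

/* Group the per-thread variables and align the group to a cache line. */
struct thread_count {
	alignas(CACHE_LINE_SIZE) unsigned long counter;
	unsigned long countermax;
	int theft;
};

/* The compiler pads the struct out to a whole number of lines. */
static_assert(alignof(struct thread_count) == CACHE_LINE_SIZE,
	      "struct must be cache-line aligned");
static_assert(sizeof(struct thread_count) % CACHE_LINE_SIZE == 0,
	      "struct must occupy whole cache lines");

static _Thread_local struct thread_count my_count;

/* Check that the last member ends within the first line of the object. */
static bool fits_in_one_line(void)
{
	bool aligned = ((uintptr_t)&my_count % CACHE_LINE_SIZE) == 0;
	bool fits = offsetof(struct thread_count, theft) + sizeof(int)
		    <= CACHE_LINE_SIZE;

	return aligned && fits;
}
```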
>>
>> Absolutely!
>>
>>         Thanks, Akira
>>
>>>
>>> 							Thanx, Paul
>>>
>>>> I might be completely missing something, though.
>>>>
>>>>         Thanks, Akira 
>>>>
>>>>>
>>>>> 							Thanx, Paul
>>>>>
[...]


