On Fri, Oct 19, 2018 at 07:43:57AM +0900, Akira Yokosawa wrote:
> On 2018/10/18 08:15:19 -0700, Paul E. McKenney wrote:
> > On Thu, Oct 18, 2018 at 10:03:56PM +0900, Akira Yokosawa wrote:
> >> On 2018/10/17 17:37:39 -0700, Paul E. McKenney wrote:
> >>> On Thu, Oct 18, 2018 at 07:21:38AM +0900, Akira Yokosawa wrote:
> >>>> On 2018/10/17 08:10:52 -0700, Paul E. McKenney wrote:
> >>>>> On Tue, Oct 16, 2018 at 08:04:00AM +0900, Akira Yokosawa wrote:
> >>>>>> >From 7b01fc0f19cfa010536d7eb53e4d0cda1e6b801f Mon Sep 17 00:00:00 2001
> >>>>>> From: Akira Yokosawa <akiyks@xxxxxxxxx>
> >>>>>> Date: Mon, 15 Oct 2018 23:46:52 +0900
> >>>>>> Subject: RFC [PATCH] count_lim_sig: Add pair of smp_wmb() and smp_rmb()
> >>>>>>
> >>>>>> This message-passing pattern requires smp_wmb()--smp_rmb() pairing.
> >>>>>>
> >>>>>> Signed-off-by: Akira Yokosawa <akiyks@xxxxxxxxx>
> >>>>>> ---
> >>>>>> Hi Paul,
> >>>>>>
> >>>>>> I'm not sure this addition of memory barriers is actually required,
> >>>>>> but it does look like so.
> >>>>>>
> >>>>>> And I'm aware that you have avoided using weaker memory barriers in
> >>>>>> CodeSamples.
> >>>>>>
> >>>>>> Thoughts?
> >>>>>
> >>>>> Hello, Akira,
> >>>>>
> >>>>> I might be missing something, but it looks to me like this ordering is
> >>>>> covered by heavyweight ordering in the signal handler entry/exit and
> >>>>> the gblcnt_mutex. So what sequence of events leads to the failure
> >>>>> scenario that you are seeing?
> >>>>
> >>>> So the fastpaths in add_count() and sub_count() are not protected by
> >>>> glbcnt_mutex. The slowpath in flush_local_count() waits for the
> >>>> transition of theft from REQ to READY, clears counter and countermax,
> >>>> and finally assigns IDLE to theft.
> >>>>
> >>>> So, the fastpaths can see (theft == IDLE) but see a non-zero value of
> >>>> counter or countermax, can't they?
> >>>
> >>> Maybe, maybe not. Please lay out a sequence of events showing a problem,
> >>> as in load by load, store by store, line by line. Intuition isn't as
> >>> helpful as one might like for this kind of stuff. ;-)
> >>
> >> Gotcha!
> >>
> >> I've not exhausted the timing variations, but now I see that when
> >> split_local_count() sees (*theft@[t] == THEFT_READY), the corresponding
> >> add_count() or sub_count() has already exited the fastpath (marked by
> >> counting == 1).
> >>
> >> So the race I imagined never existed.
> >
> > I know that feeling!!!
> >
> >> Thanks for your nice suggestion!
> >
> > Well, there might well be another race. My main concern is whether or not
> > signal-handler entry/exit really provides full ordering on all platforms.
> >
> > Thoughts?
>
> Is your concern related to the lack of a memory barrier at the entry of
> flush_local_count_sig() in Listing 5.17?

Placing memory barriers at flush_local_count_sig() would certainly make
the code independent of the kernel's ordering, but would those barriers
really be needed? If they are needed, would lighter-weight
synchronization work?

							Thanx, Paul

> Akira
>
> >
> > 							Thanx, Paul
> >
> >>>> One theory to prevent this from happening is that all the per-thread
> >>>> variables of a thread reside in a single cache line, and if the
> >>>> fastpaths see the updated value of theft, they are guaranteed to see
> >>>> the latest values of both counter and countermax.
> >>>
> >>> Good point, but we need to avoid that sort of assumption unless we
> >>> placed the variables into a struct and told the compiler to align it
> >>> appropriately. And even then, hardware architectures normally don't
> >>> make this sort of guarantee. There is too much that can go wrong, from
> >>> ECC errors to interrupts at just the wrong time, and much else besides.
> >>
> >> Absolutely!
> >>
> >> Thanks, Akira
> >>
> >>>
> >>> 							Thanx, Paul
> >>>
> >>>> I might be completely missing something, though.
> >>>>
> >>>> Thanks, Akira
> >>>>
> >>>>>
> >>>>> 							Thanx, Paul
> >>>>>
> [...]
>