On Fri, 7 Apr 2023 00:29:21 +0800, Alan Huang wrote: > Hi Akira, > >> Do you have a good reasoning why the pair of smp_mb()s can be safely >> replaced with the single smp_store_release() ? > > When the signal handler is running, there is only one thread waiting on the while loop in flush_local_count, > which means we only need to synchronize with that thread, and that thread only cares the THEFT_READY status, > we just need to make sure after that thread seeing theft equals THEFT_READY, it can also see the changes to counter. > > And because the signal handler is running on the interrupted thread, we don’t need any memory barrier. OK, now I'm convinced. You are right! Let's see your changes in corresponding text: > diff --git a/count/count.tex b/count/count.tex > index 80ada104..899ea7e9 100644 > --- a/count/count.tex > +++ b/count/count.tex > @@ -2425,12 +2425,11 @@ handler used in the theft process. > \Clnref{check:REQ,return:n} check to see if > the \co{theft} state is REQ, and, if not > returns without change. > -\Clnref{mb:1} executes a \IX{memory barrier} to ensure that the sampling > -of the theft variable happens before any change to that variable. > \Clnref{set:ACK} sets the \co{theft} state to ACK, and, if > \clnref{check:fast} sees that > this thread's fastpaths are not running, \clnref{set:READY} sets the \co{theft} > -state to READY\@. > +state to READY, the release store here is to ensure that the change to counter > +in fast path happens before changing the theft varibale to READY\@. the fastpath variable I can't parse this sentence grammatically. This sentence is rendered as follows (with minor nits fixed): Line 13 sets the theft state to ACK, and, if line 14 sees that this thread's fastpaths are not running, line 15 sets the theft state to READY, the release store here is to ensure that the change to counter in the fastpath happens before changing the theft variable to READY. How about this? Line 13 sets the theft state to ACK, and, if line 14 sees that this thread's fastpaths are not running, line 15 sets the theft state to READY, with the release store ensuring any change to counter in the fastpath happens before the change of theft to READY. Also, I'd like to see your reasoning above mentioned somewhere around here. Might worth a Quick Quiz. Question: Doesn't flush_local_count_sig() need a full memory barrier? > \end{fcvref} > > \begin{listing} > @@ -2595,8 +2594,8 @@ handlers to undertake theft. > \Clnref{barrier:3} again disables compiler reordering, and then > \clnref{check:ACK} > checks to see if the signal handler deferred the \co{theft} > -state-change to READY, and, if so, \clnref{mb} executes a memory > -barrier to ensure that any CPU that sees \clnref{READY} setting state to > +state-change to READY, and, if so, the release store at \clnref{READY} > +is to ensure that any CPU that sees \clnref{READY} setting state to > READY also sees the effects of \clnref{add:f}. Again, this looks to me incomplete in grammatical structure. This sentence is rendered as follows: Line 14 again disables compiler reordering, and then line 15 checks to see if the signal handler deferred the theft state-change to READY, and, if so, the release store at line 16 is to ensure that any CPU that sees line 16 setting state to READY also sees the effects of line 9. How about something like this? Line 14 again disables compiler reordering, and then line 15 checks to see if the signal handler deferred the theft state-change to READY, and, if so, line 16 changes theft to READY with the release store ensuring that any CPU that sees the READY state also sees the effects of line 9. Paul might have better ideas. ;-) > If the fastpath addition at \clnref{add:f} was executed, then > \clnref{return:fs} returns Can you post a v3 with whatever text-side updates you come up with? Thanks, Akira > > Please correct me if I'm wrong. > > Thanks, > Alan >