On Mon, May 25, 2020 at 7:53 AM Boqun Feng <boqun.feng@xxxxxxxxx> wrote:
>
> Hi Andrii,
>
> On Fri, May 22, 2020 at 12:38:21PM -0700, Andrii Nakryiko wrote:
> > On 5/22/20 10:43 AM, Paul E. McKenney wrote:
> > > On Fri, May 22, 2020 at 10:32:01AM -0400, Alan Stern wrote:
> > > > On Fri, May 22, 2020 at 11:44:07AM +0200, Peter Zijlstra wrote:
> > > > > On Thu, May 21, 2020 at 05:38:50PM -0700, Paul E. McKenney wrote:
> > > > > > Hello!
> > > > > >
> > > > > > Just wanted to call your attention to some pretty cool and pretty serious
> > > > > > litmus tests that Andrii did as part of his BPF ring-buffer work:
> > > > > >
> > > > > > https://lore.kernel.org/bpf/20200517195727.279322-3-andriin@xxxxxx/
> > > > > >
> > > > > > Thoughts?
> > > > >
> > > > > I find:
> > > > >
> > > > >	smp_wmb()
> > > > >	smp_store_release()
> > > > >
> > > > > a _very_ weird construct. What is that supposed to even do?
> > > >
> > > > Indeed, it looks like one or the other of those is redundant (depending
> > > > on the context).
> > >
> > > Probably. Peter instead asked what it was supposed to even do. ;-)
> >
> > I agree, I think smp_wmb() is redundant here. Can't remember why I thought
> > that it's necessary, this algorithm went through a bunch of iterations,
> > starting as completely lockless, also using READ_ONCE/WRITE_ONCE at some
> > point, and settling on smp_read_acquire/smp_store_release, eventually. Maybe
> > there was some reason, but might be that I was just over-cautious. See reply
> > on patch thread as well ([0]).
> >
> > [0] https://lore.kernel.org/bpf/CAEf4Bza26AbRMtWcoD5+TFhnmnU6p5YJ8zO+SoAJCDtp1jVhcQ@xxxxxxxxxxxxxx/
> >
>
> While we are at it, could you explain a bit on why you use
> smp_store_release() on consumer_pos? I ask because IIUC, consumer_pos is
> only updated at consumer side, and there is no other write at consumer
> side that we want to order with the write to consumer_pos. So I fail
> to find why smp_store_release() is necessary.
>
> I did the following modification on litmus tests, and I didn't see
> different results (on States) between two versions of litmus tests.
>

This is needed to ensure that the producer can reliably detect whether
it needs to trigger a poll notification. Basically, when the consumer
catches up at about the same time as the producer commits a new record,
we need to make sure that:

  - either the consumer sees the updated producer_pos > consumer_pos,
    and thus knows that there is more data to consume (but the producer
    might not send a notification of new data in this case);
  - or the producer sees that the consumer has already caught up (i.e.,
    consumer_pos == producer_pos before the currently committed record),
    and in that case it will definitely send a notification.

This is critical for the correctness of epoll notifications.
Unfortunately, the litmus tests don't test this notification aspect, as
I originally couldn't figure out an invariant that could be defined to
validate it. I'll give it another thought, though, maybe this time I'll
come up with something.

> Regards,
> Boqun
> [...]
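
To make the either/or above a bit more concrete, here is a rough sketch
in plain C. It is not the kernel or libbpf code, and it deliberately
models no memory ordering at all: it only spells out which combinations
of observed values are fine and which one is the lost wakeup that the
ordering has to rule out. All function names and the positions used
(record at [8, 16), consumer moving from 0 to 8) are made up for
illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper (not kernel or libbpf code): the consumer knows
 * there is more data if the producer position it read is ahead of its
 * own position. */
bool consumer_sees_data(unsigned long seen_producer_pos,
			unsigned long consumer_pos)
{
	return seen_producer_pos > consumer_pos;
}

/* Hypothetical helper: the producer notifies if the consumer position
 * it read equals the start of the record it has just committed. */
bool producer_notifies(unsigned long seen_consumer_pos,
		       unsigned long rec_start)
{
	return seen_consumer_pos == rec_start;
}

/* The outcome that must never happen: the consumer misses the new
 * record and the producer also decides not to notify. */
bool lost_wakeup(unsigned long seen_producer_pos,
		 unsigned long consumer_pos,
		 unsigned long seen_consumer_pos,
		 unsigned long rec_start)
{
	return !consumer_sees_data(seen_producer_pos, consumer_pos) &&
	       !producer_notifies(seen_consumer_pos, rec_start);
}

/* Walk through the example: a new record is committed at [8, 16) while
 * the consumer advances consumer_pos from 0 to 8. */
void check_outcomes(void)
{
	const unsigned long rec_start = 8, rec_end = 16;
	const unsigned long old_cons = 0, new_cons = 8;

	/* Consumer read the new producer_pos: it keeps consuming, so it
	 * does not matter whether the producer notifies. */
	assert(!lost_wakeup(rec_end, new_cons, old_cons, rec_start));

	/* Consumer read the stale producer_pos, but the producer read the
	 * new consumer_pos: the producer notifies. */
	assert(!lost_wakeup(rec_start, new_cons, new_cons, rec_start));

	/* Both sides read stale values: nobody acts, the wakeup is lost. */
	assert(lost_wakeup(rec_start, new_cons, old_cons, rec_start));
}
```

The third case in check_outcomes() is the one a litmus test would need
to declare forbidden; the first two are the "either" and the "or" from
the list above.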