David Howells wrote:
> [Adding Paul McKenney as he's the expert.]
>
> Akira Yokosawa <akiyks@xxxxxxxxx> wrote:
>
>> David Howells wrote:
>>> Use clear_and_wake_up_bit() rather than something like:
>>>
>>> 	clear_bit_unlock(NETFS_RREQ_IN_PROGRESS, &rreq->flags);
>>> 	wake_up_bit(&rreq->flags, NETFS_RREQ_IN_PROGRESS);
>>>
>>> as there needs to be a barrier inserted between which is present in
>>> clear_and_wake_up_bit().
>>
>> If I am reading the kernel-doc comment of clear_bit_unlock() [1, 2]:
>>
>>     This operation is atomic and provides release barrier semantics.
>>
>> correctly, there already seems to be a barrier which should be
>> good enough.
>>
>> [1]: https://www.kernel.org/doc/html/latest/core-api/kernel-api.html#c.clear_bit_unlock
>> [2]: include/asm-generic/bitops/instrumented-lock.h
>>
>>>
>>> Fixes: 288ace2f57c9 ("netfs: New writeback implementation")
>>> Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading")
>>
>> So I'm not sure this fixes anything.
>>
>> What am I missing?
>
> We may need two barriers.  You have three things to synchronise:
>
>  (1) The stuff you did before unlocking.
>
>  (2) The lock bit.
>
>  (3) The task state.
>
> clear_bit_unlock() interposes a release barrier between (1) and (2).
>
> Neither clear_bit_unlock() nor wake_up_bit(), however, necessarily interpose a
> barrier between (2) and (3).

Got it!

I was confused because I compared only the kernel-doc comments of
clear_bit_unlock() and clear_and_wake_up_bit(), without looking at the
latter's code.

clear_and_wake_up_bit() has this description in its kernel-doc:

 * The designated bit is cleared and any tasks waiting in wait_on_bit()
 * or similar will be woken.  This call has RELEASE semantics so that
 * any changes to memory made before this call are guaranteed to be visible
 * after the corresponding wait_on_bit() completes.

but it makes no mention of the additional full barrier between your (2)
and (3) above.

It might be worth mentioning it there.

Thoughts?
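To check my understanding of the (2)/(3) ordering, here is a userspace
C11 analogy. It is only a sketch, not the kernel code: struct req,
IN_PROGRESS_BIT, waiter_queued and the sketch_*() helpers are all made up
for illustration, and the seq_cst fence merely stands in for the full
barrier that clear_and_wake_up_bit() places between clearing the bit and
waking waiters.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define IN_PROGRESS_BIT 0x1UL	/* hypothetical stand-in for the lock bit */

struct req {
	int result;			/* (1) stuff done before unlocking */
	atomic_ulong flags;		/* (2) the lock bit */
	atomic_bool waiter_queued;	/* (3) stand-in for the task/waitqueue state */
};

/* Waker: clear the bit, then decide whether anyone needs waking. */
static bool sketch_clear_and_wake(struct req *r)
{
	/* RELEASE: orders (1) before the clearing of (2). */
	atomic_fetch_and_explicit(&r->flags, ~IN_PROGRESS_BIT,
				  memory_order_release);
	/* Full fence: orders the store to (2) before the load of (3). */
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&r->waiter_queued, memory_order_relaxed);
}

/* Waiter: announce itself, then decide whether it must sleep. */
static bool sketch_prepare_to_wait(struct req *r)
{
	atomic_store_explicit(&r->waiter_queued, true, memory_order_relaxed);
	/* Full fence: orders the store to (3) before the load of (2). */
	atomic_thread_fence(memory_order_seq_cst);
	return (atomic_load_explicit(&r->flags, memory_order_acquire) &
		IN_PROGRESS_BIT) != 0;
}

/* Single-threaded walk-through; it exercises the calls, not the race. */
static int sketch_demo(void)
{
	struct req r;
	bool must_sleep, must_wake;

	r.result = 0;
	atomic_init(&r.flags, IN_PROGRESS_BIT);
	atomic_init(&r.waiter_queued, false);

	must_sleep = sketch_prepare_to_wait(&r);	/* bit still set */
	r.result = 42;
	must_wake = sketch_clear_and_wake(&r);		/* waiter is queued */
	assert(must_sleep && must_wake);

	/* The bit is now clear, so a re-check must not sleep. */
	return !sketch_prepare_to_wait(&r) && r.result == 42;
}
```

With both full fences in place, the outcome "waker saw no waiter and
waiter saw the bit still set" should be impossible, which is the lost
wake-up case. With only the release on the waker side, as I read it,
nothing rules that outcome out.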
FWIW,

Reviewed-by: Akira Yokosawa <akiyks@xxxxxxxxx>

> I'm not sure it entirely matters, but it seems
> that since we have a function that combines the two, we should probably use
> it - though, granted, it might not actually be a fix.

It looks like it should matter on architectures where
smp_mb__after_atomic() is stronger than a plain barrier().

        Akira