Re: [PATCH 07/10] netfs: Fix missing barriers by using clear_and_wake_up_bit()

David Howells wrote:
> [Adding Paul McKenney as he's the expert.]
> 
> Akira Yokosawa <akiyks@xxxxxxxxx> wrote:
> 
>> David Howells wrote:
>>> Use clear_and_wake_up_bit() rather than something like:
>>>
>>> 	clear_bit_unlock(NETFS_RREQ_IN_PROGRESS, &rreq->flags);
>>> 	wake_up_bit(&rreq->flags, NETFS_RREQ_IN_PROGRESS);
>>>
>>> as there needs to be a barrier inserted between which is present in
>>> clear_and_wake_up_bit().
>>
>> If I am reading the kernel-doc comment of clear_bit_unlock() [1, 2]:
>>
>>     This operation is atomic and provides release barrier semantics.
>>
>> correctly, there already seems to be a barrier which should be
>> good enough.
>>
>> [1]: https://www.kernel.org/doc/html/latest/core-api/kernel-api.html#c.clear_bit_unlock
>> [2]: include/asm-generic/bitops/instrumented-lock.h
>>
>>>
>>> Fixes: 288ace2f57c9 ("netfs: New writeback implementation")
>>> Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading")
>>
>> So I'm not sure this fixes anything.
>>
>> What am I missing?
> 
> We may need two barriers.  You have three things to synchronise:
> 
>  (1) The stuff you did before unlocking.
> 
>  (2) The lock bit.
> 
>  (3) The task state.
> 
> clear_bit_unlock() interposes a release barrier between (1) and (2).
> 
> Neither clear_bit_unlock() nor wake_up_bit(), however, necessarily interpose a
> barrier between (2) and (3).

Got it!

I was confused because I had compared only the kernel-doc comments of
clear_bit_unlock() and clear_and_wake_up_bit(), without looking at the
latter's code.

clear_and_wake_up_bit() has this description in its kernel-doc:

 * The designated bit is cleared and any tasks waiting in wait_on_bit()
 * or similar will be woken.  This call has RELEASE semantics so that
 * any changes to memory made before this call are guaranteed to be visible
 * after the corresponding wait_on_bit() completes.

It makes no mention of the additional full barrier between your (2) and
(3) above.

It might be worth mentioning it there.

Thoughts?

FWIW,

Reviewed-by: Akira Yokosawa <akiyks@xxxxxxxxx>

>                               I'm not sure it entirely matters, but it seems
> that since we have a function that combines the two, we should probably use
> it - though, granted, it might not actually be a fix.

Looks like it should matter on architectures where smp_mb__after_atomic()
is stronger than a plain barrier().
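
To convince myself, here is a minimal userspace sketch of the (2)/(3)
ordering using C11 atomics.  It is only an analogy, not the kernel code:
struct request, waiter_queued and the helper names are hypothetical, and
the seq_cst fence merely stands in for smp_mb__after_atomic().  The
waiter side is assumed to have a matching full barrier between queueing
itself and re-checking the bit.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define IN_PROGRESS (1UL << 0)	/* hypothetical lock bit */

struct request {
	int result;			/* (1) data written before unlocking */
	atomic_ulong flags;		/* (2) the lock bit */
	atomic_bool waiter_queued;	/* (3) stand-in for the waiter state */
};

static void request_init(struct request *r)
{
	r->result = 0;
	atomic_init(&r->flags, IN_PROGRESS);
	atomic_init(&r->waiter_queued, false);
}

/* Waker side: returns true if a wakeup has to be issued. */
static bool clear_and_check_waiter(struct request *r)
{
	/* RELEASE: orders (1) before the bit is seen clear in (2). */
	unsigned long old = atomic_fetch_and_explicit(&r->flags, ~IN_PROGRESS,
						      memory_order_release);

	assert(old & IN_PROGRESS);	/* clearing an unset bit is a bug */

	/* Full fence: orders the clear (2) before the waiter check (3). */
	atomic_thread_fence(memory_order_seq_cst);

	return atomic_load_explicit(&r->waiter_queued, memory_order_relaxed);
}

/* Waiter side: returns true if the bit is still set after queueing. */
static bool waiter_must_sleep(struct request *r)
{
	atomic_store_explicit(&r->waiter_queued, true, memory_order_relaxed);

	/* Matching full fence: queue first, then re-check the bit. */
	atomic_thread_fence(memory_order_seq_cst);

	return (atomic_load_explicit(&r->flags, memory_order_acquire) &
		IN_PROGRESS) != 0;
}
```

With both fences in place, the bad outcome (the waker sees no waiter
while the waiter still sees the bit set) is excluded.  Drop the waker's
fence and nothing orders the clear against the waiter check, so the
wakeup could be missed on a weakly ordered machine.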

Akira