Re: [PATCH v2] NFSD: Fix RCU-related sparse splat

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




> On Nov 30, 2021, at 8:38 PM, NeilBrown <neilb@xxxxxxx> wrote:
> 
> On Wed, 01 Dec 2021, Chuck Lever III wrote:
>> 
>> By way of further explanation:
>> 
>> The Documentation/ for RCU (ie, "RCU for Dummies") suggests that
>> rcu_assign_pointer() is adequate for preventing undesirable
>> compiler optimizations or CPU load/store reordering.
>> 
>> rcu_assign_pointer() uses WRITE_ONCE for constants like NULL and
>> smp_store_release() when assigning variable values. I copied the
>> use of WRITE_ONCE() from rcu_assign_pointer(). So I expect exactly
>> zero change in behavior or compiled code... (but I should have
>> checked the generated object to verify that is the case).
> 
> True, there would be no change in behaviour - but we should at least ask
> if that behaviour is correct, and why.
> 
> If any barriers are needed here, they would have to be between the
> assignment of NULL here, and the tests of l->net in
>   nfsd_file_list_add_disposal()
>   nfsd_free_fcache_disposal_net()
> (because they are the only places ->net is used)
> The only conceivable race is that they will see a value in ->net "after"
> NULL has been assigned.
> If there were a race there, it would be between different cpus, so
> smp_store_relase() and smp_load_acquire()
> would be the correct tools to avoid that race.

smp_load_acquire() is not used in those functions, so one might
draw the conclusion that either the RCU annotations are incorrect,
or that degree of concurrence paranoia (including using
rcu_assign_pointer() to assign the NULL) is unnecessary.


> If, on the other hand, there is no chance of a race, then there is no
> need to assign NULL to ->net at all.  I believe this is actually the
> case.
> As the 'net' is freed using kfree_rcu, there is no possibility for a
> search that started before something was removed from the list to get a
> false-positive when testing if l->net == net.
> 
> So while your change is safe from a behavioural perspective, I don't
> think it is safe from a maintenance perspective because it leaves in
> place code that doesn't really make sense, but removes the warning that
> helps us to find that nonsense.

Well I agree that the code is either hard to follow, incomplete,
or incorrect, and therefore needs to be cleaned up so it does not
accrue further technical debt.


>> Sure. If it makes sense to use nfsd_net instead of a separate data
>> structure, then that can be made to happen. I would like to hear
>> from Trond regarding why he felt a separate data structure was
>> necessary for fcache_disposal.
> 
> That would be best, yes.

Since Trond was active recently, I think we can conclude from his
silence he does not care and we are free to act as we see fit.

I'm comfortable starting with https://lore.kernel.org/linux-nfs/163831669455.26075.14420575954675785122@xxxxxxxxxxxxxxxxxxxxx/raw

--
Chuck Lever







[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux