Re: [PATCH v15 16/26] nfsd: add LOCALIO support

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




> On Sep 5, 2024, at 10:21 AM, Mike Snitzer <snitzer@xxxxxxxxxx> wrote:
> 
> On Wed, Sep 04, 2024 at 09:47:07AM -0400, Chuck Lever wrote:
>> On Wed, Sep 04, 2024 at 03:01:46PM +1000, NeilBrown wrote:
>>> On Wed, 04 Sep 2024, NeilBrown wrote:
>>>> 
>>>> I agree that dropping and reclaiming a lock is an anti-pattern and in
>>>> best avoided in general.  I cannot see a better alternative in this
>>>> case.
>>> 
>>> It occurred to me what I should spell out the alternate that I DO see so
>>> you have the option of disagreeing with my assessment that it isn't
>>> "better".
>>> 
>>> We need RCU to call into nfsd, we need a per-cpu ref on the net (which
>>> we can only get inside nfsd) and NOT RCU to call
>>> nfsd_file_acquire_local().
>>> 
>>> The current code combines these (because they are only used together)
>>> and so the need to drop rcu. 
>>> 
>>> I thought briefly that it could simply drop rcu and leave it dropped
>>> (__releases(rcu)) but not only do I generally like that LESS than
>>> dropping and reclaiming, I think it would be buggy.  While in the nfsd
>>> module code we need to be holding either rcu or a ref on the server else
>>> the code could disappear out from under the CPU.  So if we exit without
>>> a ref on the server - which we do if nfsd_file_acquire_local() fails -
>>> then we need to reclaim RCU *before* dropping the ref.  So the current
>>> code is slightly buggy.
>>> 
>>> We could instead split the combined call into multiple nfs_to
>>> interfaces.
>>> 
>>> So nfs_open_local_fh() in nfs_common/nfslocalio.c would be something
>>> like:
>>> 
>>> rcu_read_lock();
>>> net = READ_ONCE(uuid->net);
>>> if (!net || !nfs_to.get_net(net)) {
>>>       rcu_read_unlock();
>>>       return ERR_PTR(-ENXIO);
>>> }
>>> rcu_read_unlock();
>>> localio = nfs_to.nfsd_open_local_fh(....);
>>> if (IS_ERR(localio))
>>>       nfs_to.put_net(net);
>>> return localio;
>>> 
>>> So we have 3 interfaces instead of 1, but no hidden unlock/lock.
>> 
>> Splitting up the function call occurred to me as well, but I didn't
>> come up with a specific bit of surgery. Thanks for the suggestion.
>> 
>> At this point, my concern is that we will lose your cogent
>> explanation of why the release/lock is done. Having it in email is
>> great, but email is more ephemeral than actually putting it in the
>> code.
>> 
>> 
>>> As I said, I don't think this is a net win, but reasonable people might
>>> disagree with me.
>> 
>> The "win" here is that it makes this code self-documenting and
>> somewhat less likely to be broken down the road by changes in and
>> around this area. Since I'm more forgetful these days I lean towards
>> the more obvious kinds of coding solutions. ;-)
>> 
>> Mike, how do you feel about the 3-interface suggestion?
> 
> I dislike expanding from 1 indirect function call to 2 in rapid
> succession (3 for the error path, not a problem, just being precise.
> But I otherwise like it.. maybe.. heh.
> 
> FYI, I did run with the suggestion to make nfs_to a pointer that just
> needs a simple assignment rather than memcpy to initialize.  So Neil's
> above code becames:
> 
>        rcu_read_lock();
>        net = rcu_dereference(uuid->net);
>        if (!net || !nfs_to->nfsd_serv_try_get(net)) {
>                rcu_read_unlock();
>                return ERR_PTR(-ENXIO);
>        }
>        rcu_read_unlock();
>        /* We have an implied reference to net thanks to nfsd_serv_try_get */
>        localio = nfs_to->nfsd_open_local_fh(net, uuid->dom, rpc_clnt,
>                                             cred, nfs_fh, fmode);
>        if (IS_ERR(localio))
>                nfs_to->nfsd_serv_put(net);
>        return localio;
> 
> I do think it cleans the code up... full patch is here:
> https://git.kernel.org/pub/scm/linux/kernel/git/snitzer/linux.git/commit/?h=nfs-localio-for-next.v15-with-fixups&id=e85306941878a87070176702de687f2779436061
> 
> But I'm still on the fence.. someone help push me over!

I wasn't expecting it would be less ugly, but it does look
harder to screw up down the road when we've forgotten why
the API looks this way.


> Tangent, but in the related business of "what are next steps?":
> 
> I updated headers with various provided Reviewed-by:s and Acked-by:s,
> fixed at least 1 commit header, fixed some sparse issues, various
> fixes to nfs_to patch (removed EXPORT_SYMBOL_GPL, switched to using
> pointer, updated nfs_to callers). Etc...
> 
> But if I fold those changes in I compromise the provided Reviewed-by
> and Acked-by.. so I'm leaning toward posting a v16 that has
> these incremental fixes/improvements, see the 3 topmost commits here:
> https://git.kernel.org/pub/scm/linux/kernel/git/snitzer/linux.git/log/?h=nfs-localio-for-next.v15-with-fixups
> 
> Or if you can review the incremental patches I can fold them in and
> preserve the various Reviewed-by and Acked-by...

For the three topmost patches in that branch:

Reviewed-by: Chuck Lever <chuck.lever@xxxxxxxxxx <mailto:chuck.lever@xxxxxxxxxx>>

HTH


> You can also see incremental diff from .v15 to .v15-with-fixups with:
> git remote update snitzer
> git diff snitzer/nfs-localio-for-next.v15 snitzer/nfs-localio-for-next.v15-with-fixups
> 
> Either way, I should post a v16 right?  SO question is: should I fold
> these incremental changes in to the original or keep them split out?
> 
> I'm good with whatever you guys think.  But whatever is decided: this
> needs to be the handoff point to focused NFS client review and hopeful
> staging for 6.12 inclusion, I've pivoted to working with Trond to
> make certain he is good with everything.


--
Chuck Lever






[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux