> On Jan 9, 2023, at 11:07, Olga Kornievskaia <aglo@xxxxxxxxx> wrote:
>
> On Mon, Jan 9, 2023 at 10:47 AM Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote:
>>
>>
>>
>>> On Jan 9, 2023, at 10:33, Olga Kornievskaia <aglo@xxxxxxxxx> wrote:
>>>
>>> On Thu, Jan 5, 2023 at 10:48 PM NeilBrown <neilb@xxxxxxx> wrote:
>>>>
>>>> On Fri, 06 Jan 2023, Trond Myklebust wrote:
>>>>> On Fri, 2023-01-06 at 09:56 +1100, NeilBrown wrote:
>>>>>> On Wed, 04 Jan 2023, NeilBrown wrote:
>>>>>>> On Wed, 04 Jan 2023, Trond Myklebust wrote:
>>>>>>>> On Wed, 2023-01-04 at 12:01 +1100, NeilBrown wrote:
>>>>>>>>> On Wed, 04 Jan 2023, Trond Myklebust wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> If the server starts to reply NFS4ERR_STALE to GETATTR
>>>>>>>>>> requests, why do we care about stateid values? Just mark the
>>>>>>>>>> inode as stale and drop it on the floor.
>>>>>>>>>
>>>>>>>>> We have a valid state from the server - we really shouldn't
>>>>>>>>> just ignore it.
>>>>>>>>>
>>>>>>>>> Maybe it would be OK to mark the inode stale. I don't know if
>>>>>>>>> various retry loops abort properly when the inode is stale.
>>>>>>>>
>>>>>>>> Yes, they are all supposed to do that. Otherwise we would end up
>>>>>>>> looping forever in close(), for instance, since the PUTFH,
>>>>>>>> GETATTR and CLOSE can all return NFS4ERR_STALE as well.
>>>>>>>
>>>>>>> To mark the inode as STALE we still need to find the inode, and
>>>>>>> that is the key bit missing in the current code. Once we find the
>>>>>>> inode, we could mark it stale, but maybe some other error
>>>>>>> resulted in the missing GETATTR...
>>>>>>>
>>>>>>> It might make sense to put the new code in _nfs4_proc_open()
>>>>>>> after the explicit nfs4_proc_getattr() fails. We would need to
>>>>>>> find the inode given only the filehandle. Is there any easy way
>>>>>>> to do that?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> NeilBrown
>>>>>>>
>>>>>>
>>>>>> I couldn't see a consistent pattern to follow for when to mark an
>>>>>> inode as stale. Does this, on top of the previous patch, seem
>>>>>> reasonable?
>>>>>>
>>>>>> Thanks,
>>>>>> NeilBrown
>>>>>>
>>>>>>
>>>>>> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
>>>>>> index b441b1d14a50..04497cb42154 100644
>>>>>> --- a/fs/nfs/nfs4proc.c
>>>>>> +++ b/fs/nfs/nfs4proc.c
>>>>>> @@ -489,6 +489,8 @@ static int nfs4_do_handle_exception(struct nfs_server *server,
>>>>>>          case -ESTALE:
>>>>>>                  if (inode != NULL && S_ISREG(inode->i_mode))
>>>>>>                          pnfs_destroy_layout(NFS_I(inode));
>>>>>> +                if (inode)
>>>>>> +                        nfs_set_inode_stale(inode);
>>>>>
>>>>> This is normally dealt with in the generic code inside
>>>>> nfs_revalidate_inode(). There should be no need to duplicate it
>>>>> here.
>>>>>
>>>>>>                  break;
>>>>>>          case -NFS4ERR_DELEG_REVOKED:
>>>>>>          case -NFS4ERR_ADMIN_REVOKED:
>>>>>> @@ -2713,8 +2715,12 @@ static int _nfs4_proc_open(struct nfs4_opendata *data,
>>>>>>                  return status;
>>>>>>          }
>>>>>>          if (!(o_res->f_attr->valid & NFS_ATTR_FATTR)) {
>>>>>> +                struct inode *inode = nfs4_get_inode_by_stateid(
>>>>>> +                                &data->o_res.stateid,
>>>>>> +                                data->owner);
>>>>>
>>>>> There shouldn't be a need to go looking for open descriptors here,
>>>>> because they will hit ESTALE at some point later anyway.
>>>>
>>>> The problem is that they don't hit ESTALE later. Unless we update
>>>> our stored stateid with the result of the OPEN, we keep using the
>>>> old stateid and keep getting the corresponding error.
>>>>
>>>> The only way to avoid the infinite loop is to either mark the inode
>>>> as stale, or update the stateid. For either of those we need to
>>>> find the inode. We don't have the fileid, so we cannot use iget. We
>>>> do have the filehandle and the stateid.
>>>>
>>>> Maybe you are saying this is a server bug that the client cannot be
>>>> expected to cope with at all, and that an infinite loop is a valid
>>>> client response to this particular server behaviour. In that case,
>>>> I guess no patch is needed.
>>>
>>> I'm not arguing that the server should do something else. But I
>>> would like to talk about it from the spec perspective. When
>>> PUTFH+WRITE is sent to the server (with an incorrect stateid) and it
>>> is processing the WRITE compound (the spec doesn't require the
>>> server to validate a filehandle on PUTFH; it says PUTFH is to "set"
>>> the current filehandle), the server is dealing with 2 errors that it
>>> can possibly return to the client: ERR_STALE and ERR_OLD_STATEID.
>>> There is nothing in the spec that speaks to the order in which these
>>> errors should be returned. Of course I'd like to say that the server
>>> should prioritize ERR_STALE over ERR_OLD_STATEID (simply because
>>> it's a MUST in the spec, while the ERR_OLD_STATEID cases are written
>>> as "rules").
>>>
>>
>> I disagree, for the reason already pointed to in the spec. There is
>> nothing in the spec that appears to allow PUTFH to return anything
>> other than NFS4ERR_STALE after the file has been deleted (and yes,
>> RFC 5661, Section 15.2 does list NFS4ERR_STALE as an error for
>> PUTFH). PUTFH is definitely required to validate the filehandle,
>> since it is the ONLY operation that can return NFS4ERR_BADHANDLE.
>
> We are talking about 4.0 and not 4.1. In 4.0, all operations that use
> the current filehandle can return ERR_BADHANDLE.

Same difference: RFC 7530, Section 4.2.2 uses the exact same language
as RFC 5661 to describe how filehandles work for deleted files, and
RFC 7530, Section 13.2 lists NFS4ERR_STALE as a valid error for PUTFH.

_________________________________
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx
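To make the disagreement concrete, below is a minimal user-space
simulation of the two server behaviours discussed in the thread: one
that validates the PUTFH filehandle before looking at the stateid, and
one that checks the stateid first. Everything in it (the toy_file
structure, the seqid values, the bounded retry loop) is a hypothetical
illustration, not kernel or nfsd code; the real client logic lives in
fs/nfs/nfs4proc.c.

/*
 * Hypothetical sketch of the failure mode under discussion: a client
 * holding a stateid for a file the server has since deleted.  If the
 * server validates the PUTFH filehandle first, the client sees
 * NFS4ERR_STALE and can mark the inode stale, breaking the loop.  If
 * the server checks the stateid first and replies NFS4ERR_OLD_STATEID,
 * a client that never refreshed its stored stateid retries forever.
 * All identifiers below are illustrative only.
 */
#include <stdbool.h>
#include <stdio.h>

enum nfsstat4 {
        NFS4_OK             = 0,
        NFS4ERR_STALE       = 70,
        NFS4ERR_OLD_STATEID = 10024,
};

struct toy_file {
        bool     deleted;        /* file removed on the server         */
        unsigned stateid_seqid;  /* current seqid for the open stateid */
};

/* Behaviour A: validate the filehandle in PUTFH first (the ordering
 * Trond argues RFC 7530, Section 13.2 requires). */
static enum nfsstat4 server_putfh_first(struct toy_file *f, unsigned seqid)
{
        if (f->deleted)
                return NFS4ERR_STALE;   /* PUTFH fails; WRITE never runs */
        if (seqid != f->stateid_seqid)
                return NFS4ERR_OLD_STATEID;
        return NFS4_OK;
}

/* Behaviour B: check the stateid before noticing the handle is stale. */
static enum nfsstat4 server_stateid_first(struct toy_file *f, unsigned seqid)
{
        if (seqid != f->stateid_seqid)
                return NFS4ERR_OLD_STATEID;
        if (f->deleted)
                return NFS4ERR_STALE;
        return NFS4_OK;
}

/* A toy client retry loop.  It has no fresher stateid to retry with,
 * mirroring the case above where the OPEN reply's GETATTR was missing
 * and the new stateid was never recorded against the inode. */
static void try_write(const char *label,
                      enum nfsstat4 (*op)(struct toy_file *, unsigned))
{
        struct toy_file f = { .deleted = true, .stateid_seqid = 2 };
        unsigned stale_seqid = 1;   /* the client's out-of-date stateid */
        int attempts;

        for (attempts = 1; attempts <= 5; attempts++) {
                if (op(&f, stale_seqid) == NFS4ERR_STALE) {
                        printf("%s: NFS4ERR_STALE after %d attempt(s); "
                               "mark inode stale and stop retrying\n",
                               label, attempts);
                        return;
                }
                /* NFS4ERR_OLD_STATEID: nothing newer to retry with */
        }
        printf("%s: still NFS4ERR_OLD_STATEID after %d attempts; "
               "this retry loop would never terminate\n",
               label, attempts - 1);
}

int main(void)
{
        try_write("PUTFH validated first", server_putfh_first);
        try_write("stateid checked first", server_stateid_first);
        return 0;
}

With the handle validated first, the client sees NFS4ERR_STALE on the
first attempt and can mark the inode stale; with the stateid checked
first, a client that never recorded the new stateid retries
indefinitely, which is exactly the loop NeilBrown describes above.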