On Wed, 04 Jan 2023, NeilBrown wrote:
> On Wed, 04 Jan 2023, Olga Kornievskaia wrote:
> > On Tue, Jan 3, 2023 at 7:46 PM Trond Myklebust <trondmy@xxxxxxxxxx> wrote:
> > >
> > >
> > > If the server starts to reply NFS4ERR_STALE to GETATTR requests, why do
> > > we care about stateid values?
> >
> > It is acceptable for the server to return ESTALE to the GETATTR after
> > the processing the open (due to a REMOVE that comes in) and that open
> > generating a valid stateid which client should care about when there
> > are pre-existing opens. The server will keep the state of an existing
> > opens valid even if the file is removed. Which is what's happening,
> > the previous open is being used for IO but the stateid is updated on
> > the server but not on the client.
>
> I agree that it is acceptable to return ESTALE to the GETATTR, but
> having done that I don't think it is acceptable for a PUTFH of the same
> filehandle to succeed. Certainly any attempt to again use the
> filehandle after the PUTFH should fail with NFS4ERR_STALE.
>
> RFC7530 says
>
> 13.1.2.7. NFS4ERR_STALE (Error Code 70)
>
> The current or saved filehandle value designating an argument to the
> current operation is invalid. The file system object referred to by
> that filehandle no longer exists, or access to it has been revoked.
>
> So the file doesn't exist or access has been revoked. So any writes
> should fail. Failing with OLD_STATEID is weird - and having writes
> succeed if we use the correct stateid is also odd. Failing with STALE
> would be perfectly sensible and I suspect the Linux client would handle
> that just fine.

I checked a recent tcpdump (with patched SLE kernel talking to Netapp)
and I see that the writes don't succeed after the first NFS4ERR_STALE.

If the "correct" stateid is given to WRITE, it returns NFS4ERR_STALE.
If the older stateid is given to WRITE, it returns NFS4ERR_OLD_STATEID.

So it seems that it just has these two checks in the wrong order.

NeilBrown
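
P.S. To make the ordering point concrete, here is a rough toy model of
the two checks. It is only a sketch under my own assumptions: it is not
the Netapp code or the Linux server code, the names (write_status,
filehandle_first) are made up for illustration, and it ignores seqid
wraparound.

```python
from enum import Enum


class Status(Enum):
    OK = "NFS4_OK"
    STALE = "NFS4ERR_STALE"
    OLD_STATEID = "NFS4ERR_OLD_STATEID"


def write_status(fh_is_stale, request_seqid, current_seqid,
                 filehandle_first=True):
    """Status a hypothetical server returns for a WRITE.

    fh_is_stale      -- the filehandle no longer designates a usable object
    request_seqid    -- stateid seqid the client sent
    current_seqid    -- stateid seqid the server currently holds
    filehandle_first -- True: check the filehandle before the stateid
                        False: check the stateid first
    """
    # Simplification: plain integer comparison, no wraparound handling.
    stateid_is_old = request_seqid < current_seqid

    if filehandle_first:
        if fh_is_stale:
            return Status.STALE
        if stateid_is_old:
            return Status.OLD_STATEID
    else:
        if stateid_is_old:
            return Status.OLD_STATEID
        if fh_is_stale:
            return Status.STALE
    return Status.OK
```

With filehandle_first=False the model gives the results seen in the
tcpdump: the older stateid gets NFS4ERR_OLD_STATEID and the current one
gets NFS4ERR_STALE. With filehandle_first=True both writes fail with
NFS4ERR_STALE, which is the behaviour I would have expected.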