Re: [PATCH rfc] nfs: propagate readlink errors in nfs_symlink_filler

Jeff Layton <jlayton@xxxxxxxxxx> · Wed, 22 May 2024 17:20:07 -0400

On Wed, 2024-05-22 at 21:04 +0000, Trond Myklebust wrote:
> On Wed, 2024-05-22 at 22:40 +0300, Sagi Grimberg wrote:
> > Hey Trond,
> > 
> > > > filehandle is stale? There will have been an unlink() on the
> > > > symlink at
> > > > some point in the recent past.
> > > > 
> > > 
> > > No reason that I can see. However given that this was observed in
> > > the 
> > > wild, and essentially
> > > a common pattern with symlinks (overwrite a config file for
> > > example), 
> > > I think its reasonable
> > > to have the vfs at least do a single retry, by simply returning
> > > ESTALE.
> > > However NFS cannot distinguish between first and second retries 
> > > afaict... Perhaps the
> > > vfs can help with a ESTALE->ENOENT conversion?
> > 
> > So what do you suggest we do here? IMO at a minimum NFS should retry 
> > once similar
> > to nfs4_file_open (it would probably address 99.9% of the use-cases 
> > where symlinks are
> > not overwritten in a high enough frequency for the client to see 2 
> > consecutive stale readlink
> > rpc rplies).
> > 
> > I can send a patch paired with a vfs ESTALE conversion patch? 
> > alternatively retry locally in NFS...
> > I would like to understand your position here.
> > 
> 
> Looking more closely at nfs_get_link(), it is obvious that it can
> already return ESTALE (thanks to the call to nfs_revalidate_mapping())
> and looking at do_readlinkat(), it has already been plumbed through
> with a call to retry_estale().
> 
> So I think we can take your patch as is, since it doesn't add any error
> cases that callers of readlink() don't have to handle already.
> 
> We might still want to think about cleaning up the output of the VFS in
> all these cases, so that we don't return ESTALE when it isn't allowed
> by POSIX, but that would be a separate task.
> 

I think we can effectively turn ESTALE into ENOENT in most (all?)
syscalls that take a pathname, since you can argue that it would have
been an ENOENT had we gotten in there just a little later.

To fix this the right way, I think you'd have to plumb this translation
into most path-based syscall handlers at a fairly high level. Maybe we
need some sort of generic sanitize_errno() handler that we call from
these sorts of calls?

In any case, I think that's a somewhat larger project. :)
-- 
Jeff Layton <jlayton@xxxxxxxxxx>