Re: [PATCH rfc] nfs: propagate readlink errors in nfs_symlink_filler

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2024-05-21 15:24:19, Chuck Lever III wrote:
> 
> 
> > On May 21, 2024, at 11:13 AM, Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote:
> > 
> > On Tue, 2024-05-21 at 18:05 +0300, Sagi Grimberg wrote:
> >> 
> >> 
> >> On 21/05/2024 16:22, Jeff Layton wrote:
> >>> On Tue, 2024-05-21 at 15:58 +0300, Sagi Grimberg wrote:
> >>>> There is an inherent race where a symlink file may have been
> >>>> overriden
> >>>> (by a different client) between lookup and readlink, resulting in
> >>>> a
> >>>> spurious EIO error returned to userspace. Fix this by propagating
> >>>> back
> >>>> ESTALE errors such that the vfs will retry the lookup/get_link
> >>>> (similar
> >>>> to nfs4_file_open) at least once.
> >>>> 
> >>>> Cc: Dan Aloni <dan.aloni@xxxxxxxxxxxx>
> >>>> Signed-off-by: Sagi Grimberg <sagi@xxxxxxxxxxx>
> >>>> ---
> >>>> Note that with this change the vfs should retry once for
> >>>> ESTALE errors. However with an artificial reproducer of high
> >>>> frequency symlink overrides, nothing prevents the retry to
> >>>> also encounter ESTALE, propagating the error back to userspace.
> >>>> The man pages for openat/readlinkat do not list an ESTALE errno.
> >>>> 
> >>>> An alternative attempt (implemented by Dan) was a local retry
> >>>> loop
> >>>> in nfs_get_link(), if this is an applicable approach, Dan can
> >>>> share his patch instead.
> >>>> 
> >>>>   fs/nfs/symlink.c | 2 +-
> >>>>   1 file changed, 1 insertion(+), 1 deletion(-)
> >>>> 
> >>>> diff --git a/fs/nfs/symlink.c b/fs/nfs/symlink.c
> >>>> index 0e27a2e4e68b..13818129d268 100644
> >>>> --- a/fs/nfs/symlink.c
> >>>> +++ b/fs/nfs/symlink.c
> >>>> @@ -41,7 +41,7 @@ static int nfs_symlink_filler(struct file
> >>>> *file, struct folio *folio)
> >>>>   error:
> >>>>    folio_set_error(folio);
> >>>>    folio_unlock(folio);
> >>>> - return -EIO;
> >>>> + return error;
> >>>>   }
> >>>>   
> >>>>   static const char *nfs_get_link(struct dentry *dentry,
> >>> git blame seems to indicate that we've returned -EIO here since the
> >>> beginning of the git era (and likely long before that). I see no
> >>> reason
> >>> for us to cloak the real error there though, especially with
> >>> something
> >>> like an ESTALE error.
> >>> 
> >>>      Reviewed-by: Jeff Layton <jlayton@xxxxxxxxxx>
> >>> 
> >>> FWIW, I think we shouldn't try to do any retry looping on ESTALE
> >>> beyond
> >>> what we already do.
> >>> 
> >>> Yes, we can sometimes trigger ESTALE errors to bubble up to
> >>> userland if
> >>> we really thrash the underlying filesystem when testing, but I
> >>> think
> >>> that's actually desirable:
> >> 
> >> Returning ESTALE would be an improvement over returning EIO IMO,
> >> but it may be surprising for userspace to see an undocumented errno.
> >> Maybe the man pages can be amended?
> >> 
> >>> 
> >>> If you have real workloads across multiple machines that are racing
> >>> with other that tightly, then you should probably be using some
> >>> sort of
> >>> locking or other synchronization. If it's clever enough that it
> >>> doesn''t need that, then it should be able to deal with the
> >>> occasional
> >>> ESTALE error by retrying on its own.
> >> 
> >> I tend to agree. FWIW Solaris has a config knob for number of stale
> >> retries
> >> it does, maybe there is an appetite to have something like that as
> >> well?
> >> 
> > 
> > Any reason why we couldn't just return ENOENT in the case where the
> > filehandle is stale? There will have been an unlink() on the symlink at
> > some point in the recent past.
> 
> To me ENOENT is preferable to both EIO and ESTALE.

Another view on that, where in the scenario of `rename` causing the
unlinking, there was no situation of 'no entry' as the directory entry
was only updated and not removed. So ENOENT in this regard by the
meaning of 'no entry' would not reflect what has really happened.

(unless you go with the 'no entity' interpretation of ENOENT, but that
would be against most of the POSIX-spec cases where ENOENT is returned
which deal primarily with missing path components i.e. names to
objects and not the objects themselves)

-- 
Dan Aloni




[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux