Re: [PATCH rfc] nfs: propagate readlink errors in nfs_symlink_filler

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 21/05/2024 18:13, Trond Myklebust wrote:
On Tue, 2024-05-21 at 18:05 +0300, Sagi Grimberg wrote:

On 21/05/2024 16:22, Jeff Layton wrote:
On Tue, 2024-05-21 at 15:58 +0300, Sagi Grimberg wrote:
There is an inherent race where a symlink file may have been
overriden
(by a different client) between lookup and readlink, resulting in
a
spurious EIO error returned to userspace. Fix this by propagating
back
ESTALE errors such that the vfs will retry the lookup/get_link
(similar
to nfs4_file_open) at least once.

Cc: Dan Aloni <dan.aloni@xxxxxxxxxxxx>
Signed-off-by: Sagi Grimberg <sagi@xxxxxxxxxxx>
---
Note that with this change the vfs should retry once for
ESTALE errors. However with an artificial reproducer of high
frequency symlink overrides, nothing prevents the retry to
also encounter ESTALE, propagating the error back to userspace.
The man pages for openat/readlinkat do not list an ESTALE errno.

An alternative attempt (implemented by Dan) was a local retry
loop
in nfs_get_link(), if this is an applicable approach, Dan can
share his patch instead.

   fs/nfs/symlink.c | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/nfs/symlink.c b/fs/nfs/symlink.c
index 0e27a2e4e68b..13818129d268 100644
--- a/fs/nfs/symlink.c
+++ b/fs/nfs/symlink.c
@@ -41,7 +41,7 @@ static int nfs_symlink_filler(struct file
*file, struct folio *folio)
   error:
   	folio_set_error(folio);
   	folio_unlock(folio);
-	return -EIO;
+	return error;
   }
  static const char *nfs_get_link(struct dentry *dentry,
git blame seems to indicate that we've returned -EIO here since the
beginning of the git era (and likely long before that). I see no
reason
for us to cloak the real error there though, especially with
something
like an ESTALE error.

      Reviewed-by: Jeff Layton <jlayton@xxxxxxxxxx>

FWIW, I think we shouldn't try to do any retry looping on ESTALE
beyond
what we already do.

Yes, we can sometimes trigger ESTALE errors to bubble up to
userland if
we really thrash the underlying filesystem when testing, but I
think
that's actually desirable:
Returning ESTALE would be an improvement over returning EIO IMO,
but it may be surprising for userspace to see an undocumented errno.
Maybe the man pages can be amended?

If you have real workloads across multiple machines that are racing
with other that tightly, then you should probably be using some
sort of
locking or other synchronization. If it's clever enough that it
doesn''t need that, then it should be able to deal with the
occasional
ESTALE error by retrying on its own.
I tend to agree. FWIW Solaris has a config knob for number of stale
retries
it does, maybe there is an appetite to have something like that as
well?

Any reason why we couldn't just return ENOENT in the case where the
filehandle is stale? There will have been an unlink() on the symlink at
some point in the recent past.


No reason that I can see. However given that this was observed in the wild, and essentially a common pattern with symlinks (overwrite a config file for example), I think its reasonable
to have the vfs at least do a single retry, by simply returning ESTALE.
However NFS cannot distinguish between first and second retries afaict... Perhaps the
vfs can help with a ESTALE->ENOENT conversion?




[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux