Re: [PATCH v2] nfsd: map EBUSY to NFS4ERR_ACCESS for all operations

Chuck Lever <chuck.lever@xxxxxxxxxx> · Wed, 22 Jan 2025 10:04:01 -0500

On 1/22/25 4:05 AM, Amir Goldstein wrote:
On Tue, Jan 21, 2025 at 11:59 PM NeilBrown <neilb@xxxxxxx> wrote:

On Wed, 22 Jan 2025, Amir Goldstein wrote:
On Tue, Jan 21, 2025 at 8:45 PM Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:

Please send patches To: the NFSD reviewers listed in MAINTAINERS and
Cc: linux-nfs and others. Thanks!

On 1/21/25 5:39 AM, Amir Goldstein wrote:
Commit 466e16f0920f3 ("nfsd: check for EBUSY from vfs_rmdir/vfs_unink.")
mapped EBUSY host error from rmdir/unlink operation to avoid unknown
error server warning.

The same reason that casued the reported EBUSY on rmdir() (dir is a
local mount point in some other bind mount) could also cause EBUSY on
rename and some filesystems (e.g. FUSE) can return EBUSY on other
operations like open().

Therefore, to avoid unknown error warning in server, we need to map
EBUSY for all operations.

The original fix mapped EBUSY to NFS4ERR_FILE_OPEN in v4 server and
to NFS4ERR_ACCESS in v2/v3 server.

During the discussion on this issue, Trond claimed that the mapping
made from EBUSY to NFS4ERR_FILE_OPEN was incorrect according to the
protocol spec and specifically, NFS4ERR_FILE_OPEN is not expected
for directories.

NFS4ERR_FILE_OPEN might be incorrect when removing certain types of
file system objects. Here's what I find in RFC 8881 Section 18.25.4:

  > If a file has an outstanding OPEN and this prevents the removal of the
  > file's directory entry, the error NFS4ERR_FILE_OPEN is returned.

It's not normative, but it does suggest that any object that cannot be
associated with an OPEN state ID should never cause REMOVE to return
NFS4ERR_FILE_OPEN.

To keep things simple and consistent and avoid the server warning,
map EBUSY to NFS4ERR_ACCESS for all operations in all protocol versions.

Generally a "one size fits all" mapping for these status codes is
not going to cut it. That's why we have nfsd3_map_status() and
nfsd_map_status() -- the set of permitted status codes for each
operation is different for each NFS version.

NFSv3 has REMOVE and RMDIR. You can't pass a directory to NFSv3 REMOVE.

NFSv4 has only REMOVE, and it removes the directory entry for the
object no matter its type. The set of failure modes is different for
this operation compared to NFSv3 REMOVE.

Adding a specific mapping for -EBUSY in nfserrno() is going to have
unintended consequences for any VFS call NFSD might make that returns
-EBUSY.

I think I prefer that the NFSv4 cases be dealt with in nfsd4_remove(),
nfsd4_rename(), and nfsd4_link(), and that -EBUSY should continue to
trigger a warning.

Sorry, I didn't understand what you are suggesting.

I'm saying that we need to consider the errno -> NFS status code
mapping on a case-by-case basis first.

FUSE can return EBUSY for open().
What do you suggest to do when nfsd encounters EBUSY on open()?

vfs_rename() can return EBUSY.
What do you suggest to do when nfsd v3 encounters EBUSY on rename()?

I totally agree that we do not want NFSD to leak -EBUSY to NFS clients.

But we do need to examine all the ways -EBUSY can leak through to the
NFS protocol layers (nfs?proc.c). The mapping is not going to be the
same for every NFS operation in every NFS version. (or, at least we
need to examine these cases closely and decide that nfserr_access is
the closest we can get for /every/ case).

This sort of assertion:
         WARN_ONCE(1, "nfsd: non-standard errno: %d\n", errno);

Is a code assertion for a situation that should not be possible in the
code and certainly not possible to trigger by userspace.

Both cases above could trigger the warning from userspace.
If you want to leave the warning it should not be a WARN_ONCE()
assertion, but I must say that I did not understand the explanation
for not mapping EBUSY by default to NFS4ERR_ACCESS in nfserrno().

My answer to this last question is that it isn't obvious that EBUSY
should map to NFS4ERR_ACCESS.
I would rather that nfsd explicitly checked the error from unlink/rmdir and
mapped EBUSY to NFS4ERR_ACCESS (if we all agree that is best) with a
comment (like we have now) explaining why it is best.

Can you please suggest the text for this comment because I did not
understand the reasoning for the error.
All I argued for is conformity to NFSv2/3, but you are the one who chose
NFS3ERR_ACCES for v2/3 mapping and I don't know what is the
reasoning for this error code. All I have is:
"For NFSv3, the best we can do is probably NFS3ERR_ACCES,
   which isn't true, but is not less true than the other options."

You're proposing to change the behavior of NFSv4 to match NFSv2/3, and
that's where we might need to take a moment. The NFSv4 protocol provides
a richer set of status codes to report this situation.

And nfsd should explicitly check the error from open() and map EBUSY to
whatever seems appropriate.  Maybe that is also NS4ERR_ACCESS but if it
is, the reason is likely different to the reason that it is best for
rmdir.
So again, I would like a comment in the code explaining the choice with
a reference to FUSE.

My specific FUSE filesystem can return EBUSY for open(), but FUSE
filesystem in general can return EBUSY for any operation if that is what
the userspace server returns.

Fair, that suggests that eventually we might need the general nfserrno
mapping in addition to some individual checking in NFS operation- and
version-specific code. I'm not ready to leap to that conclusion yet.

Then if some other function that we haven't thought about starts
returning EBUSY, we'll get warning and have a change to think about it.

I have no objection to that, but I think that the WARN_ONCE should be
converted to pr_warn_once() or pr_warn_ratelimited() because userspace
should not be able to trigger a WARN_ON in any case.

It isn't user space that's the concern -- it's that NFS clients can
trigger this warning. If a client accidentally or maliciously triggers
it repeatedly, it can fill the NFS server's system journal.

Our general policy is that we use the _ONCE variants to prevent a remote
attack from overflowing the server's root partition.

I realize the great value of the stack trace that WARN_ON provides in
this scenario, but if we include in the warning the operation id and the
filesystem sid I think that would be enough information to understand
where the unmapped error is coming from.

Hm. The stack trace tells us all that without having to add the extra
(rarely used) arguments to nfserrno. I'm OK with a stack trace here
because this is information for developers, who can make sense of it.
It's not a message that a system admin or user needs to understand.

This is not expected stop the whack-a-mole of patches like mine and this one:

It will be whack-a-mole. Unfortunately there is no way around that
because of the case-by-case auditing that is necessary.

340e61e44c1d2 ("nfsd: map the EBADMSG to nfserr_io to avoid warning")
but at least the severity of the issues will be reduced without the scary
WARN_ON splat.

340e61e44c1d2 is one tactic for dealing with these issues, but we do
need case-by-case audit to have confidence that this tactic is
appropriate for each errno value.

I can write a patch if there is an agreement on that.

Thanks,
Amir.

--
Chuck Lever