Re: [PATCH v2] nfsd: map EBUSY to NFS4ERR_ACCESS for all operations

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Jan 22, 2025 at 4:04 PM Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
>
> On 1/22/25 4:05 AM, Amir Goldstein wrote:
> > On Tue, Jan 21, 2025 at 11:59 PM NeilBrown <neilb@xxxxxxx> wrote:
> >>
> >> On Wed, 22 Jan 2025, Amir Goldstein wrote:
> >>> On Tue, Jan 21, 2025 at 8:45 PM Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
> >>>>
> >>>> Please send patches To: the NFSD reviewers listed in MAINTAINERS and
> >>>> Cc: linux-nfs and others. Thanks!
> >>>>
> >>>>
> >>>> On 1/21/25 5:39 AM, Amir Goldstein wrote:
> >>>>> Commit 466e16f0920f3 ("nfsd: check for EBUSY from vfs_rmdir/vfs_unink.")
> >>>>> mapped EBUSY host error from rmdir/unlink operation to avoid unknown
> >>>>> error server warning.
> >>>>
> >>>>> The same reason that casued the reported EBUSY on rmdir() (dir is a
> >>>>> local mount point in some other bind mount) could also cause EBUSY on
> >>>>> rename and some filesystems (e.g. FUSE) can return EBUSY on other
> >>>>> operations like open().
> >>>>>
> >>>>> Therefore, to avoid unknown error warning in server, we need to map
> >>>>> EBUSY for all operations.
> >>>>>
> >>>>> The original fix mapped EBUSY to NFS4ERR_FILE_OPEN in v4 server and
> >>>>> to NFS4ERR_ACCESS in v2/v3 server.
> >>>>>
> >>>>> During the discussion on this issue, Trond claimed that the mapping
> >>>>> made from EBUSY to NFS4ERR_FILE_OPEN was incorrect according to the
> >>>>> protocol spec and specifically, NFS4ERR_FILE_OPEN is not expected
> >>>>> for directories.
> >>>>
> >>>> NFS4ERR_FILE_OPEN might be incorrect when removing certain types of
> >>>> file system objects. Here's what I find in RFC 8881 Section 18.25.4:
> >>>>
> >>>>   > If a file has an outstanding OPEN and this prevents the removal of the
> >>>>   > file's directory entry, the error NFS4ERR_FILE_OPEN is returned.
> >>>>
> >>>> It's not normative, but it does suggest that any object that cannot be
> >>>> associated with an OPEN state ID should never cause REMOVE to return
> >>>> NFS4ERR_FILE_OPEN.
> >>>>
> >>>>
> >>>>> To keep things simple and consistent and avoid the server warning,
> >>>>> map EBUSY to NFS4ERR_ACCESS for all operations in all protocol versions.
> >>>>
> >>>> Generally a "one size fits all" mapping for these status codes is
> >>>> not going to cut it. That's why we have nfsd3_map_status() and
> >>>> nfsd_map_status() -- the set of permitted status codes for each
> >>>> operation is different for each NFS version.
> >>>>
> >>>> NFSv3 has REMOVE and RMDIR. You can't pass a directory to NFSv3 REMOVE.
> >>>>
> >>>> NFSv4 has only REMOVE, and it removes the directory entry for the
> >>>> object no matter its type. The set of failure modes is different for
> >>>> this operation compared to NFSv3 REMOVE.
> >>>>
> >>>> Adding a specific mapping for -EBUSY in nfserrno() is going to have
> >>>> unintended consequences for any VFS call NFSD might make that returns
> >>>> -EBUSY.
> >>>>
> >>>> I think I prefer that the NFSv4 cases be dealt with in nfsd4_remove(),
> >>>> nfsd4_rename(), and nfsd4_link(), and that -EBUSY should continue to
> >>>> trigger a warning.
> >>>>
> >>>>
> >>>
> >>> Sorry, I didn't understand what you are suggesting.
>
> I'm saying that we need to consider the errno -> NFS status code
> mapping on a case-by-case basis first.
>
>
> >>> FUSE can return EBUSY for open().
> >>> What do you suggest to do when nfsd encounters EBUSY on open()?
> >>>
> >>> vfs_rename() can return EBUSY.
> >>> What do you suggest to do when nfsd v3 encounters EBUSY on rename()?
>
> I totally agree that we do not want NFSD to leak -EBUSY to NFS clients.
>
> But we do need to examine all the ways -EBUSY can leak through to the
> NFS protocol layers (nfs?proc.c). The mapping is not going to be the
> same for every NFS operation in every NFS version. (or, at least we
> need to examine these cases closely and decide that nfserr_access is
> the closest we can get for /every/ case).
>
>
> >>> This sort of assertion:
> >>>          WARN_ONCE(1, "nfsd: non-standard errno: %d\n", errno);
> >>>
> >>> Is a code assertion for a situation that should not be possible in the
> >>> code and certainly not possible to trigger by userspace.
> >>>
> >>> Both cases above could trigger the warning from userspace.
> >>> If you want to leave the warning it should not be a WARN_ONCE()
> >>> assertion, but I must say that I did not understand the explanation
> >>> for not mapping EBUSY by default to NFS4ERR_ACCESS in nfserrno().
> >>
> >> My answer to this last question is that it isn't obvious that EBUSY
> >> should map to NFS4ERR_ACCESS.
> >> I would rather that nfsd explicitly checked the error from unlink/rmdir and
> >> mapped EBUSY to NFS4ERR_ACCESS (if we all agree that is best) with a
> >> comment (like we have now) explaining why it is best.
> >
> > Can you please suggest the text for this comment because I did not
> > understand the reasoning for the error.
> > All I argued for is conformity to NFSv2/3, but you are the one who chose
> > NFS3ERR_ACCES for v2/3 mapping and I don't know what is the
> > reasoning for this error code. All I have is:
> > "For NFSv3, the best we can do is probably NFS3ERR_ACCES,
> >    which isn't true, but is not less true than the other options."
>
> You're proposing to change the behavior of NFSv4 to match NFSv2/3, and
> that's where we might need to take a moment. The NFSv4 protocol provides
> a richer set of status codes to report this situation.
>
>

To be fair, I did not propose that in patch v1:
https://lore.kernel.org/linux-nfs/20250120172016.397916-1-amir73il@xxxxxxxxx/

I proposed to keep the EBUSY -> NFS4ERR_FILE_OPEN mapping for v4
and extend the operations that it applies to.
Trond had reservations about his mapping.
I have no problem with going back to v1 mapping and reducing the
mapped operations to rmdir/unlink/rename/open or any other mapping
that you prefer.

> >> And nfsd should explicitly check the error from open() and map EBUSY to
> >> whatever seems appropriate.  Maybe that is also NS4ERR_ACCESS but if it
> >> is, the reason is likely different to the reason that it is best for
> >> rmdir.
> >> So again, I would like a comment in the code explaining the choice with
> >> a reference to FUSE.
> >
> > My specific FUSE filesystem can return EBUSY for open(), but FUSE
> > filesystem in general can return EBUSY for any operation if that is what
> > the userspace server returns.
>
> Fair, that suggests that eventually we might need the general nfserrno
> mapping in addition to some individual checking in NFS operation- and
> version-specific code. I'm not ready to leap to that conclusion yet.
>
>

I am fine with handling EBUSY in unlink/rmdir/rename/open
only for now if that is what everyone prefers.

> >> Then if some other function that we haven't thought about starts
> >> returning EBUSY, we'll get warning and have a change to think about it.
> >>
> >
> > I have no objection to that, but I think that the WARN_ONCE should be
> > converted to pr_warn_once() or pr_warn_ratelimited() because userspace
> > should not be able to trigger a WARN_ON in any case.
>
> It isn't user space that's the concern -- it's that NFS clients can
> trigger this warning. If a client accidentally or maliciously triggers
> it repeatedly, it can fill the NFS server's system journal.
>
> Our general policy is that we use the _ONCE variants to prevent a remote
> attack from overflowing the server's root partition.
>
>

This is what Documentation/process/coding-style.rst has to say:

Do not WARN lightly
*******************

WARN*() is intended for unexpected, this-should-never-happen situations.
WARN*() macros are not to be used for anything that is expected to happen
during normal operation. These are not pre- or post-condition asserts, for
example. Again: WARN*() must not be used for a condition that is expected
to trigger easily, for example, by user space actions. pr_warn_once() is a
possible alternative, if you need to notify the user of a problem.

---

But it's not even that - I find that syzbot and other testers treat any WARN_ON
as a bug (as they should according to coding style).
This WARN_ON in nfsd is really easy to trigger from userspace and from
malicious nfs client.
If you do not replace this WARN_ON, I anticipate that the day will come when
protocol fuzzers will start bombing you with bug reports.

If it is "possible" to hit this assertion - it should not be an assertion.

> > I realize the great value of the stack trace that WARN_ON provides in
> > this scenario, but if we include in the warning the operation id and the
> > filesystem sid I think that would be enough information to understand
> > where the unmapped error is coming from.
>
> Hm. The stack trace tells us all that without having to add the extra
> (rarely used) arguments to nfserrno. I'm OK with a stack trace here
> because this is information for developers, who can make sense of it.
> It's not a message that a system admin or user needs to understand.
>
>

It's your call. If you are not bothered by the bug reports.

Thanks,
Amir.





[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux