On Wed, Jan 22, 2025 at 4:04 PM Chuck Lever <chuck.lever@xxxxxxxxxx> wrote: > > On 1/22/25 4:05 AM, Amir Goldstein wrote: > > On Tue, Jan 21, 2025 at 11:59 PM NeilBrown <neilb@xxxxxxx> wrote: > >> > >> On Wed, 22 Jan 2025, Amir Goldstein wrote: > >>> On Tue, Jan 21, 2025 at 8:45 PM Chuck Lever <chuck.lever@xxxxxxxxxx> wrote: > >>>> > >>>> Please send patches To: the NFSD reviewers listed in MAINTAINERS and > >>>> Cc: linux-nfs and others. Thanks! > >>>> > >>>> > >>>> On 1/21/25 5:39 AM, Amir Goldstein wrote: > >>>>> Commit 466e16f0920f3 ("nfsd: check for EBUSY from vfs_rmdir/vfs_unink.") > >>>>> mapped EBUSY host error from rmdir/unlink operation to avoid unknown > >>>>> error server warning. > >>>> > >>>>> The same reason that casued the reported EBUSY on rmdir() (dir is a > >>>>> local mount point in some other bind mount) could also cause EBUSY on > >>>>> rename and some filesystems (e.g. FUSE) can return EBUSY on other > >>>>> operations like open(). > >>>>> > >>>>> Therefore, to avoid unknown error warning in server, we need to map > >>>>> EBUSY for all operations. > >>>>> > >>>>> The original fix mapped EBUSY to NFS4ERR_FILE_OPEN in v4 server and > >>>>> to NFS4ERR_ACCESS in v2/v3 server. > >>>>> > >>>>> During the discussion on this issue, Trond claimed that the mapping > >>>>> made from EBUSY to NFS4ERR_FILE_OPEN was incorrect according to the > >>>>> protocol spec and specifically, NFS4ERR_FILE_OPEN is not expected > >>>>> for directories. > >>>> > >>>> NFS4ERR_FILE_OPEN might be incorrect when removing certain types of > >>>> file system objects. Here's what I find in RFC 8881 Section 18.25.4: > >>>> > >>>> > If a file has an outstanding OPEN and this prevents the removal of the > >>>> > file's directory entry, the error NFS4ERR_FILE_OPEN is returned. > >>>> > >>>> It's not normative, but it does suggest that any object that cannot be > >>>> associated with an OPEN state ID should never cause REMOVE to return > >>>> NFS4ERR_FILE_OPEN. > >>>> > >>>> > >>>>> To keep things simple and consistent and avoid the server warning, > >>>>> map EBUSY to NFS4ERR_ACCESS for all operations in all protocol versions. > >>>> > >>>> Generally a "one size fits all" mapping for these status codes is > >>>> not going to cut it. That's why we have nfsd3_map_status() and > >>>> nfsd_map_status() -- the set of permitted status codes for each > >>>> operation is different for each NFS version. > >>>> > >>>> NFSv3 has REMOVE and RMDIR. You can't pass a directory to NFSv3 REMOVE. > >>>> > >>>> NFSv4 has only REMOVE, and it removes the directory entry for the > >>>> object no matter its type. The set of failure modes is different for > >>>> this operation compared to NFSv3 REMOVE. > >>>> > >>>> Adding a specific mapping for -EBUSY in nfserrno() is going to have > >>>> unintended consequences for any VFS call NFSD might make that returns > >>>> -EBUSY. > >>>> > >>>> I think I prefer that the NFSv4 cases be dealt with in nfsd4_remove(), > >>>> nfsd4_rename(), and nfsd4_link(), and that -EBUSY should continue to > >>>> trigger a warning. > >>>> > >>>> > >>> > >>> Sorry, I didn't understand what you are suggesting. > > I'm saying that we need to consider the errno -> NFS status code > mapping on a case-by-case basis first. > > > >>> FUSE can return EBUSY for open(). > >>> What do you suggest to do when nfsd encounters EBUSY on open()? > >>> > >>> vfs_rename() can return EBUSY. > >>> What do you suggest to do when nfsd v3 encounters EBUSY on rename()? > > I totally agree that we do not want NFSD to leak -EBUSY to NFS clients. > > But we do need to examine all the ways -EBUSY can leak through to the > NFS protocol layers (nfs?proc.c). The mapping is not going to be the > same for every NFS operation in every NFS version. (or, at least we > need to examine these cases closely and decide that nfserr_access is > the closest we can get for /every/ case). > > > >>> This sort of assertion: > >>> WARN_ONCE(1, "nfsd: non-standard errno: %d\n", errno); > >>> > >>> Is a code assertion for a situation that should not be possible in the > >>> code and certainly not possible to trigger by userspace. > >>> > >>> Both cases above could trigger the warning from userspace. > >>> If you want to leave the warning it should not be a WARN_ONCE() > >>> assertion, but I must say that I did not understand the explanation > >>> for not mapping EBUSY by default to NFS4ERR_ACCESS in nfserrno(). > >> > >> My answer to this last question is that it isn't obvious that EBUSY > >> should map to NFS4ERR_ACCESS. > >> I would rather that nfsd explicitly checked the error from unlink/rmdir and > >> mapped EBUSY to NFS4ERR_ACCESS (if we all agree that is best) with a > >> comment (like we have now) explaining why it is best. > > > > Can you please suggest the text for this comment because I did not > > understand the reasoning for the error. > > All I argued for is conformity to NFSv2/3, but you are the one who chose > > NFS3ERR_ACCES for v2/3 mapping and I don't know what is the > > reasoning for this error code. All I have is: > > "For NFSv3, the best we can do is probably NFS3ERR_ACCES, > > which isn't true, but is not less true than the other options." > > You're proposing to change the behavior of NFSv4 to match NFSv2/3, and > that's where we might need to take a moment. The NFSv4 protocol provides > a richer set of status codes to report this situation. > > To be fair, I did not propose that in patch v1: https://lore.kernel.org/linux-nfs/20250120172016.397916-1-amir73il@xxxxxxxxx/ I proposed to keep the EBUSY -> NFS4ERR_FILE_OPEN mapping for v4 and extend the operations that it applies to. Trond had reservations about his mapping. I have no problem with going back to v1 mapping and reducing the mapped operations to rmdir/unlink/rename/open or any other mapping that you prefer. > >> And nfsd should explicitly check the error from open() and map EBUSY to > >> whatever seems appropriate. Maybe that is also NS4ERR_ACCESS but if it > >> is, the reason is likely different to the reason that it is best for > >> rmdir. > >> So again, I would like a comment in the code explaining the choice with > >> a reference to FUSE. > > > > My specific FUSE filesystem can return EBUSY for open(), but FUSE > > filesystem in general can return EBUSY for any operation if that is what > > the userspace server returns. > > Fair, that suggests that eventually we might need the general nfserrno > mapping in addition to some individual checking in NFS operation- and > version-specific code. I'm not ready to leap to that conclusion yet. > > I am fine with handling EBUSY in unlink/rmdir/rename/open only for now if that is what everyone prefers. > >> Then if some other function that we haven't thought about starts > >> returning EBUSY, we'll get warning and have a change to think about it. > >> > > > > I have no objection to that, but I think that the WARN_ONCE should be > > converted to pr_warn_once() or pr_warn_ratelimited() because userspace > > should not be able to trigger a WARN_ON in any case. > > It isn't user space that's the concern -- it's that NFS clients can > trigger this warning. If a client accidentally or maliciously triggers > it repeatedly, it can fill the NFS server's system journal. > > Our general policy is that we use the _ONCE variants to prevent a remote > attack from overflowing the server's root partition. > > This is what Documentation/process/coding-style.rst has to say: Do not WARN lightly ******************* WARN*() is intended for unexpected, this-should-never-happen situations. WARN*() macros are not to be used for anything that is expected to happen during normal operation. These are not pre- or post-condition asserts, for example. Again: WARN*() must not be used for a condition that is expected to trigger easily, for example, by user space actions. pr_warn_once() is a possible alternative, if you need to notify the user of a problem. --- But it's not even that - I find that syzbot and other testers treat any WARN_ON as a bug (as they should according to coding style). This WARN_ON in nfsd is really easy to trigger from userspace and from malicious nfs client. If you do not replace this WARN_ON, I anticipate that the day will come when protocol fuzzers will start bombing you with bug reports. If it is "possible" to hit this assertion - it should not be an assertion. > > I realize the great value of the stack trace that WARN_ON provides in > > this scenario, but if we include in the warning the operation id and the > > filesystem sid I think that would be enough information to understand > > where the unmapped error is coming from. > > Hm. The stack trace tells us all that without having to add the extra > (rarely used) arguments to nfserrno. I'm OK with a stack trace here > because this is information for developers, who can make sense of it. > It's not a message that a system admin or user needs to understand. > > It's your call. If you are not bothered by the bug reports. Thanks, Amir.