On 2025-02-18 at 21:35:42, Maloney, Bryan wrote:
> Good point on the POSIX compliance. I'd like to call out that this
> behavior of re-opening the file during NFSv4 state recovery is
> according to the NFS spec. So this scenario isn't something specific
> to just this filesystem. I think it comes down to NFS not being fully
> POSIX compliant in all situations.

I haven't read the NFS spec, so I can't speak to that, but I suspect
it's entirely possible to have the NFS server paper over this problem
during state recovery, which is what I would recommend here. That might
require an in-kernel NFS server (which Linux has) or some sort of
shenanigans under the hood of a userspace server (e.g., temporarily
changing the permissions of the file but exposing the existing
permissions to clients[0]), but it should be possible to do. I can
imagine doing this without a problem in 9P and SFTP (which I have
implemented), for instance.

In general, I'm loath to support a file server that's going to
spontaneously decide to produce EBADF in the middle of operating on a
file for any reason, since that's asking for a bunch of hard-to-fix
breakage. That also exposes a huge race condition where we thought we
had a valid file descriptor, but it got closed for some reason and then
another thread opened a new file and got assigned the same number, and
now we're writing to a file we didn't expect. That will very likely end
up with repository corruption, which would be really bad.

As Peff said, it's possible to work around this particular problem, but
I'm concerned we'll find more weird edge cases that will break and that
it will lead to data loss for users if we tolerate the NFS server just
producing an EBADF at a moment's notice.

[0] This is grossly oversimplified and has a lot of edge cases, but I
can imagine how I'd go about it. It also depends on how you're storing
the files and a lot of other factors.
-- 
brian m. carlson (they/them or he/him)
Toronto, Ontario, CA
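
As a rough illustration of the descriptor-reuse hazard described above,
here is a minimal single-threaded sketch in Python. It is not Git code
and does not involve NFS at all; the function name and file names are
made up for the example. It relies only on the POSIX rule that open()
returns the lowest-numbered free descriptor, so a number that was just
closed is handed straight back to the next open(). In a real program the
second open() would happen on another thread, but the effect on a stale
writer would be the same.

```python
import os
import tempfile


def demonstrate_fd_reuse():
    """Show a stale descriptor number silently pointing at another file.

    Hypothetical helper for illustration only.
    """
    with tempfile.TemporaryDirectory() as tmp:
        expected = os.path.join(tmp, "expected")
        unrelated = os.path.join(tmp, "unrelated")

        # We open the file we intend to write to and remember its number.
        fd_a = os.open(expected, os.O_WRONLY | os.O_CREAT, 0o600)

        # The descriptor goes away behind our back (simulated here by an
        # explicit close; in the scenario above it would be unexpected).
        os.close(fd_a)

        # Some other code opens a different file. POSIX hands out the
        # lowest free number, which is the one that was just released.
        fd_b = os.open(unrelated, os.O_WRONLY | os.O_CREAT, 0o600)

        # The stale writer still holds the old number and writes through
        # it. No error is raised; the data lands in the wrong file.
        os.write(fd_a, b"data meant for 'expected'\n")
        os.close(fd_b)

        with open(expected, "rb") as f:
            content_a = f.read()
        with open(unrelated, "rb") as f:
            content_b = f.read()

    return fd_a, fd_b, content_a, content_b


if __name__ == "__main__":
    fd_a, fd_b, content_a, content_b = demonstrate_fd_reuse()
    print("same descriptor number:", fd_a == fd_b)
    print("bytes in intended file:", len(content_a))
    print("bytes in unrelated file:", len(content_b))
```

The write through the stale number succeeds without any error, which is
why this kind of bug tends to show up as corruption rather than as a
clean failure.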