> On Jul 13, 2022, at 3:13 PM, Jeff Layton <jlayton@xxxxxxxxxx> wrote: > > On Wed, 2022-07-13 at 14:15 +0000, Chuck Lever III wrote: >> >> >> Just to be clear, I'm referring to this issue: >> >>>> NOTE: when nfsd4_open() creates a file, the directory does *NOT* remain >>>> locked and never has. So it is possible (though unlikely) for the >>>> newly created file to be renamed before a delegation is handed out, >>>> and that would be bad. This patch does *NOT* fix that, but *DOES* >>>> take the directory lock immediately after creating the file, which >>>> reduces the size of the window and ensure that the lock is held >>>> consistently. More work is needed to guarantee no rename happens >>>> before the delegation. >>> >>> Interesting. Maybe after taking the lock, we could re-vet the dentry vs. >>> the info in the OPEN request? That way, we'd presumably know that the >>> above race didn't occur. > > I usually like to see bugfixes first too, but I haven't heard of anyone > ever hitting this problem in the field. We've lived with this bug for a > long time, so I don't see a problem with cleaning up the locking first > and then fixing this on top of that. > > That said, if we're concerned about it, we could just add an extra check > to nfs4_set_delegation. Basically, take parent mutex, set the > delegation, walk the inode->i_dentry list and vet that one of them > matches the OPEN args. That change probably wouldn't be too invasive and > should be backportable, but it means taking that mutex twice. > > The alternate idea would be to try to set the delegation a lot earlier, > while we still hold the parent mutex for the open. That makes the > cleanup trickier, but is potentially more efficient. I think it'd be > simpler to implement this on top of Neil's patchset since it simplifies > the locking. Backporting that by itself is probably going to be painful > though. > > What should we aim for here? If there is consensus that a fix for a possible delegation/rename race is not necessary in stable kernels, then there is a little more maneuverability -- we can shoot for what is ideal moving forward. Again, my main concerns are not perturbing the legacy code if we don't have to, and having the NFSv4 code spell out its locking requirements clearly. After that, performance is important, so avoiding taking locks and mutexes (mutices?) unnecessarily would be best. -- Chuck Lever