> On Jul 29, 2022, at 1:46 PM, Chuck Lever III <chuck.lever@xxxxxxxxxx> wrote: > > > >> On Jul 29, 2022, at 1:34 PM, Jeff Layton <jlayton@xxxxxxxxxx> wrote: >> >> On Fri, 2022-07-29 at 17:21 +0000, Chuck Lever III wrote: >>> >>>> On Jul 29, 2022, at 12:47 PM, Jeff Layton <jlayton@xxxxxxxxxx> wrote: >>>> >>>> We had a report from the spring Bake-a-thon of data corruption in some >>>> nfstest_interop tests. Looking at the traces showed the NFS server >>>> allowing a v3 WRITE to proceed while a read delegation was still >>>> outstanding. >>>> >>>> Currently, we only set NFSD_FILE_BREAK_* flags if >>>> NFSD_MAY_NOT_BREAK_LEASE was set when we call nfsd_file_alloc. >>>> NFSD_MAY_NOT_BREAK_LEASE was intended to be set when finding files for >>>> COMMIT ops, where we need a writeable filehandle but don't need to >>>> break read leases. >>>> >>>> It doesn't make any sense to consult that flag when allocating a file >>>> since the file may be used on subsequent calls where we do want to break >>>> the lease (and the usage of it here seems to be reverse from what it >>>> should be anyway). >>>> >>>> Also, after calling nfsd_open_break_lease, we don't want to clear the >>>> BREAK_* bits. A lease could end up being set on it later (more than >>>> once) and we need to be able to break those leases as well. >>>> >>>> This means that the NFSD_FILE_BREAK_* flags now just mirror >>>> NFSD_MAY_{READ,WRITE} flags, so there's no need for them at all. Just >>>> drop those flags and unconditionally call nfsd_open_break_lease every >>>> time. >>>> >>>> Link: https://bugzilla.redhat.com/show_bug.cgi?id=2107360 >>>> Fixes: 65294c1f2c5e (nfsd: add a new struct file caching facility to nfsd) >>>> Reported-by: Olga Kornieskaia <kolga@xxxxxxxxxx> >>>> Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx> >>> >>> I'm going to go out on a limb and predict this will conflict >>> heavily with the filecache overhaul patches I have queued for >>> next. :-) >>> >>> Do you believe this is something that urgently needs to be >>> backported to stable kernels, or can it be rebased on top of >>> the filecache overhaul work? >>> >>> >> >> I based this on top of your for-next branch and I think the filecache is >> already in there. >> >> It's a pretty nasty bug that we probably will want backported, so it >> might make sense to respin this on top of mainline and put it in ahead >> of the filecache overhaul. > > I am a generally a proponent of enabling fix backports. > > I encourage you to test the respin on 5.19 and 5.18 at least > because the moment that patch hits upstream, Sasha and Greg > will pull it into stable. I don't relish the idea of having > to fix the fix, if you catch my drift. > > And perhaps when you repost, the fix should be reordered > before the patches that add the tracepoints. Well... let's rebase the fix, but the tracepoint patches can go in after the filecache overhaul, I think. That should make it a little easier. -- Chuck Lever