On Mon, 16 Mar 2015 11:28:10 -0700 Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote: > On Mon, Mar 16, 2015 at 11:58:45AM -0400, J. Bruce Fields wrote: > > > 3240 list_add(&stp->st_perstateowner, &oo->oo_owner.so_stateids); > > > 3241 spin_lock(&fp->fi_lock); > > > 3242 list_add(&stp->st_perfile, &fp->fi_stateids); > > > > I assume you're testing only NFS v4.1? > > Exactly. I'm testing with a version of this patch applied to force 4.1: > > http://git.infradead.org/users/hch/pnfs.git/commitdiff/72ef9b95aaed593ac061bb380bc27ced4fd67b4b I've been using 4.2, fwiw... So far, this bug has turned out to be pretty elusive, I've looked all over the so_count handling and I don't see anywhere that we've got a refcount imbalance. I must be missing something, but it all looks right AFAICT. I've also instrumented the code to look for 0->1 transitions on the so_count, and that hasn't fired. I also looked to see whether Bruce's hunch about the nfsd4_find_existing_open thing might be a problem with the sc_count going 0->1 since we're taking a reference there w/o holding the cl_lock. I haven't seen that happen either. Mostly when I see this bug without memory poisoning enabled, it manifests itself as list corruption in one of the stateowner lists during nfsd4_close. Curiously, the attached patch seems to make that problem go away, but the generic/011 test seems to fail most of the time with this in the log: Cannot chdir out of pid directory: Stale file handle ...but if I turn up poisoning of the nfsd4_openowners slab then I get an oops similar to what HCH is seeing. -- Jeff Layton <jeff.layton@xxxxxxxxxxxxxxx>
>From ebb095adcc65f596f75b85f54e32591cda5d5cde Mon Sep 17 00:00:00 2001 From: Jeff Layton <jeff.layton@xxxxxxxxxxxxxxx> Date: Sat, 21 Mar 2015 09:20:53 -0400 Subject: [PATCH] nfsd: poison so_owner.data before freeing it Signed-off-by: Jeff Layton <jeff.layton@xxxxxxxxxxxxxxx> --- fs/nfsd/nfs4state.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c index d2f2c37dc2db..7ba3fd19caeb 100644 --- a/fs/nfsd/nfs4state.c +++ b/fs/nfsd/nfs4state.c @@ -913,6 +913,7 @@ static void nfs4_put_stateowner(struct nfs4_stateowner *sop) return; sop->so_ops->so_unhash(sop); spin_unlock(&clp->cl_lock); + memset(sop->so_owner.data, 0x7c, sop->so_owner.len); kfree(sop->so_owner.data); sop->so_ops->so_free(sop); } -- 2.1.0