On Sun, Nov 21, 2021 at 09:37:53PM -0500, J. Bruce Fields wrote:
> On Mon, Nov 22, 2021 at 12:13:08PM +1100, NeilBrown wrote:
> > On Mon, 22 Nov 2021, J. Bruce Fields wrote:
> > > On Sun, Nov 21, 2021 at 07:56:39PM -0500, J. Bruce Fields wrote:
> > > > On Mon, Nov 22, 2021 at 10:50:34AM +1100, NeilBrown wrote:
> > > > > On Thu, 18 Nov 2021, J. Bruce Fields wrote:
> > > > > > On Wed, Nov 17, 2021 at 11:46:49AM +1100, NeilBrown wrote:
> > > > > > > I have a dream of making nfsd threads start and stop dynamically.
> > > > > >
> > > > > > It's a good dream!
> > > > > >
> > > > > > I haven't had a chance to look at these at all yet, I just kicked off
> > > > > > tests to run overnight, and woke up to the below.
> > > > > >
> > > > > > This happened on the client, probably the first time it attempted to do
> > > > > > an nfsv4 mount, so something went wrong with setup of the callback
> > > > > > server.
> > > > >
> > > > > I cannot reproduce this and cannot see any way it could possibly happen.
> > > >
> > > > Huh.  Well, it's possible I mixed up the results somehow.  I'll see if I
> > > > can reproduce tonight or tomorrow.
> > > >
> > > > > Could you please confirm the patches were applied on a vanilla 5.16-rc1
> > > > > kernel, and that you don't have the "pool_mode" module parameter set.
> > > >
> > > > /sys/module/sunrpc/parameters/pool_mode is "global", the default.
> > >
> > > Oh, and yes, this is what I was testing, should just be 5.16-rc1 plus
> > > your 14 patches:
> > >
> > > http://git.linux-nfs.org/?p=bfields/linux-topics.git;a=shortlog;h=659e13af1f8702776704676937932f332265d85e
>
> OK, tried again and it did indeed reproduce in the same spot.
>
> > I did find a possible problem.  Very first patch.
> > In fs/nfsd/nfsctl.c, in _write_ports_addfd():
> >
> >     if (!err && !nn->nfsd_serv->sv_nrthreads && !xchg(&nn->keep_active, 1))
> >
> > should be "err >= 0" rather than "!err".  That could result in a
> > use-after-free, which can make anything explode.
> > If not too much trouble, could you just tweak that line and see what
> > happens?
>
> Like the following?  Same divide-by-zero, I'm afraid.

Hm, playing with the reproducer; it takes more than one mount.  My
simplest reproducer is:

    mount -overs=3 server:/path /mnt/
    umount /mnt/
    mount -overs=4.0 server:/path /mnt/

... and the client crashes here.

--b.