> On Mar 21, 2020, at 11:43 PM, Jan Psota <jasiu@xxxxxxxxxxxx> wrote:
>
> Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
>> Jan, how are you reproducing this?
>
> It looks like it's taking place on server on high NFS load and about
> a day after boot! (as I noticed looking into last -x results, below)
> Then system runs all right for a month (to be rebooted on new kernel
> [not always] or something like this).
>
> We have some NFS-rooted machines:
> /systemd on / type nfs4 (rw,relatime,vers=4.2,rsize=4096,wsize=4096,namlen=255,hard,proto=tcp,
> timeo=10,retrans=2,sec=sys,clientaddr=192.168.1.18,local_lock=none,addr=192.168.1.1)
>
> Server has 10Gb Aquantia AQC107 card connected to Mikrotik CSS326
> switch. Clients running distcc (aside from acting as workstations)
> are connected on 1Gb ethernet. Server runs Gentoo Linux on OpenRC
> (stations have Systemd) with recent gcc-9.3, binutils-2.34 and
> glibc-2.30, has 32 GB RAM and AMD Phenom II X6 1090T CPU.
>
> /var/tmp/portage, where compilation takes place, normally is on client
> tmpfs, but when there is not enough space to compile huge program, I
> switch it to server exported NFS
> (/etc/exports opts: -rw,async,no_root_squash,no_subtree_check)
>
> # "grep nfs.*destroy /var/log/messages" mixed with "last -x"

I thought I read in the initial report that you were seeing this
problem only on v5.6-rc6. What is the earliest kernel release where
you saw refcount UaF warnings from nfsd4_destroy_cb?

> reboot system boot 5.5.1-gentoo Mon Feb 3 00:20 - 15:22 (25+15:01)
> Feb 4 17:44:39 agro kernel: nfsd41_destroy_cb+0x2c/0x40 [nfsd]
> rust compilation, kernel 5.5.1-gentoo
>
> reboot system boot 5.5.6-gentoo Fri Feb 28 15:23 - 16:25 (14+01:02)
> Feb 29 13:51:49 agro kernel: nfsd41_destroy_cb+0x2c/0x40 [nfsd]
> rust compilation, kernel 5.5.6-gentoo
>
> reboot system boot 5.5.9-gentoo Fri Mar 13 16:27 - 00:04 (4+07:36)
> Mar 14 18:03:49 agro kernel: nfsd41_destroy_cb+0x2c/0x40 [nfsd]
> libpciaccess compilation, kernel
>
> reboot system boot 5.6.0-rc6 Wed Mar 18 00:06 - 20:39 (2+20:32)
> Mar 19 11:08:07 agro kernel: nfsd41_destroy_cb+0x36/0x50 [nfsd]
> linux-firmware merge
> *
> reboot system boot 5.6.0-rc6 Fri Mar 20 20:40 - 02:40 (05:59)
> Mar 20 21:43:34 agro kernel: nfsd41_destroy_cb+0x36/0x50 [nfsd]
> zstd compilation
> *
> reboot system boot 5.6.0-rc6 Sat Mar 21 02:42 still running
> Mar 21 17:34:43 agro kernel: nfsd41_destroy_cb+0x36/0x50 [nfsd]
> nodejs compilation
>
> * - I noticed kernel fault looking for a reason, why WireGuard refused
> to connect with _some_ remote peers so I rebooted the server and it helped.

--
Chuck Lever