Re: refcount underflow in nfsd41_destroy_cb

Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
> Jan, how are you reproducing this?

It seems to happen on the server under high NFS load, about a day
after boot (as I noticed from the "last -x" output below). After that
the system runs fine for about a month, until it is rebooted for a new
kernel (not always) or for some similar reason.

We have some NFS-rooted machines:
/systemd on / type nfs4 (rw,relatime,vers=4.2,rsize=4096,wsize=4096,namlen=255,hard,proto=tcp,
	timeo=10,retrans=2,sec=sys,clientaddr=192.168.1.18,local_lock=none,addr=192.168.1.1)

The server has a 10Gb Aquantia AQC107 card connected to a Mikrotik CSS326
switch. The clients, which run distcc (besides acting as workstations),
are connected over 1Gb Ethernet. The server runs Gentoo Linux with OpenRC
(the workstations use systemd) and recent gcc-9.3, binutils-2.34 and
glibc-2.30; it has 32 GB of RAM and an AMD Phenom II X6 1090T CPU.

/var/tmp/portage, where compilation takes place, is normally on the
client's tmpfs, but when there is not enough space to compile a huge
program, I switch it to an NFS export from the server
(/etc/exports opts: -rw,async,no_root_squash,no_subtree_check).

# Output of "grep nfs.*destroy /var/log/messages" interleaved with "last -x"

reboot   system boot  5.5.1-gentoo     Mon Feb  3 00:20 - 15:22 (25+15:01)
Feb  4 17:44:39 agro kernel:  nfsd41_destroy_cb+0x2c/0x40 [nfsd]
	rust compilation, kernel 5.5.1-gentoo

reboot   system boot  5.5.6-gentoo     Fri Feb 28 15:23 - 16:25 (14+01:02)
Feb 29 13:51:49 agro kernel:  nfsd41_destroy_cb+0x2c/0x40 [nfsd]
	rust compilation, kernel 5.5.6-gentoo

reboot   system boot  5.5.9-gentoo     Fri Mar 13 16:27 - 00:04 (4+07:36)
Mar 14 18:03:49 agro kernel:  nfsd41_destroy_cb+0x2c/0x40 [nfsd]
	libpciaccess compilation, kernel 

reboot   system boot  5.6.0-rc6        Wed Mar 18 00:06 - 20:39 (2+20:32)
Mar 19 11:08:07 agro kernel:  nfsd41_destroy_cb+0x36/0x50 [nfsd]
	linux-firmware merge
*
reboot   system boot  5.6.0-rc6        Fri Mar 20 20:40 - 02:40  (05:59)
Mar 20 21:43:34 agro kernel:  nfsd41_destroy_cb+0x36/0x50 [nfsd]
	zstd compilation
*
reboot   system boot  5.6.0-rc6        Sat Mar 21 02:42   still running
Mar 21 17:34:43 agro kernel:  nfsd41_destroy_cb+0x36/0x50 [nfsd]
	nodejs compilation

* - I noticed the kernel fault while looking for the reason why WireGuard
refused to connect to _some_ remote peers, so I rebooted the server, and
that helped.
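
To make the uptime before each fault easier to compare, here is a small
sketch that computes the interval between each boot and the first
nfsd41_destroy_cb line that follows it. The timestamps are copied from the
listing above; the year 2020 is an assumption (it is not in the logs, but
it makes the weekdays match), and the names EVENTS and
uptime_before_fault are only illustrative.

```python
from datetime import datetime

# Assumption: the logs carry no year; 2020 makes the weekdays match
# (e.g. Feb 3 is a Monday) and allows for Feb 29.
YEAR = 2020

# (boot time from "last -x", fault time from /var/log/messages)
EVENTS = [
    ("Feb 3 00:20", "Feb 4 17:44:39"),
    ("Feb 28 15:23", "Feb 29 13:51:49"),
    ("Mar 13 16:27", "Mar 14 18:03:49"),
    ("Mar 18 00:06", "Mar 19 11:08:07"),
    ("Mar 20 20:40", "Mar 20 21:43:34"),
    ("Mar 21 02:42", "Mar 21 17:34:43"),
]


def parse(stamp, fmt):
    """Parse a log timestamp, prepending the assumed year."""
    return datetime.strptime(f"{YEAR} {stamp}", f"%Y {fmt}")


def uptime_before_fault(boot, fault):
    """Return the time elapsed between a boot and the following fault."""
    return parse(fault, "%b %d %H:%M:%S") - parse(boot, "%b %d %H:%M")


for boot, fault in EVENTS:
    print(f"boot {boot:<13} fault {fault:<16} "
          f"uptime {uptime_before_fault(boot, fault)}")
```

Running it prints one line per boot with the elapsed time as a timedelta,
so the intervals can be compared side by side.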



