Re: Fwd: nfs v4.2 leaking file descriptors

On Mon, Apr 15, 2019 at 06:00:56PM +0100, Bruno Santos wrote:
> We have a Debian stretch HPC cluster (#1 SMP Debian 4.9.130-2
> (2018-10-27)). One of the machines mounts a couple of drives from a
> Dell Compellent system and shares them across a 10GB network to 4
> different machines.
> 
> We had the nfs server crashing a few weeks ago because the file-max
> limit had been reached. At the time we increased the number of file
> handles it can handle and have been monitoring since. We have noticed
> that the number of entries on that machine keeps increasing, though,
> and despite our best efforts we have been unable to identify the
> cause. Anything I can find related to this is from a well-known bug
> in 2011 and nothing afterwards. We are assuming this is caused by a
> leak of file handles on the nfs side, but we are not sure.
> 
> Does anyone have any way of figuring out what is causing this? Output
> from file-nr, lsof, etc. is below.

Off the top of my head, the only idea I have is to try watching

	grep nfsd4 /proc/slabinfo

and see if any of those objects are also leaking.
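
To line those numbers up with the file-nr samples, something like the
following might help. It is only a sketch: the function names are made
up for illustration, and it assumes the usual slabinfo column order
(name, active objects, total objects) and that it runs as root, since
/proc/slabinfo is normally not readable otherwise.

```shell
# Print name, active objects and total objects for every nfsd4 slab cache.
# Reads slabinfo-format text from the file given as $1
# (default: /proc/slabinfo). Hypothetical helper, not an existing tool.
nfsd4_slabs() {
    awk '/^nfsd4/ { printf "%s active=%s total=%s\n", $1, $2, $3 }' \
        "${1:-/proc/slabinfo}"
}

# Sample every 30 seconds, the same interval as the file-nr loop,
# so the two logs can be compared side by side. Runs until interrupted.
watch_nfsd4_slabs() {
    while :; do
        date
        nfsd4_slabs
        sleep 30
    done
}
```

Running watch_nfsd4_slabs next to the file-nr loop should show whether
any nfsd4 cache grows at roughly the same rate as the allocated
file handles.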

--b.

> 
> Thank you very much for any help you can provide.
> 
> Best regards,
> Bruno Santos
> 
> :~# while :; do echo "$(date): $(cat /proc/sys/fs/file-nr)"; sleep 30; done
> Mon 15 Apr 17:23:11 BST 2019: 2466176   0       4927726
> Mon 15 Apr 17:23:41 BST 2019: 2466176   0       4927726
> Mon 15 Apr 17:24:11 BST 2019: 2466336   0       4927726
> Mon 15 Apr 17:24:41 BST 2019: 2466240   0       4927726
> Mon 15 Apr 17:25:11 BST 2019: 2466560   0       4927726
> Mon 15 Apr 17:25:41 BST 2019: 2466336   0       4927726
> Mon 15 Apr 17:26:11 BST 2019: 2466400   0       4927726
> Mon 15 Apr 17:26:41 BST 2019: 2466432   0       4927726
> Mon 15 Apr 17:27:11 BST 2019: 2466688   0       4927726
> Mon 15 Apr 17:27:41 BST 2019: 2466624   0       4927726
> Mon 15 Apr 17:28:11 BST 2019: 2466784   0       4927726
> Mon 15 Apr 17:28:41 BST 2019: 2466688   0       4927726
> Mon 15 Apr 17:29:11 BST 2019: 2466816   0       4927726
> Mon 15 Apr 17:29:42 BST 2019: 2466752   0       4927726
> Mon 15 Apr 17:30:12 BST 2019: 2467072   0       4927726
> Mon 15 Apr 17:30:42 BST 2019: 2466880   0       4927726
> 
> ~# lsof|wc -l
> 3428
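
For reading the samples above: the three fields of
/proc/sys/fs/file-nr are the number of allocated file handles, the
number of allocated but unused handles, and the system-wide maximum
(file-max). lsof only lists files held open by processes, so handles
held inside the kernel would not be counted there. A small sketch for
turning one sample into a usage figure (the function name is
hypothetical, and the percentage is rounded down by integer division):

```shell
# Summarise one file-nr sample: "<allocated> <unused> <max>".
# Reads from the file given as $1 (default: /proc/sys/fs/file-nr).
# Hypothetical helper for illustration only.
file_nr_summary() {
    read -r allocated unused max < "${1:-/proc/sys/fs/file-nr}"
    echo "allocated=$allocated unused=$unused max=$max used=$(( allocated * 100 / max ))%"
}
```

Logging this instead of the raw line makes it easier to see how close
the server is to file-max.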


