NFS server misbehaving (nfsd eats CPU and returns no data)

Hello.

I'm having a problem with my NFS share. It has been present for some
time now. Kernel versions have been upgraded and the setup has changed
since then, but because I can't remember for sure when it started, I'll
just describe my current configuration. But first, the problem itself.

Periodically (I would say every GB of data or so) reads from the NFS
share hang. During a hang, the nfsd kernel thread on the server eats CPU
but no data gets sent to the client. After a minute everything returns
to normal (no action on my part is required). The first thing I did to
debug this was to enable verbose output in all userspace daemons on both
client and server; it produced no output whatsoever during the
problematic period. Next I dumped network traffic on TCP port 2049 on
both client and server. There were no packet drops or anything else
strange, except that the client restarted the TCP connection to port
2049 after a minute of silence from the server (which resulted in data
flowing again). This was confirmed by kernel debug output from the
client (echo 65535 | tee nfs_debug nfsd_debug nlm_debug rpc_debug): the
NFS client sent the server a READ request with a 60-second timeout, the
timeout was reached, and the NFS TCP connection was dropped and
restarted. So this points to the NFS server kernel code. Kernel debug
output on the server is quite large and spikes during the hangs. I've
attached a version of it, deduplicated by hand, to this email. I
couldn't find anything strange in there, but I don't understand most of
it anyway.
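
For anyone who wants to reproduce the kernel debug capture, the tee
command above can be wrapped roughly as follows. This is only a sketch:
the function name set_sunrpc_debug and its optional directory argument
are made up for illustration, and it assumes the flag files live in
/proc/sys/sunrpc, which is where they normally are on Linux.

```shell
#!/bin/sh
# Write a debug bitmask to each sunrpc debug flag file.
# $1: mask (65535 sets all bits, 0 switches debugging off again)
# $2: directory holding the flag files. Defaults to /proc/sys/sunrpc;
#     the override is an assumption of this sketch, there only so the
#     function can be tried on a scratch directory without root.
set_sunrpc_debug() {
    mask=$1
    dir=${2:-/proc/sys/sunrpc}
    for flag in nfs_debug nfsd_debug nlm_debug rpc_debug; do
        # Skip flags this kernel does not expose rather than create files.
        [ -e "$dir/$flag" ] || continue
        echo "$mask" > "$dir/$flag" || return 1
    done
}
```

Run as root, "set_sunrpc_debug 65535" before reproducing the hang and
"set_sunrpc_debug 0" afterwards should be equivalent to what I did by
hand. The output ends up in the kernel log.
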

My current setup: both server and client run Linux 3.3.3 and use NFSv4
with sec=krb5 over the local network 192.168.0.0/24 with no firewalls
(the client has iptables disabled in the kernel, the server ACCEPTs
everything from the internal interface). The client uses Wi-Fi and the
server uses Ethernet with VLANs, so traffic goes through an AP. But
since the network dumps on server and client are the same, the network
configuration is IMHO irrelevant; I only include it for completeness.
The most common usage (and test case) of this NFS share is watching
videos with mplayer. The underlying filesystem is XFS (though ext4 is
used for other shares on the same server).
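
A more repeatable test case than mplayer could look like the sketch
below. It is not something I have run against the failing setup; the
function name find_read_stalls and the 5 second default threshold are
assumptions chosen for illustration. It reads a file in 1 MiB chunks and
reports every chunk that took suspiciously long, with one-second
resolution, which is coarse but enough for hangs that last a minute.

```shell
#!/bin/sh
# Read FILE in 1 MiB chunks and print the offset (in MiB) of every chunk
# whose read took at least THRESHOLD seconds.
# $1: file to read, e.g. a large video on the NFS mount
# $2: stall threshold in whole seconds (default 5, an arbitrary choice)
find_read_stalls() {
    file=$1
    threshold=${2:-5}
    size=$(wc -c < "$file") || return 1
    # Number of 1 MiB chunks, rounding up for a partial last chunk.
    chunks=$(( (size + 1048575) / 1048576 ))
    i=0
    while [ "$i" -lt "$chunks" ]; do
        start=$(date +%s)
        dd if="$file" of=/dev/null bs=1048576 skip="$i" count=1 \
            2>/dev/null || return 1
        elapsed=$(( $(date +%s) - start ))
        if [ "$elapsed" -ge "$threshold" ]; then
            echo "stall at ${i} MiB: ${elapsed}s"
        fi
        i=$((i + 1))
    done
}
```

Note that data already in the client's page cache is served locally, so
the file should not have been read recently on the client, otherwise no
request reaches the server at all.
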

I'm ready to provide additional information or test patches, since this
problem is quite annoying (and IMHO has gotten worse over time).

Attachment: nfs_debug.syslog
Description: Binary data

