Hi All,
I have an NFS server that comes under heavy load from time to time.
Today several clients were unable to mount the export, and other
machines that already had it mounted logged "kernel: nfs: server
someserver not responding, timed out" when reading or writing.
There are no error logs being produced on the server, and only these
messages below on the client.
I presume this is a load issue, and I would be keen to identify which
component is responsible so I can either upgrade or move some files
elsewhere.
There appears to be a kernel upgrade available in the repos, and some
have suggested booting with the kernel options noapic and apic=off,
which I am going to try tomorrow.
Any suggestions on how to determine the choking point on this server?
Many Thanks
Tom
# uptime
04:59:17 up 97 days, 19:29, 2 users, load average: 3.73, 3.47, 3.37
# free -m
             total       used       free     shared    buffers     cached
Mem:         12007      11944         63          0       1308       9645
-/+ buffers/cache:        990      11017
Swap:        12095          0      12095
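As a rough cross-check of the memory figures, the sketch below subtracts
buffers and cache from "used" to estimate what applications actually
hold. It assumes the older free -m layout shown here, with separate
buffers and cached columns; app_used_mb is a made-up helper name.

```shell
# Sketch only: estimate application memory (MB) from old-style `free -m`
# output read on stdin. Assumes the column order
#   total used free shared buffers cached
# app_used_mb is a hypothetical name used for illustration.
app_used_mb() {
    # On the "Mem:" line, $3 = used, $6 = buffers, $7 = cached.
    awk '$1 == "Mem:" { print $3 - $6 - $7 }'
}

# Usage on the server:
#   free -m | app_used_mb
```

Fed the Mem: line above, this prints 991 (11944 - 1308 - 9645), which
agrees with the 990 MB on the -/+ buffers/cache line to within rounding.
Nearly all of the "used" memory is therefore page cache.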
On the clients I am seeing:
# mount -vv -t nfs -o soft servername:/mount/processed34 /mount/processed34
mount: trying 198.nn.nn.nn prog 100003 vers 3 prot tcp port 2049
mount: mount to NFS server servername failed: timed out (retrying).
mount: trying 198.nn.nn.nn prog 100003 vers 3 prot tcp port 2049
mount: mount to NFS server 'servername' failed: timed out (giving up).
And on the clients that already have the filesystem mounted:
kernel: nfs: server someserver not responding, timed out
I am seeing errors like this in /var/log/messages:
# grep BUG messages
Jul 23 03:18:21 servername kernel: BUG: soft lockup - CPU#5 stuck for 11s! [nfsd:20540]
Jul 23 03:22:09 servername kernel: BUG: soft lockup - CPU#1 stuck for 15s! [nfsd:20545]
(repeated many times)
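To see whether the lockups cluster on particular CPUs or nfsd threads,
something like the following could tally them. This is only a sketch: it
assumes each message occupies a single line in the log file, and
count_lockups is a hypothetical helper name.

```shell
# Sketch only: tally "BUG: soft lockup" messages per CPU and task.
# Reads syslog lines on stdin and prints "<count> <CPU#n> <[task:pid]>",
# most frequent first. count_lockups is a hypothetical name.
count_lockups() {
    awk '
        /BUG: soft lockup/ {
            cpu = ""; task = ""
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^CPU#[0-9]+$/) cpu = $i    # e.g. CPU#5
                if ($i ~ /^\[.*\]$/)     task = $i   # e.g. [nfsd:20540]
            }
            count[cpu " " task]++
        }
        END { for (k in count) print count[k], k }
    ' | sort -rn
}

# Usage on the server:
#   count_lockups < /var/log/messages
```

If one CPU or one nfsd thread dominates the list, that would be a hint
worth following up; an even spread would point more towards general load.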
Retransmission stats from a client that was having problems:
# nfsstat -c -3
Client rpc stats:
calls      retrans    authrefrsh
25536750   209        0
Client nfs v3:
null          getattr       setattr       lookup        access        readlink
0          0% 1467584    6% 2702       0% 16868358  69% 746134     3% 0          0%
read          write         create        mkdir         symlink       mknod
337417     1% 683048     2% 24049      0% 91         0% 0          0% 0          0%
remove        rmdir         rename        link          readdir       readdirplus
10923      0% 86         0% 4844       0% 0          0% 158020     0% 3805702   15%
fsstat        fsinfo        pathconf      commit
156        0% 24         0% 0          0% 25625      0%
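The retransmission count is easier to judge as a percentage of calls. A
minimal sketch, assuming the calls/retrans/authrefrsh layout printed
above; retrans_pct is a hypothetical helper name.

```shell
# Sketch only: print RPC retransmissions as a percentage of calls.
# Reads `nfsstat -c` output on stdin and assumes a header line starting
# with "calls retrans", followed by a line holding the counters.
# retrans_pct is a hypothetical name.
retrans_pct() {
    awk '
        $1 == "calls" && $2 == "retrans" {
            # The counters are on the line right after the header.
            if ((getline) > 0 && $1 > 0)
                printf "%.4f\n", 100 * $2 / $1
            exit
        }
    '
}

# Usage on a client:
#   nfsstat -c | retrans_pct
```

With the counters above (209 retransmissions in 25536750 calls) this
prints 0.0008, which is fewer than one retransmission per 100,000 calls
on this client.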
Server NFS stats:
# nfsstat -s -3
Server rpc stats:
calls      badcalls   badauth    badclnt    xdrcall
342989209  1397       1397       0          0
Server nfs v3:
null          getattr       setattr       lookup        access        readlink
4169       0% 1951757    0% 1241244    0% 68875972  20% 2180568    0% 0          0%
read          write         create        mkdir         symlink       mknod
18863      0% 144562881 42% 61482189  17% 456758     0% 0          0% 0          0%
remove        rmdir         rename        link          readdir       readdirplus
8119668    2% 35533      0% 1          0% 0          0% 46596      0% 1805329    0%
fsstat        fsinfo        pathconf      commit
545        0% 3379       0% 2          0% 51641657  15%