Re: [NFS] NFS performance debugging

On Mon, 2008-06-23 at 16:59 +0200, Adrian von Bidder wrote:
> Hi,
> 
> Environment:
> 
> several Debian based clients (Debian etch and etchnhalf kernels, this means 
> 2.6.18 or 2.6.24); Debian etch (2.6.18 kernel) NFS (v3) server.  Network 
> seems basically ok ("ping -f -s 3000" works without losses, ifconfig and 
> switch monitoring shows no errors) with no noticeable load.  Disks seem to 
> have very little load either, NFS server has no other tasks.
> 
> Performance is sluggish :-(  Basically works, though -- no spurious errors.
> 
> tcpdump shows many "reply ERR 1448" etc. messages whenever NFS activity is 
> going on (both stat-heavy, as with "find /home", and read/write with dd)
> 
> +++
> 16:49:24.778560 IP 10.0.1.2.2049 > 10.0.0.209.809066834: reply ERR 1448
> 16:49:24.790304 IP 10.0.1.2.2049 > 10.0.0.209.943279929: reply ERR 1448
> 16:49:24.801380 IP 10.0.1.2.2049 > 10.0.0.209.2001885801: reply ERR 1448
> 16:49:24.802173 IP 10.0.1.2.2049 > 10.0.0.209.860835666: reply ERR 1448
> 16:49:24.805286 IP 10.0.1.2.2049 > 10.0.0.209.1479697199: reply ERR 1332
> 16:49:24.807679 IP 10.0.1.2.2049 > 10.0.0.209.1096249460: reply ERR 1448
> 16:49:24.808358 IP 10.0.1.2.2049 > 10.0.0.209.2000902760: reply ERR 1332
> 16:49:24.809097 IP 10.0.1.2.2049 > 10.0.0.209.926298420: reply ERR 1448
> 16:49:24.809100 IP 10.0.1.2.2049 > 10.0.0.209.25105411: reply ERR 1332
> 16:49:24.817923 IP 10.0.1.2.2049 > 10.0.0.209.1366504235: reply ERR 1448
> 16:49:24.817927 IP 10.0.1.2.2049 > 10.0.0.209.352525071: reply ERR 1332
> 16:49:24.820397 IP 10.0.1.2.2049 > 10.0.0.209.269848846: reply ERR 1332
> 16:49:24.822097 IP 10.0.1.2.2049 > 10.0.0.209.1345540144: reply ERR 1448
> 16:49:24.822856 IP 10.0.1.2.2049 > 10.0.0.209.944780599: reply ERR 1448
> 16:49:24.825109 IP 10.0.1.2.2049 > 10.0.0.209.1395668559: reply ERR 1448
> 16:49:24.825112 IP 10.0.1.2.2049 > 10.0.0.209.1999335795: reply ERR 1332
> 16:49:24.827813 IP 10.0.1.2.2049 > 10.0.0.209.1685677906: reply ERR 1332
> 16:49:24.829439 IP 10.0.1.2.2049 > 10.0.0.209.1666084982: reply ERR 1448
> 16:49:24.829443 IP 10.0.1.2.2049 > 10.0.0.209.1415656037: reply ERR 1332
> 16:49:24.839013 IP 10.0.1.2.2049 > 10.0.0.209.911226680: reply ERR 1448
> 16:49:24.839017 IP 10.0.1.2.2049 > 10.0.0.209.1735414852: reply ERR 1332
> 16:49:24.841325 IP 10.0.1.2.2049 > 10.0.0.209.911358287: reply ERR 1332
> 16:49:24.842092 IP 10.0.1.2.2049 > 10.0.0.209.1364284211: reply ERR 1448
> 16:49:24.842800 IP 10.0.1.2.2049 > 10.0.0.209.258643250: reply ERR 1332
> 16:49:24.844256 IP 10.0.1.2.2049 > 10.0.0.209.1666017882: reply ERR 1448
> 16:49:24.844996 IP 10.0.1.2.2049 > 10.0.0.209.808595513: reply ERR 1448
> 16:49:24.845674 IP 10.0.1.2.2049 > 10.0.0.209.2000779112: reply ERR 1448
> 16:49:24.845677 IP 10.0.1.2.2049 > 10.0.0.209.1652175121: reply ERR 1332
> 16:49:24.847120 IP 10.0.1.2.2049 > 10.0.0.209.944722769: reply ERR 1448
> 16:49:24.847123 IP 10.0.1.2.2049 > 10.0.0.209.1682657874: reply ERR 1332
> 16:49:24.849334 IP 10.0.1.2.2049 > 10.0.0.209.944714835: reply ERR 1448
> 16:49:24.850873 IP 10.0.1.2.2049 > 10.0.0.209.1345861938: reply ERR 1448
> 16:49:24.918710 IP 10.0.1.2.2049 > 10.0.0.179.1936680564: reply ERR 1448
> 16:49:24.918719 IP 10.0.1.2.2049 > 10.0.0.179.1698508838: reply ERR 1448
> 16:49:24.921911 IP 10.0.1.2.2049 > 10.0.0.179.1633904741: reply ERR 1448
> +++
> 
> Mount options: "rw,noatime,rsize=8192,wsize=8192,intr,hard,addr=10.0.1.2", 
> it seems to pick tcp by default.  I had problems with UDP from some of the 
> clients due to a strangely buggy VDSL switch in the path, so I haven't 
> tried that again (I want to keep the DSL clients and the non-DSL clients 
> identical if this is at all possible, so I can switch equipment around 
> without reconfiguration.)
> 
> That performance is not optimal with today's desktop environments (tons of 
> small configuration files in both oo.org and kde) at login/program start on 
> cold caches is one thing, but performance
> 
> Now where do I start debugging this?

In the above dump 1448 is _not_ the error code, but rather the packet
length. You might therefore try using the tcpdump option '-vvv' to see
if you can obtain the actual error value (which should tell you why the
NFS server is rejecting your packets).
Alternatively, you might consider using wireshark/tshark, which can
display NFS packets in a much more readable fashion.
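
To illustrate the point about the trailing number, here is a minimal
sketch (not part of the original exchange) that parses lines in the
layout of the quoted dump and tallies the final field. The regular
expression, the function names and the SAMPLE list are assumptions made
for illustration; the layout is inferred only from the lines quoted
above. If the field really is a length, it should cluster on a few
values, and 1448 would be consistent with a full TCP segment on a
1500-byte MTU link (1500 - 20 IP - 20 TCP - 12 bytes of timestamp
option), though that is a guess about this particular network.

```python
import re
from collections import Counter

# Layout assumed from the quoted dump (not taken from tcpdump's sources):
#   HH:MM:SS.usec IP <src>.<port> > <dst>.<xid>: reply <status> <length>
LINE_RE = re.compile(
    r"^(?:>\s*)?(?P<time>\d{2}:\d{2}:\d{2}\.\d+) IP "
    r"(?P<src>\d+\.\d+\.\d+\.\d+)\.(?P<sport>\d+) > "
    r"(?P<dst>\d+\.\d+\.\d+\.\d+)\.(?P<xid>\d+): "
    r"reply (?P<status>\w+) (?P<length>\d+)$"
)


def parse_line(line):
    """Return a dict of fields for one dump line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["length"] = int(record["length"])
    return record


def length_histogram(lines):
    """Count how often each trailing length value occurs in the given lines."""
    records = (parse_line(line) for line in lines)
    return Counter(r["length"] for r in records if r is not None)


# Three lines copied from the quoted dump.
SAMPLE = [
    "16:49:24.778560 IP 10.0.1.2.2049 > 10.0.0.209.809066834: reply ERR 1448",
    "16:49:24.805286 IP 10.0.1.2.2049 > 10.0.0.209.1479697199: reply ERR 1332",
    "16:49:24.918710 IP 10.0.1.2.2049 > 10.0.0.179.1936680564: reply ERR 1448",
]

if __name__ == "__main__":
    for length, count in sorted(length_histogram(SAMPLE).items()):
        print(f"{length} bytes: {count} packet(s)")
```

When repeating the capture with '-vvv', it is probably also worth adding
'-s 0' so that tcpdump records whole packets rather than a truncated
prefix; older versions default to a short snapshot length.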

Cheers
  Trond


