Re: Linux to Netapp -> Is UDP over WAN safe as long as I use "sync, hard and intr" ?

Chuck,

Thank you very much for the advice. I'm currently using a wsize and rsize of 1024 to avoid IP fragmentation. Strangely, UDP is much faster than TCP regardless of the rsize/wsize setting. I am puzzled about that, but ...

The problem with troubleshooting the network is that I cannot change anything in the path between the client and the server: 2 routers and a Cisco PIX (I don't know whether it is one of those 2 hops or invisible, but it is there for sure). While researching this problem I noticed that iperf shows extremely low throughput from the client's LAN segment to the server's, while in the opposite direction the speed is OK. There is definitely something wrong there, but I can neither change it nor complain, so UDP it is. I'm satisfied enough with the speed I get with rsize/wsize=1024. My main concern was data corruption.

Thanks again.

All the best,
David

On 29/04/2011, at 17:38, Chuck Lever wrote:


On Apr 29, 2011, at 11:07 AM, David McGiven wrote:

Dear All,

I'm having problems with Linux NFS clients accessing a NetApp NFS server. The problems are mostly because it's a WAN connection (with 3 hops in between). There's a mixture of poor WAN performance and Ubuntu kernel bugs regarding NFS locks. I've been struggling with this for too long. I then tried UDP instead of TCP and all the problems seem to have vanished:

I get better performance.
I don't get lock errors and stalled connections in the kernel.
I don't get "nfs server xxx.xxx.xx not responding" messages any more.

So I guess I will use UDP, even though TCP is the recommended choice for performance. Also, I can't control the WAN routers and switches, so I'm stuck with them.

My concern is: is it OK to use UDP over a WAN with regard to data corruption? I've read that UDP over a WAN can cause it, and I'm a little bit afraid, although I don't see why it would corrupt data with "sync,hard,intr" as the mount options.

The main source of data corruption in your case would be IP reassembly problems. The IP ID field is just 16 bits. If a UDP datagram is large, it is split across many IP fragments, and the receiving end can screw up reassembly if the ID field wraps.
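
To put a rough number on the wrap risk, the sketch below estimates how long a 16-bit ID counter takes to repeat at a steady datagram rate, and compares that with a reassembly timeout. It is only a back-of-the-envelope illustration: it assumes a single ID counter shared by all datagrams from the sender and a constant send rate. The function names and the 30-second timeout used in the example are assumptions for illustration, not values taken from this thread.

```python
IP_ID_VALUES = 2 ** 16  # the IPv4 Identification field is 16 bits wide


def ip_id_wrap_seconds(datagrams_per_second):
    """Seconds until the 16-bit IP ID repeats at a steady send rate."""
    if datagrams_per_second <= 0:
        raise ValueError("datagrams_per_second must be positive")
    return IP_ID_VALUES / datagrams_per_second


def wrap_is_risky(datagrams_per_second, reassembly_timeout_s):
    """True if the ID can repeat while old fragments may still be queued.

    reassembly_timeout_s is how long the receiver keeps incomplete
    datagrams; the caller must supply it (hypothetical value in the
    example below).
    """
    return ip_id_wrap_seconds(datagrams_per_second) < reassembly_timeout_s


if __name__ == "__main__":
    for rate in (1_000, 10_000):
        wrap = ip_id_wrap_seconds(rate)
        print(rate, "datagrams/s: ID wraps after", wrap, "s;",
              "risky with a 30 s timeout:", wrap_is_risky(rate, 30))
```

The faster the sender, the sooner the ID repeats, so a fragment from a lost older datagram is more likely to be spliced into a newer one that carries the same ID.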

You can mitigate this risk by capping the size of read and write requests. Assuming an end-to-end MTU of 1536 octets, using rsize=wsize=1024 would eliminate the possibility of packet misassembly on reads and writes. However, performance might suffer. A somewhat larger transfer size might perform better, with acceptably small risk of data corruption.
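
As a rough illustration of why a small rsize/wsize avoids reassembly altogether, the sketch below counts how many IP fragments one NFS-over-UDP request or reply would need for a given transfer size and MTU. It assumes IPv4 with a 20-byte header and no options. The `rpc_overhead` default of 200 bytes is a hypothetical placeholder for the RPC and NFS headers, not a measured value, and the function name is likewise made up for this example.

```python
import math

IPV4_HEADER = 20  # bytes, assuming no IP options
UDP_HEADER = 8    # bytes


def fragments_needed(transfer_size, mtu, rpc_overhead=200):
    """Number of IP packets needed to carry one UDP datagram.

    transfer_size: the rsize or wsize in bytes.
    rpc_overhead:  assumed RPC/NFS header bytes (hypothetical default).
    """
    ip_payload = UDP_HEADER + rpc_overhead + transfer_size
    if ip_payload <= mtu - IPV4_HEADER:
        return 1  # fits in one packet, so no fragmentation at all
    # Every fragment except the last carries a multiple of 8 bytes.
    per_fragment = (mtu - IPV4_HEADER) // 8 * 8
    return math.ceil(ip_payload / per_fragment)


if __name__ == "__main__":
    for size in (1024, 8192, 32768):
        print("rsize/wsize", size, "->",
              fragments_needed(size, mtu=1536), "IP packet(s)")
```

A result of 1 means the datagram is never fragmented, so the ID wrap problem cannot affect it. Larger transfer sizes trade that guarantee for fewer round trips.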

Though I must say, it is quite rare that TCP shows this kind of misbehavior while UDP does not. I think it would be worth some trouble to root-cause the networking issues here. It may be as simple as incorrect firewall settings. Have you looked at packet traces?

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com




