On Wed, Nov 29, 2006 at 12:18:55PM -0500, John Heffner wrote:
> >- can someone explain the rationale for Linux's behaviour?
>
> The EMSGSIZE error is not fatal -- you can ignore it and try again, and
> Linux will do the fragmentation for you.

Ah, that explains the behaviour I've noticed with ping -sXXXX where XXXX is
large.

However the send(2) manpage doesn't make this clear. At least, the version I
have installed in CentOS 4.4 says:

    If the message is too long to pass atomically through the underlying
    protocol, the error EMSGSIZE is returned, and the message is not
    transmitted.
    ...
    EMSGSIZE
        The socket type requires that message be sent atomically, and the
        size of the message to be sent made this impossible.

It doesn't hint that a retry may succeed. Also, presumably send() returns -1
in this case - how would the application learn what the actual MTU limit
encountered was?

But now I'm really confused: I've just retested this, sending a large packet
rather than a small one, and it sends two fragments immediately with no
EMSGSIZE error - see attached code.

$ ./testsock2
result: 2048
result: 2048
result: 2048

Now, my original l2tp testing was with a 2.4 kernel (OpenWrt), and perhaps
that's the source of confusion. The laptop I'm working on right now is 2.6.

> EMSGSIZE is generated when an
> ICMP Can't Fragment is received, indicating an MTU change. It's
> important that this event get propagated back to the application
> somehow, because some applications really want to do MTU discovery, and
> this triggers them to change their size, and possibly retransmit some
> older packets that are now known to have been lost.

Well, the actual behaviour I saw with this particular Linux-based ATA was
that SIP packets >1500 bytes sent to it were being blackholed. At least,
tcpdump on the wire showed them being delivered as two fragments; it's not
clear whether they were being received, but the response was being
blackholed because it also was too large; or they weren't received in the
first place. The product had a firewall option to block fragments, but it
was turned off. Repeated resends were also blackholed.

I sent the setsockopt code to them, plus info on how to replicate the issue.
They say the problem has been replicated and fixed in new firmware to be
released later, but weren't specific as to exactly what they changed.

The work I did on l2tp a while ago was to fix a problem using OpenWrt as an
l2tp client. I'll need to dig around to find the exact details, but I'm
pretty sure I had to make the setsockopt patch for things to work properly.

Regards,

Brian.
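
The setsockopt code itself is not included in this message. As a rough sketch only, assuming the change was the Linux-specific IP_MTU_DISCOVER socket option set to IP_PMTUDISC_DONT (which, per ip(7), leaves the DF bit clear so oversized datagrams are fragmented instead of rejected), it might look something like the following. The helper names allow_fragmentation and pmtu_mode are hypothetical and do not come from the original patch.

```c
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: turn off path MTU discovery on this socket so
 * that Linux does not set DF and fragments large datagrams itself.
 * Returns 0 on success, -1 on error (errno is set by setsockopt). */
int allow_fragmentation(int fd)
{
    int mode = IP_PMTUDISC_DONT;   /* Linux-specific value, see ip(7) */

    return setsockopt(fd, IPPROTO_IP, IP_MTU_DISCOVER, &mode, sizeof(mode));
}

/* Hypothetical helper: read back the socket's current PMTU discovery
 * mode (IP_PMTUDISC_DONT, _WANT or _DO), or -1 on error. */
int pmtu_mode(int fd)
{
    int mode = -1;
    socklen_t len = sizeof(mode);

    if (getsockopt(fd, IPPROTO_IP, IP_MTU_DISCOVER, &mode, &len) < 0)
        return -1;
    return mode;
}
```

Calling allow_fragmentation(s) right after socket() and before the first sendto() would be the natural place for it in a test program like the one attached.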
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    int s;
    char buf[2048] = "abc";
    int buflen = 2048;
    struct in_addr t;
    struct sockaddr_in to;

    if ((s = socket(PF_INET, SOCK_DGRAM, 0)) < 0) {
        perror("socket");
        exit(1);
    }

    /* Destination: 1.2.3.4, UDP port 9999 */
    memset(&to, 0, sizeof(to));
    to.sin_family = AF_INET;
    t.s_addr = htonl(0x01020304);
    memcpy(&to.sin_addr, &t.s_addr, 4);
    to.sin_port = htons(9999);

    /* Send the same 2048-byte datagram three times, printing the number
     * of bytes accepted by sendto(), or -1 on error. */
    printf("result: %ld\n", (long) sendto(s, buf, buflen, 0,
        (struct sockaddr *) &to, sizeof(to)));
    printf("result: %ld\n", (long) sendto(s, buf, buflen, 0,
        (struct sockaddr *) &to, sizeof(to)));
    sleep(1);
    printf("result: %ld\n", (long) sendto(s, buf, buflen, 0,
        (struct sockaddr *) &to, sizeof(to)));

    return 0;
}
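
On the question of how an application could learn the limit after an EMSGSIZE: a possible approach, sketched here under the assumption of Linux semantics as documented in ip(7), is to connect() the UDP socket and read the kernel's current path MTU estimate with the IP_MTU socket option, then retry the send. The function names and the bounded retry count are hypothetical, and whether a retry succeeds presumably depends on the socket's PMTU discovery mode.

```c
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Largest UDP payload that fits in a single unfragmented IPv4 packet,
 * assuming a 20-byte IPv4 header (no options) and an 8-byte UDP header. */
int max_udp_payload(int mtu)
{
    return mtu - 20 - 8;
}

/* Path MTU currently known to the kernel for a connected socket
 * (Linux-specific IP_MTU option). Returns -1 on error, for example
 * when the socket has not been connected. */
int path_mtu(int fd)
{
    int mtu = 0;
    socklen_t len = sizeof(mtu);

    if (getsockopt(fd, IPPROTO_IP, IP_MTU, &mtu, &len) < 0)
        return -1;
    return mtu;
}

/* Send on a connected socket, retrying up to `retries` extra times, but
 * only when the failure is EMSGSIZE. Any other error is returned as-is. */
ssize_t send_retry_emsgsize(int fd, const void *buf, size_t len, int retries)
{
    ssize_t n;

    do {
        n = send(fd, buf, len, 0);
    } while (n < 0 && errno == EMSGSIZE && retries-- > 0);

    return n;
}
```

An application doing its own MTU discovery could compare its datagram size against max_udp_payload(path_mtu(fd)) after an EMSGSIZE and shrink its packets accordingly, rather than relying on fragmentation.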