UDP message hang

Hello,

I apologize if this gets through twice. It didn't seem to make it to the
list when I sent it yesterday.

I am tracking down a bug for a customer who reported periodic delays of
30 seconds or more in a UDP application. I had them write a test case
that demonstrates the condition, and I have verified the behavior.

Essentially, the application uses two ports (9003 and 9005) and sends
messages back and forth using UDP. At present, each side blocks
indefinitely until a packet is received (I notified them that they will
need to take flow control into account if they are using UDP). The
target for this application is a 200 MHz ARM (Atmel AT91SAM9260). I
have tested with kernels 2.6.20 and 2.6.25.

Here is the pseudo-code (msg_size = 127):
"client"
while(1) {
    sendto(9003, &msg_size, 4bytes);
    sendto(9003, buffer, msg_size);
    recvfrom(9005, &msg_size, 4bytes);
    recvfrom(9005, buffer, msg_size);
}

"server"
while(1) {
    recvfrom(9003, &msg_size, 4bytes);
    recvfrom(9003, buffer, msg_size);
    sendto(9005, &msg_size, 4bytes);
    sendto(9005, buffer, msg_size);
}

When run on a local gigabit hub, things run fine most of the time, with
few delays greater than 20 ms. However, sometimes the application
reports that it waited for the second packet (the data packet) for an
excessive amount of time (more than 1 second, sometimes 20 seconds or
more).
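
The wait times above are reported by the application itself. As an illustration of how such a measurement could be taken (the test case's actual timing code is not shown here, so this is an assumption), a receive can be wrapped with a monotonic clock. timed_recv is a hypothetical helper name.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical helper: block in recvfrom() and report through *waited_ms
 * how long the call took, in milliseconds. */
ssize_t timed_recv(int fd, void *buf, size_t len, double *waited_ms)
{
    struct timespec t0, t1;
    ssize_t n;

    assert(buf != NULL && waited_ms != NULL);

    /* CLOCK_MONOTONIC is not affected by wall-clock adjustments. */
    clock_gettime(CLOCK_MONOTONIC, &t0);
    n = recvfrom(fd, buf, len, 0, NULL, NULL);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    *waited_ms = (double)(t1.tv_sec - t0.tv_sec) * 1e3
               + (double)(t1.tv_nsec - t0.tv_nsec) / 1e6;
    return n;
}
```

Logging waited_ms separately for the length packet and the data packet shows which of the two receives is the one that stalls.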

Next I ran the application over a crossover cable; it works for a few
seconds and then hangs indefinitely. If I send 1 ping across the link,
communication starts back up again (the second packet that was sent is
"found" by the server and the recvfrom() completes). It hangs again
after a few seconds. If I insert a delay (500 us seems to be sufficient)
between the two sendto() calls in both the server and the client, the
issue disappears.
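
The delay workaround can be sketched as follows. This is a minimal illustration assuming POSIX nanosleep(); send_message_with_gap and its gap_us parameter are hypothetical names, and the length is sent in host byte order as the pseudo-code implies.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical helper: send the 4-byte length, sleep for gap_us
 * microseconds, then send the payload. Returns 0 on success, -1 on error. */
int send_message_with_gap(int fd, const struct sockaddr *to, socklen_t tolen,
                          const void *buf, uint32_t msg_size, long gap_us)
{
    struct timespec gap;

    assert(buf != NULL && gap_us >= 0);
    gap.tv_sec = gap_us / 1000000L;
    gap.tv_nsec = (gap_us % 1000000L) * 1000L;

    if (sendto(fd, &msg_size, sizeof(msg_size), 0, to, tolen)
            != (ssize_t)sizeof(msg_size))
        return -1;

    /* Pause between the two datagrams. An interrupted sleep is not
     * retried here, so the gap is a lower-effort hint, not a guarantee. */
    nanosleep(&gap, NULL);

    if (sendto(fd, buf, msg_size, 0, to, tolen) != (ssize_t)msg_size)
        return -1;
    return 0;
}
```

Calling it with gap_us = 500 corresponds to the delay that made the hang disappear in my tests. This only masks the problem; it does not say where the second packet was waiting.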

At first I thought that this was the "rotting packet" case that the NAPI
documentation describes, where an IRQ is missed on Rx, so I rewrote the
poll function in the macb driver to try to fix it, but that didn't help
at all. If I enable debugging in the macb driver, it slows things down
enough to make everything work.

Next, I tested on a Cirrus EP93xx-based board (with 2.6.20) and a 133
MHz x86 board (with 2.6.14.7) and noticed the same issue. On my 2 GHz PC
running 2.6.23, and on another similar PC, I haven't seen the problem.

Given the results that I've seen, it appears that the two applications
are somehow getting out of sync with each other, but I can't see how.
The fact that the packet is actually sent (as verified with Wireshark)
but not noticed by the receiver until more data arrives is puzzling to
me. I know that this is not a typical application for UDP, but I don't
see why it would not work in this environment.

Does anyone have any ideas that could help me debug this further? Am I
overlooking something important?

Thanks in advance,

Travis


--
To unsubscribe from this list: send the line "unsubscribe linux-net" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
