Re: question about 3sec timeouts with tcp

On 04/02/2008 10:19:14 PM +0200, Brett Paden <paden@xxxxxxxxxxxx> wrote:
> Using Leo's c test on a 2.4.20 kernel, I am __unable__ to create 3000ms
> timeouts when doing localhost or interface connections to port 3306
> (obviously with a running mysql server).  Same results with my test.

You mean 2.6.20, don't you? The 2.6 and 2.4 branches are far too different for any comparison to be meaningful.

About the tests: we need to focus only on what is relevant to our problem, and always test in the same situation. It is very difficult to isolate problems and validate tests if the test protocol changes every time. With a consistent protocol, I think you should be able to reproduce the bug with 2 servers: one receiving connections, and the other, where we test the different kernels, initiating the TCP connections.
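
One way to keep the client side of the test identical from run to run is a small script that times each connect() and reports the ones that take close to 3 seconds. This is only a sketch, not Leo's C test: the function names, the 1000 attempts and the 2.5 s threshold are assumptions chosen for illustration.

```python
import socket
import time


def time_connect(host, port, timeout=10.0):
    """Return the seconds taken by one TCP connect() to (host, port)."""
    start = time.monotonic()
    sock = socket.create_connection((host, port), timeout=timeout)
    elapsed = time.monotonic() - start
    sock.close()
    return elapsed


def find_slow_connects(host, port, attempts=1000, threshold=2.5):
    """Connect `attempts` times in a row; return (index, seconds) for slow ones.

    The 2.5 s threshold is an assumed cut-off meant to catch connects that
    were delayed by roughly 3 seconds.
    """
    slow = []
    for i in range(attempts):
        elapsed = time_connect(host, port)
        if elapsed >= threshold:
            slow.append((i, elapsed))
    return slow
```

Run it on the machine whose kernel is under test, pointing it at the receiving server and port, and keep the arguments the same for every kernel you compare.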

> *However*, if I run those same tests against other ports I am able to
> generate hangs.

Can you describe precisely how you "run" those tests? For each port you test, you need a server application listening on it. You could, for example, change the port MySQL listens on. Is your MySQL server in a production environment? If not, try rebooting before each test run to flush any connection table.
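
To test a port other than 3306 there must be something listening on it. If moving MySQL is not convenient, a throwaway listener like the following could stand in. It is a minimal sketch: `start_listener` is a hypothetical helper, and it accepts and immediately closes each connection, which is not what MySQL does, so results may not be directly comparable.

```python
import socket
import threading


def start_listener(port, host="0.0.0.0", backlog=128):
    """Listen on (host, port) and close every accepted connection at once.

    Returns the listening socket; the accept loop runs in a daemon thread.
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((host, port))
    srv.listen(backlog)

    def serve():
        while True:
            try:
                conn, _addr = srv.accept()
            except OSError:
                return  # the listening socket was closed
            conn.close()

    threading.Thread(target=serve, daemon=True).start()
    return srv
```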

I think it's very important to have precise and thorough test results and protocols, and to double-check what we post in this thread, if we want people to be interested in helping rather than dismissing this as another bogus thread about MySQL config problems (and ignoring it!).

> Regardless of kernel, it appears that straight up connections to 3306
> behave differently than other ports. If, for example, I generate 1000
> connections very quickly to port 22 then run a netstat -na, I will see
> loads of those connections sitting in TIME_WAIT. If I run the identical
> test against port 3306 and do the same netstat I will see none of those
> connections sitting in TIME_WAIT. I'm guessing that mysql does something
> aggressive with connections to that port and is possibly unrelated to
> our problem.  Still, very interesting.

It's perfectly normal to have TIME_WAIT connections in your netstat. MySQL probably sets some socket options, SO_REUSEADDR or something of that kind, to reuse the sockets instead of leaving them in the TIME_WAIT state (which is only useful on lossy networks).
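
For reference, this is how an application sets SO_REUSEADDR on a listening socket. Whether mysqld actually does this is the guess made above, not something verified here, and `make_listening_socket` is a hypothetical helper. The option lets a server bind its port again while older connections on it are still in TIME_WAIT.

```python
import socket


def make_listening_socket(port, host="127.0.0.1", reuse_addr=True):
    """Create a TCP listening socket, optionally with SO_REUSEADDR set."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    if reuse_addr:
        # Must be set before bind() to have any effect on the bind.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(128)
    return sock
```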

Anyway, this last part is not relevant to the 3s delay bug.

When the bug happens, you see a lot of half-open TCP connections in the "SYN_RECV" state on the server side. You can also capture packets with tcpdump and see the results Leo showed: one SYN packet from client to server, one SYN/ACK packet from server to client (not processed by the client, and thus not captured in non-promiscuous mode), then, 3 seconds later, the same packets again and a final client ACK, which establishes the TCP session.
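
Counting connection states per port makes the SYN_RECV pile-up easy to compare between runs. The sketch below parses the text of `netstat -na` as printed by Linux net-tools; the column layout it relies on and the name `count_tcp_states` are assumptions.

```python
from collections import Counter


def count_tcp_states(netstat_output, port=None):
    """Count TCP connection states in `netstat -na` text.

    If `port` is given, only lines whose local or foreign address ends in
    that port are counted.
    """
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed layout: proto recv-q send-q local foreign state
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local, foreign, state = fields[3], fields[4], fields[5]
        if port is not None:
            suffix = ":%d" % port
            if not (local.endswith(suffix) or foreign.endswith(suffix)):
                continue
        counts[state] += 1
    return counts
```

On the receiving server, feed it the captured output of `netstat -na` and look at the `SYN_RECV` count for the port under test right after a test run.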

Gabriel