Re: question about 3sec timeouts with tcp

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Mar 31, 2008 at 1:54 PM, Leo <neleo@xxxxxxx> wrote:
> I have also tested it with server and client directly connected (without
>  switch). And even if server and client are the same machine (listen AND
>  connect to 127.0.0.1 on port 3306, interface lo) this problem is
>  reproducable! Therefore it can't be a cable or switch problem.
>
>  But one thing is different when testing with localhost: there is a RST
>  packet!
>
>  tcpdump -i lo
>
>  13:39:55.934575 IP localhost.50171 > localhost.mysql: S
>  285135912:285135912(0) win 32792 <mss 16396,sackOK,timestamp 627887
>  0,nop,wscale 7>
>  13:39:55.934590 IP localhost.mysql > localhost.50171: . ack 103990558
>  win 256 <nop,nop,timestamp 627887 626728>

Hmm... next packet to be acked: 285135912 + 1, server sends ack 103990558
This might be the reason for the reset.

>  13:39:55.934598 IP localhost.50171 > localhost.mysql: R
>  103990558:103990558(0) win 0
>  13:39:58.931261 IP localhost.50171 > localhost.mysql: S
>  285135912:285135912(0) win 32792 <mss 16396,sackOK,timestamp 628187
>  0,nop,wscale 7>
>  13:39:58.931277 IP localhost.mysql > localhost.50171: S
>  334398077:334398077(0) ack 285135913 win 32768 <mss
>  16396,sackOK,timestamp 628187 628187,nop,wscale 7>

Now the server sends ack 285135913 (285135912 + 1), and the connection is OK.

>  13:39:58.931289 IP localhost.50171 > localhost.mysql: . ack 1 win 257
>  <nop,nop,timestamp 628187 628187>
>  13:39:58.931414 IP localhost.50171 > localhost.mysql: F 1:1(0) ack 1 win
>  257 <nop,nop,timestamp 628187 628187>
>  13:39:58.931437 IP localhost.mysql > localhost.50171: F 1:1(0) ack 2 win
>  256 <nop,nop,timestamp 628187 628187>
>  13:39:58.931452 IP localhost.50171 > localhost.mysql: . ack 2 win 257
>  <nop,nop,timestamp 628187 628187>
>
>  Any idea?

Seams to be a issue at server side... What is the kernel version at
server-side? Kernel config?

Br H. Willstrand

>
>  Kind regards,
>  Leo
>
>
>
>
>  H. Willstrand wrote:
>  > On Mon, Mar 31, 2008 at 12:50 PM, Leo <neleo@xxxxxxx> wrote:
>  >
>  >> Here comes the tcpdump (the server has IP 192.168.1.1 listening on port
>  >>  3306 and the client has IP 192.168.1.99).
>  >>
>  >>  tcpdump on the server:
>  >>
>  >>  11:24:44.226373 IP 192.168.1.99.39028 > 192.168.1.1.3306: S
>  >>  3068401272:3068401272(0) win 5840 <mss 1460,sackOK,timestamp 10592352
>  >>  0,nop,wscal e 6>
>  >>  11:24:44.226389 IP 192.168.1.1.3306 > 192.168.1.99.39028: S
>  >>  3052139931:3052139931(0) ack 3068401273 win 5792 <mss
>  >>  1460,sackOK,timestamp 10478 87 10592352,nop,wscale 7>
>  >>  11:24:47.226673 IP 192.168.1.99.39028 > 192.168.1.1.3306: S
>  >>  3068401272:3068401272(0) win 5840 <mss 1460,sackOK,timestamp 10595352
>  >>  0,nop,wscale 6>
>  >>  11:24:47.226715 IP 192.168.1.1.3306 > 192.168.1.99.39028: S
>  >>  3052139931:3052139931(0) ack 3068401273 win 5792 <mss
>  >>  1460,sackOK,timestamp 1048187 10592352,nop,wscale 7>
>  >>  11:24:47.226957 IP 192.168.1.99.39028 > 192.168.1.1.3306: . ack 1 win 92
>  >>  <nop,nop,timestamp 10595352 1048187>
>  >>  11:24:47.227488 IP 192.168.1.99.39028 > 192.168.1.1.3306: F 1:1(0) ack 1
>  >>  win 92 <nop,nop,timestamp 10595352 1048187>
>  >>  11:24:47.227580 IP 192.168.1.1.3306 > 192.168.1.99.39028: F 1:1(0) ack 2
>  >>  win 46 <nop,nop,timestamp 1048187 10595352>
>  >>  11:24:47.227810 IP 192.168.1.99.39028 > 192.168.1.1.3306: . ack 2 win 92
>  >>  <nop,nop,timestamp 10595353 1048187>
>  >>
>  >>  tcpdump on the client:
>  >>
>  >>  11:24:44.045735 IP 192.168.1.99.39028 > 192.168.1.1.3306: S
>  >>  3068401272:3068401272(0) win 5840 <mss 1460,sackOK,timestamp 10592352
>  >>  0,nop,wscale 6>
>  >>  11:24:47.045710 IP 192.168.1.99.39028 > 192.168.1.1.3306: S
>  >>  3068401272:3068401272(0) win 5840 <mss 1460,sackOK,timestamp 10595352
>  >>  0,nop,wscale 6>
>  >>  11:24:47.045911 IP 192.168.1.1.3306 > 192.168.1.99.39028: S
>  >>  3052139931:3052139931(0) ack 3068401273 win 5792 <mss
>  >>  1460,sackOK,timestamp 1048187 10592352,nop,wscale 7>
>  >>  11:24:47.046015 IP 192.168.1.99.39028 > 192.168.1.1.3306: . ack 1 win 92
>  >>  <nop,nop,timestamp 10595352 1048187>
>  >>  11:24:47.046541 IP 192.168.1.99.39028 > 192.168.1.1.3306: F 1:1(0) ack 1
>  >>  win 92 <nop,nop,timestamp 10595352 1048187>
>  >>  11:24:47.046845 IP 192.168.1.1.3306 > 192.168.1.99.39028: F 1:1(0) ack 2
>  >>  win 46 <nop,nop,timestamp 1048187 10595352>
>  >>  11:24:47.046880 IP 192.168.1.99.39028 > 192.168.1.1.3306: . ack 2 win 92
>  >>  <nop,nop,timestamp 10595353 1048187>
>  >>
>  >>  I can also provide a strace if required (alternatively you can find the
>  >>  tcpdump and the strace in my original posting, e.g.
>  >>  http://marc.info/?l=linux-net&m=120214156112338&w=2).
>  >>
>  >>  Kind regards,
>  >>  Leo
>  >>
>  >>
>  >
>  > Well, the server sends back the SYN/ACK. Your client is not receiving it...
>  > Normally I will say a network problem (switch / router), please check
>  > that once more.
>  >
>  > (There is / was a bug in PCAP before which lost packages in tcpdump,
>  > but this seams not be the case here)
>  >
>  > Br H.Willstrand
>  >
>  >
>  >
>  >>
>  >>  H. Willstrand wrote:
>  >>  > On Mon, Mar 31, 2008 at 11:33 AM, Leo <neleo@xxxxxxx> wrote:
>  >>  >
>  >>  >> Yes, we have changed the back_log settings in my.cnf. We also tried
>  >>  >>  various combinations of the kernel parameters you mentioned in your
>  >>  >>  email - without success.
>  >>  >>
>  >>  >>  Have you made any changes on the client (=webserver)? Perhaps it's not a
>  >>  >>  problem of the server (=mysql). As I figured out in my original posting
>  >>  >>  it looks as if the client didn't receive the SYN/ACK package from the
>  >>  >>  server (keep in mind that it is reproducable with lo interface,
>  >>  >>  therefore it's not a hardware problem with cables, switches, ...):
>  >>  >>
>  >>  >>  "When you look at the TCP pakets transmitted between server and client
>  >>  >>  (tcpdump) you will see that the client sends a SYN and the server
>  >>  >>  answers with a SYN/ACK. However the client will never receive the
>  >>  >>  SYN/ACK. Therefore the client resends a SYN after exactly 3 seconds. The
>  >>  >>  server answers with a SYN/ACK again and this time the client receives
>  >>  >>  the paket. Now the connection is working."
>  >>  >>
>  >>  >>
>  >>  >
>  >>  > Hi!
>  >>  >
>  >>  > Can you please post that tcpdump? (Just to check the first ACK).
>  >>  >
>  >>  > Br H.Willstrand
>  >>  >
>  >>  >
>  >>  >>  On friday I received an email from another user having the same problem.
>  >>  >>  As I mentioned in my last email: we are not alone ;-)
>  >>  >>
>  >>  >>  Many thanks for your help.
>  >>  >>
>  >>  >>  Kind regards,
>  >>  >>  Leo
>  >>  >>
>  >>  >>
>  >>  >>
>  >>  >>  Marlon de Boer wrote:
>  >>  >>  > Leo wrote:
>  >>  >>  >
>  >>  >>  >> We discovered the same problem two month ago (even 3 sec connect
>  >>  >>  >> problems between webserver and mysql backend). I have posted it to
>  >>  >>  >> the list (subject "TCP connect hangs for 3 seconds", 2008-02-04)
>  >>  >>  >> but didn't get any answer yet. When I read your mail it was a
>  >>  >>  >> flicker of hope.
>  >>  >>  >>
>  >>  >>  >> This is not a mysql issue! You can easily reproduce it with a few
>  >>  >>  >> lines of C code (see below, it's similar to your program) running
>  >>  >>  >> against any open port on the server (e.g. netcat: nc -k -l
>  >>  >>  >> <PORTNUMBER>). We have also tested it on different hardware and
>  >>  >>  >> kernel versions (up to 2.6.24.4) and we can deliver further
>  >>  >>  >> information if requested ...
>  >>  >>  >>
>  >>  >>  >> Unfortunately the kernel parameter you mentioned in your second
>  >>  >>  >> mail (net.core.somaxconn) didn't solve the problem.
>  >>  >>  >>
>  >>  >>  >> I think there are many other people out there having this problem
>  >>  >>  >> unconsciously. Can anybody help?
>  >>  >>  >>
>  >>  >>  >>
>  >>  >>  > I used the c code below to test the server side. After playing with
>  >>  >>  > the kernel settings, some code,
>  >>  >>  > setsockopt(sd, SOL_SOCKET, SO_REUSEADDR, &opt, sizeof(opt)) and
>  >>  >>  > listen(sd, 8192) the timeouts disappeared.
>  >>  >>  >
>  >>  >>  > Did you changed the settings in mysql.cnf? back_log = 8192, and
>  >>  >>  > restarted the mysql daemon? Else try to raise it to 16384 or even higher.
>  >>  >>  >
>  >>  >>  > Regards
>  >>  >>  > Marlon
>  >>  >>  >
>  >>  >>  >
>  >>  >>
>  >>  >>
>  >>  >> --
>  >>  >>  To unsubscribe from this list: send the line "unsubscribe linux-net" in
>  >>  >>  the body of a message to majordomo@xxxxxxxxxxxxxxx
>  >>  >>  More majordomo info at  http://vger.kernel.org/majordomo-info.html
>  >>  >>
>  >>  >>
>  >>  > --
>  >>  > To unsubscribe from this list: send the line "unsubscribe linux-net" in
>  >>  > the body of a message to majordomo@xxxxxxxxxxxxxxx
>  >>  > More majordomo info at  http://vger.kernel.org/majordomo-info.html
>  >>  >
>  >>  >
>  >>
>  >>
>  >>
>  > --
>  > To unsubscribe from this list: send the line "unsubscribe linux-net" in
>  > the body of a message to majordomo@xxxxxxxxxxxxxxx
>  > More majordomo info at  http://vger.kernel.org/majordomo-info.html
>  >
>  >
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-net" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Netdev]     [Ethernet Bridging]     [Linux 802.1Q VLAN]     [Linux Wireless]     [Kernel Newbies]     [Security]     [Linux for Hams]     [Netfilter]     [Git]     [Bugtraq]     [Yosemite News and Information]     [MIPS Linux]     [ARM Linux]     [Linux RAID]     [Linux PCI]     [Linux Admin]     [Samba]

  Powered by Linux