Re: question about 3sec timeouts with tcp

On Tue, Apr 1, 2008 at 8:43 PM, Gabriel Barazer <gabriel@xxxxxxxx> wrote:
> On 04/01/2008 8:28:14 PM +0200, "H. Willstrand" <h.willstrand@xxxxxxxxx> wrote:
>  > On Tue, Apr 1, 2008 at 7:59 PM, Gabriel Barazer <gabriel@xxxxxxxx> wrote:
>  >> On 04/01/2008 7:17:31 PM +0200, Leo <neleo@xxxxxxx> wrote:
>  >>  > H. Willstrand wrote:
>  >>  >> On Tue, Apr 1, 2008 at 5:43 PM, Gabriel Barazer <gabriel@xxxxxxxx> wrote:
>  >>  >>
>  >>  >>> On 04/01/2008 4:43:20 PM +0200, Brett Paden <paden@xxxxxxxxxxxx> wrote:
>  >>  >>>  >> If I'm right, Brett's problem lies in the test client (provided in
>  >>  >>>  >> the first mail). This probably has to do with the number of ports
>  >>  >>>  >> opened and closed during a short time period.
>  >>  >>>  >
>  >>  >>>  > My test client is designed to simulate the sort of load our production
>  >>  >>>  > databases and web servers see.  We're talking on the order of 100-400
>  >>  >>>  > connections per second.  On an unloaded server the 3000ms delays occur
>  >>  >>>  > right around 400 connections a second, but we have seen them at lower
>  >>  >>>  > connection rates.  Are you suggesting that we could do something simple
>  >>  >>>  > (like reap TIME_WAIT connections) to alleviate the problem?
>  >>  >>>
>  >>  >>>  Using tcp_tw_recycle / tcp_tw_reuse doesn't solve the problem on either
>  >>  >>>  the client or the server. I tested with and without these options
>  >>  >>>  enabled and disabled netfilter's connection tracking, and none of this
>  >>  >>>  removed the delay. If even the "lo" interface is affected, the problem is
>  >>  >>>  definitely in the network stack and not in the device drivers.
>  >>  >>>
>  >>  >>>  Here is a thread I started on LKML about this very same bug.
>  >>  >>>  http://lkml.org/lkml/2008/3/14/353
>  >>  >>>  There is a forum thread with French hosting providers talking about it
>  >>  >>>  (if some of you read French:
>  >>  >>>  http://www.webmasterclub.fr/forum/topic,59486,0.html).
>  >>  >>>
>  >>  >>>  We are far from being alone!
>  >>  >>>
>  >>  > Welcome to the club, Gabriel!
>  >>  >>>  Gabriel
>  >>
>  >>  How lucky I am!
>  >>  I suspect there are many other people out there having this problem;
>  >>  they just don't notice these delays on small infrastructures, because
>  >>  this bug doesn't actually cause a connection error but "only" an
>  >>  unacceptable delay on moderately to heavily loaded servers.
>  >>
>  >>
>  >>  >> OK, seems to be the same issue that Leo has (it has nothing to do with
>  >>  >> the Brett / Marlon issue; the only common denominator is the 3000ms).
>  >>  >>
>  >>  > But Gabriel is also talking about 3 second timeouts on the client as
>  >>  > Brett and I did. I have read Gabriel's  description on the provided link
>  >>  > and it seems to be exactly the same problem. I think Brett can confirm
>  >>  > this ...
>  >>  >> This issue is probably caused by the server delivering a miscalculated
>  >>  >> SYN/ACK (the acked number is miscalculated, see my second mail).
>  >>  >>
>  >>  > When you look at my first tcpdump with two machines as server and client
>  >>  > then you can see that there are no miscalculated SYN/ACK packets from
>  >>  > the server (and therefore no RST packet from the client). All packets
>  >>  > have the right number but the client never receives the SYN/ACK packet
>  >>  > from the server. Only in the lo test are there RST packets and wrong
>  >>  > packet numbers. But as I told you in my last email I think this is a
>  >>  > different problem and not important for us. We should ignore the lo test
>  >>  > and concentrate on the "real" problem of Brett, Gabriel and myself  (and
>  >>  > even a lot of other people out there).
>  >>
>  >>  I confirm that there is no problem with the sequence numbers. Attached is
>  >>  the pcap-compatible capture of the relevant packets (608 bytes, 6
>  >>  packets total: 2 for the failed handshake, 3 for the successful one and
>  >>  1 for the first mysql data packet). This capture has been filtered to
>  >>  show only the relevant packets and done in promiscuous mode.
>  >>
>  >
>
> > I'm missing the tcpdump...
>
>  Sorry, I forgot to include it when reformatting my e-mail. Here it is!
>
>  Gabriel
>

The packets are OK.
Still, how did you produce this situation? Let me guess, you used one
client to mass produce connections to your mysql-server, right?
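
To make the guess concrete, here is a rough Python sketch of the kind of
single-client test I have in mind. It is only an assumption about the setup,
not your actual client: the function names, the sequential loop and the 2.5
second threshold are all hypothetical, chosen so that the 3000ms delays
discussed in this thread stand out from normal connects.

```python
import socket
import time


def time_connects(host, port, count, timeout=10.0):
    """Open `count` TCP connections one after another.

    Returns the time each connect took, in seconds. Every connection is
    closed right after it is established, so ports are opened and closed
    in quick succession from a single client.
    """
    latencies = []
    for _ in range(count):
        start = time.monotonic()
        with socket.create_connection((host, port), timeout=timeout):
            latencies.append(time.monotonic() - start)
    return latencies


def slow_connects(latencies, threshold=2.5):
    """Return the latencies at or above `threshold` seconds.

    The default threshold is an assumption, picked to catch the ~3000ms
    delays while ignoring ordinary connects.
    """
    return [t for t in latencies if t >= threshold]
```

Calling slow_connects(time_connects(host, 3306, 10000)) against the
mysql-server (3306 being the usual MySQL port, again an assumption about your
setup) would show how many handshakes hit the delay. The loop is sequential,
so it cannot control the connection rate; a real test at 100-400 connections
per second would need pacing or several workers.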

//HW