Re: synack packet invalid when client reconnecting with same src port because out of window?

Hi,

On Thu, 31 Jan 2019, Dominique Martinet wrote:

> I'm seeing another case of conntrack blocking packets that I don't
> quite understand. Jozsef, Florian and Michal were of great help back in
> April when I had another problem so counting on you again :-)
> 
> Unlike last time though I don't have a simple reproducer yet, and 
> instead of my personal laptop/servers this is work production 
> environment and I won't be able to test as freely, sorry in advance for 
> that.
> 
> Environment
> ===========
> 
> These are using a quite old rhel7 kernel - 3.10.0-693.11.6.el7.x86_64
> 
> As said previously I do not have a reproducer yet so cannot try
> upgrading; it'll get an update to 3.10.0-862.14.4.el7.x86_64 next week
> but if I have a reproducer I'll try upstream first.
> 
> Please do say if this rings a bell as potentially already fixed, and
> I'll stop wasting everyone's time.
> 
> 
> Overview
> ========
> 
> Basically, when we restart one of our gluster servers the clients
> try to reconnect to it, but on maybe one out of four or five clients
> conntrack rejects the synack.
> 
> tcpdump loops on:
> 05:30:41.411346 IP x.y.z.34.49149 > x.y.z.1.24007: Flags [S], seq 837922022, win 26880, options [mss 8960,sackOK,TS val 1689048672 ecr 0,nop,wscale 7], length 0
> 05:30:41.411481 IP x.y.z.1.24007 > x.y.z.34.49149: Flags [S.], seq 1749683989, ack 837922023, win 26844, options [mss 8960,sackOK,TS val 560823762 ecr 1689017605,nop,wscale 7], length 0
> 05:30:57.595860 IP x.y.z.1.24007 > x.y.z.34.49149: Flags [S.], seq 1749683989, ack 837922023, win 26844, options [mss 8960,sackOK,TS val 560839947 ecr 1689017605,nop,wscale 7], length 0
> 
> glusterfs (the client) is silly and always keeps retrying with the same
> source port, so the connection never recovers by itself.
>
> The connection seems to correctly be identified as syn sent:
> # ss -temoi | grep 24007
> SYN-SENT   0      1      x.y.z.34:49149                x.y.z.1:24007                 timer:(on,31sec,6) ino:3170069 sk:ffff88082779be00 <->
> # conntrack -L | grep 24007
> tcp      6 78 SYN_SENT src=x.y.z.34 dst=x.y.z.1 sport=49149 dport=24007 src=x.y.z.1 dst=x.y.z.34 sport=24007 dport=49149 mark=0 secctx=system_u:object_r:unlabeled_t:s0 use=1
> 
> 
> conntrack -E never lists the connection, but the timestamp is refreshed
> every time a new syn arrives.
> 
> For some reason nf_conntrack_log_invalid does not output anything to
> dmesg, but adding log rules before and after the default firewalld's
> INPUT --ctstate INVALID -j DROP rule shows that the synack packets fall
> there (and should have been picked up by the RELATED rule earlier and
> weren't)

We need the SYN/ACK packet as well. Without the packet data there's no way
to tell why conntrack classified it as INVALID. Packet logging is not
enough; please capture the whole (broken) traffic between the server and
the client.
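
A full capture along these lines would include the SYN/ACK itself. This is
only a sketch: the interface name, the server address (a documentation
address standing in for the anonymised x.y.z.1) and the output path are
assumptions, not values from the report. The capture function is defined
but not started automatically.

```shell
#!/bin/sh
# Assumed values, not from the original report: adjust before use.
IFACE="${IFACE:-eth0}"                   # hypothetical interface name
SERVER="${SERVER:-192.0.2.1}"            # stand-in for the anonymised x.y.z.1
PORT="${PORT:-24007}"                    # gluster port seen in the tcpdump output
OUT="${OUT:-/tmp/gluster-synack.pcap}"   # hypothetical output file

# Capture filter matching both directions of the traffic to the gluster port.
FILTER="host ${SERVER} and tcp port ${PORT}"

capture() {
    # -n: no name resolution, -s 0: full packets, -w: write raw packets to a file
    tcpdump -n -i "$IFACE" -s 0 -w "$OUT" "$FILTER"
}
```

Running `capture` as root on the client while the SYN / SYN-ACK loop repeats,
then stopping it with Ctrl-C, leaves a pcap file that can be attached to the
thread.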

Best regards,
Jozsef
 
> I unfortunately do not have any trace of when the server restarted,
> which would likely help with this. I'm trying to see if I can reproduce
> by forcefully disconnecting the server so the client would try to
> reconnect; if I can do that I'll be able to test anything easily.
> 
> 
> 
> Workarounds/hints
> =================
> 
> - deleting the conntrack entry with conntrack -D --src etc etc makes the
> next syn/synack work.
> - stopping the client for two minutes (so the conntrack entry times out)
> also obviously works for the same reason; the client just repeatedly
> refreshes the entry so it never gets a chance to expire.
> 
> - the net.netfilter.nf_conntrack_tcp_be_liberal sysctl also works, so
> that would hint at a window issue? conntrack still expects the previous
> connection's sequence numbers to be used?
> 
> 
> 
> Any help to move forward would be great; I'll try to somehow reproduce
> without disrupting production first but help appreciated!
> 
> 
> Thanks,
> -- 
> Dominique Martinet
> 
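
For reference, the first and third workarounds from the quoted report can
be sketched as the commands below. The addresses are documentation
addresses standing in for the anonymised x.y.z.34 and x.y.z.1, and the
function names are made up for illustration; the ports are the ones shown
in the report. Both functions need root and are only defined here, not run.

```shell
#!/bin/sh
# Assumed stand-ins for the anonymised addresses in the report.
CLIENT="${CLIENT:-192.0.2.34}"
SERVER="${SERVER:-192.0.2.1}"
SPORT="${SPORT:-49149}"   # client source port from the report
DPORT="${DPORT:-24007}"   # gluster port from the report

# Delete the stale SYN_SENT entry so the next SYN creates a fresh one.
drop_stale_entry() {
    conntrack -D -p tcp --orig-src "$CLIENT" --orig-dst "$SERVER" \
        --sport "$SPORT" --dport "$DPORT"
}

# Relax TCP window tracking: with this set, only out-of-window RST
# segments are marked INVALID.
be_liberal() {
    sysctl -w net.netfilter.nf_conntrack_tcp_be_liberal=1
}
```

Neither changes why the SYN/ACK is classified as INVALID in the first place;
they only get the connection going again.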

-
E-mail  : kadlec@xxxxxxxxxxxxxxxxx, kadlecsik.jozsef@xxxxxxxxxxxxx
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : Wigner Research Centre for Physics, Hungarian Academy of Sciences
          H-1525 Budapest 114, POB. 49, Hungary


