Re: GRE-NAT broken

Linux Advanced Routing and Traffic Control


On 01/24/2018 12:54 PM, Matthias Walther wrote:
Hello,

Hi,

I used to NAT GRE tunnels into a KVM machine. That worked perfectly until it stopped working in early January.

Okay.  :-/

Can I get a high-level overview of your network topology? You've mentioned bridges, eth0, and VMs. I figure asking is better than speculating.

I'm not really sure what caused this malfunction. I tried different kernel versions: 4.4.113, 4.10.0-35, 4.10.0-37, 4.14. All on Ubuntu 16.04.3.

Do you know specifically when things stopped working as desired? Have you tried the kernel that you were running before that? Are you aware of anything that changed on the system about that time? I.e. updates? Kernel versions?

Normal destination-based NAT rules, e.g. for SSH on TCP port 22, work perfectly. This GRE NAT rule is in place:

-A PREROUTING -i eth0 -p gre -j DNAT --to-destination 192.168.10.62

And the needed kernel modules are loaded:

root# lsmod|grep gre
nf_conntrack_proto_gre    16384  0
nf_nat_proto_gre       16384  0
nf_nat 24576 4 nf_nat_proto_gre,nf_nat_ipv4,xt_nat,nf_nat_masquerade_ipv4
nf_conntrack 106496 6 nf_conntrack_proto_gre,nf_nat,nf_nat_ipv4,xt_conntrack,nf_nat_masquerade_ipv4,nf_conntrack_ipv4

Still, some packets are just not correctly NATed. The configuration should be correct, as it used to work like this.

Please provide a high-level packet flow as you think it should be. I.e. GRE-encapsulated traffic comes in eth0 … does something … gets DNATed to $IP … goes out somewhere.

One or two tunnels usually work. For the others, the GRE packets are just not NATed but dropped. First example, which shows the expected behavior:

Are you saying that one or two tunnels at a time work, as if it may be a load / state cache related problem? Or that some specific tunnels seem to work?

Do the tunnels that seem to work do so all the time?

root# tcpdump -ni any host 185.66.195.1 and \( host 176.9.38.150 or host 192.168.10.62 \) and proto 47 and ip[33]=0x01 and \( ip[36:4]==0x644007BA or ip[40:4]==0x644007BA \)
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
04:06:41.322914 IP 192.168.10.62 > 185.66.195.1: GREv0, length 88: IP 185.66.194.49 > 100.64.7.186: ICMP echo request, id 26639, seq 1, length 64
04:06:41.322922 IP 192.168.10.62 > 185.66.195.1: GREv0, length 88: IP 185.66.194.49 > 100.64.7.186: ICMP echo request, id 26639, seq 1, length 64
04:06:41.322928 IP 176.9.38.150 > 185.66.195.1: GREv0, length 88: IP 185.66.194.49 > 100.64.7.186: ICMP echo request, id 26639, seq 1, length 64
04:06:41.341906 IP 185.66.195.1 > 176.9.38.150: GREv0, length 88: IP 100.64.7.186 > 185.66.194.49: ICMP echo reply, id 26639, seq 1, length 64
04:06:41.341915 IP 185.66.195.1 > 192.168.10.62: GREv0, length 88: IP 100.64.7.186 > 185.66.194.49: ICMP echo reply, id 26639, seq 1, length 64
04:06:41.341918 IP 185.66.195.1 > 192.168.10.62: GREv0, length 88: IP 100.64.7.186 > 185.66.194.49: ICMP echo reply, id 26639, seq 1, length 64

Would you please re-capture, both working and non-working, but specific to one interface? I.e. -i eth0 and -i $outGoingInterface as separate captures? (Or use a way to get tcpdump to show the interface in the textual output, if there is one.)
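A minimal sketch of what the separate captures could look like. The inside interface name br0 is an assumption (substitute whatever the VMs are actually attached to), gre_capture_cmd is a hypothetical helper, and the filter is simplified to the remote tunnel endpoint plus protocol 47. The helper only prints the commands, since tcpdump needs root and blocks until interrupted; run each printed command in its own terminal.

```shell
# Assumed interface names; replace with the real ones on the hypervisor.
OUTSIDE_IF=eth0
INSIDE_IF=br0   # assumption: the bridge the NATed VMs are connected to

# Build one tcpdump command line per interface with the same GRE filter.
# $1 = interface to capture on, $2 = remote tunnel endpoint
gre_capture_cmd() {
    printf 'tcpdump -ni %s host %s and proto 47\n' "$1" "$2"
}

# Non-working tunnel from the second capture in this thread.
gre_capture_cmd "$OUTSIDE_IF" 185.66.195.0
gre_capture_cmd "$INSIDE_IF" 185.66.195.0
```

Comparing the two outputs side by side should show whether the packet is seen before translation, after translation, or both.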

This^^ works as it should. The packet goes through the bridge interface, then the bridge through which all NATed VMs are connected, then it is translated and goes out through the eth0 interface of the hypervisor. The reply packets follow in the reverse direction. The NAT works: the address is translated. Not so in the second case:

What type of bridge are you using? Standard Linux bridging, à la brctl and/or ip? Or are you using Open vSwitch, or something else?

Can we see a config dump of the bridge?
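Something along these lines would do as a config dump. It is only a sketch: dump_bridge_config is a hypothetical helper name, and which of the tools are installed depends on the system, so each one is run only if present. None of these commands change any configuration.

```shell
# Dump bridge configuration with whichever tools are available.
dump_bridge_config() {
    if command -v brctl >/dev/null 2>&1; then
        echo "== brctl show =="
        brctl show || true
    fi
    if command -v ip >/dev/null 2>&1; then
        echo "== ip -d link show type bridge =="
        ip -d link show type bridge || true
    fi
    if command -v bridge >/dev/null 2>&1; then
        echo "== bridge link show =="
        bridge link show || true
    fi
    # Only relevant if Open vSwitch is in use instead of Linux bridging.
    if command -v ovs-vsctl >/dev/null 2>&1; then
        echo "== ovs-vsctl show =="
        ovs-vsctl show || true
    fi
}

dump_bridge_config
```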

I wonder if a sysctl (/proc) setting got changed and now iptables is trying to filter bridged traffic. I think it's /proc/sys/net/bridge/bridge-nf-call-iptables. (At least that's what I'm seeing with a quick Google search.)
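A quick way to check that setting, as a sketch: show_bridge_nf is a hypothetical helper that prints every bridge-nf-call-* value under the given directory. A value of 1 means bridged traffic is handed to the corresponding netfilter tables, 0 means it is not. I'm assuming the directory only shows up once the bridge netfilter code (br_netfilter on recent kernels) is loaded.

```shell
# Print each bridge-nf-call-* setting found in a directory.
# $1 = directory to inspect (defaults to the real sysctl location)
show_bridge_nf() {
    dir="${1:-/proc/sys/net/bridge}"
    if [ ! -d "$dir" ]; then
        echo "$dir not present (bridge netfilter not loaded?)"
        return 0
    fi
    for f in "$dir"/bridge-nf-call-*; do
        # Skip the unexpanded pattern when nothing matches.
        [ -e "$f" ] || continue
        printf '%s = %s\n' "$(basename "$f")" "$(cat "$f")"
    done
}

show_bridge_nf
```

The same value can be read with sysctl net.bridge.bridge-nf-call-iptables when the entry exists.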

Can we see the output of iptables-save?

root@# tcpdump -ni any host 185.66.195.0 and \( host 176.9.38.150 or host 192.168.10.62 \) and proto 47 and ip[33]=0x01 and \( ip[36:4]==0x644007B4 or ip[40:4]==0x644007B4 \)
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
03:58:01.972551 IP 192.168.10.62 > 185.66.195.0: GREv0, length 88: IP 185.66.194.49 > 100.64.7.180: ICMP echo request, id 25043, seq 1, length 64
03:58:01.972554 IP 192.168.10.62 > 185.66.195.0: GREv0, length 88: IP 185.66.194.49 > 100.64.7.180: ICMP echo request, id 25043, seq 1, length 64
03:58:03.001013 IP 192.168.10.62 > 185.66.195.0: GREv0, length 88: IP 185.66.194.49 > 100.64.7.180: ICMP echo request, id 25043, seq 2, length 64
03:58:03.001021 IP 192.168.10.62 > 185.66.195.0: GREv0, length 88: IP 185.66.194.49 > 100.64.7.180: ICMP echo request, id 25043, seq 2, length 64

tcpdump catches the outgoing packet. But instead of being translated, it's dropped.

We can't tell from the above output if it's traffic coming into the outside interface (eth0?) or traffic leaving the inside interface (connected to the bridge?).

What hypervisor are you using? KVM, VirtualBox, something else? How do the VMs connect to the bridge?

Also, if you're bridging, why are you DNATing packets? - Or is your bridge internal only and you're DNATing between the outside (eth0) and the internal (only) bridge where the VMs are connected?

It sort of looks like you may have a one to one mapping of outside IPs to inside IPs. - Which makes me ask the question why you're DNATing in the first place. Or rather why you aren't bridging the VMs to the outside and running the globally routed IP directly in the VMs.

Any ideas how I could analyse this? All tested kernels showed the exact same behavior. It's as if only one GRE NAT connection was possible.

I need more details to be able to start poking further.



--
Grant. . . .
unix || die
