sun miller <sunminlei@xxxxxxxxx> wrote:
> I've noticed an issue during my testing, and I believe I've identified
> the root cause in the code. However, I'm not sure if it's a bug or a
> deliberate feature.

Deliberate.

> Here's the testing process:
>
> 1. A request (src: A, dst: B) is sent to the machine (R) for some
> reason. R has ip_forward enabled
> (echo 1 > /proc/sys/net/ipv4/ip_forward). // A, B, and R are in the
> same subnet.
>
> 2. When there's no firewall enabled, R sends an ICMP redirect packet to A.
>
> 3. When the firewall is enabled, R doesn't send an ICMP redirect
> packet to A // tcpdump -i any icmp -nn shows no packet
>
> I've traced the code path as follows:
>
> 1. ip_rcv --> nf_hook_slow (PREROUTING) --> ... -->
> nf_nat_alloc_null_binding --> nf_nat_setup_info // ct->status = 256
> 2. ip_rcv_finish --> ip_forward --> ip_rt_send_redirect --> icmp_send
> --> icmp_push_reply --> ... --> nf_conntrack_attach // nskb ct->status
> = 256
> 3. then, icmp_push_reply --> ip_push_pending_frames --> ... -->
> iptable_nat_ipv4_local_fn (OUTPUT) --> nf_nat_ipv4_fn -->
> nf_nat_icmp_reply_translation
>
> Here's the relevant code:
>
>
> int nf_nat_icmp_reply_translation(...)
> {
>     // ...
>     inside = (void *)skb->data + hdrlen;
>     if (inside->icmp.type == ICMP_REDIRECT) {
>         // ct->status is 256, but IPS_NAT_DONE_MASK is 384
>         if ((ct->status & IPS_NAT_DONE_MASK) != IPS_NAT_DONE_MASK)
>             return 0;
>     // ...
> }
>
> unsigned int nf_nat_ipv4_fn(...)
> {
>     // ...
>     switch (ctinfo) {
>     case IP_CT_RELATED:
>     case IP_CT_RELATED_REPLY:
>         if (ip_hdr(skb)->protocol == IPPROTO_ICMP) {
>             if (!nf_nat_icmp_reply_translation(skb, ct, ctinfo, ops->hooknum))
>                 return NF_DROP; // <------- DROP
>             else
>                 return NF_ACCEPT;
>         }
>     // ...
> }
>
>
> Because PREROUTING and OUTPUT have the same maniptype
> (NF_NAT_MANIP_DST), ct->status can't equal IPS_NAT_DONE_MASK.

Yes.

> I've tested this on multiple kernel versions, including 3.10, 5.15,
> and 6.5, and I can reproduce the issue.
>
> Currently, I'm unsure if this is a bug or an intentional feature. If
> it's a feature, where can I find reference documentation that explains
> this behavior?

For some reason the comment that explains this got dropped when code
was moved around.

The original comment was:

/* Redirects on non-null nats must be dropped, else they'll
   start talking to each other without our translation, and be
   confused... --RR */

... which is exactly what this does: it drops the redirect if it
can't verify that no NAT is in place in either direction.
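
To make the mask arithmetic in the quoted check concrete, here is a
minimal userspace sketch. The bit positions follow the kernel's
enum ip_conntrack_status (SRC_NAT_DONE is bit 7, DST_NAT_DONE is bit 8),
which matches the 256 and 384 values reported above. The helper name
nat_setup_done_both_ways() is made up for illustration and mirrors only
the first test quoted from nf_nat_icmp_reply_translation(); it is not
kernel code.

```c
#include <stdbool.h>

/* Values as in the kernel's enum ip_conntrack_status. */
enum {
    IPS_SRC_NAT_DONE  = 1 << 7,                             /* 128 */
    IPS_DST_NAT_DONE  = 1 << 8,                             /* 256 */
    IPS_NAT_DONE_MASK = IPS_DST_NAT_DONE | IPS_SRC_NAT_DONE /* 384 */
};

/*
 * Hypothetical helper: true only if NAT setup has been completed for
 * both manip types (source and destination). Bits outside the mask
 * are ignored.
 */
static bool nat_setup_done_both_ways(unsigned long status)
{
    return (status & IPS_NAT_DONE_MASK) == IPS_NAT_DONE_MASK;
}
```

With status 256 (only DST_NAT_DONE set, as in the trace where both
PREROUTING and OUTPUT use NF_NAT_MANIP_DST) the helper returns false,
which corresponds to the "return 0" path and hence NF_DROP in the
caller. With 384 it returns true.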