Re: nf_nat_icmp_reply_translation dropped icmp redirect packet

sun miller <sunminlei@xxxxxxxxx> wrote:
> I've noticed an issue during my testing, and I believe I've identified
> the root cause in the code. However, I'm not sure if it's a bug or a
> deliberate feature.

deliberate.

> Here's the testing process:
> 
> 1. A request (src: A, dst: B) ends up being sent to machine R, which
> has ip_forward enabled (echo 1 > /proc/sys/net/ipv4/ip_forward).
> A, B, and R are in the same subnet.
> 
> 2. When there's no firewall enabled, R sends an ICMP redirect packet to A.
> 
> 3. When the firewall is enabled, R doesn't send an ICMP redirect
> packet to A   // tcpdump -i any icmp -nn shows no packet
> 
> I've traced the code path as follows:
> 
> 1. ip_rcv --> nf_hook_slow (PREROUTING) --> ... -->
> nf_nat_alloc_null_binding --> nf_nat_setup_info // ct->status = 256
> 2. ip_rcv_finish --> ip_forward --> ip_rt_send_redirect --> icmp_send
> --> icmp_push_reply --> ... --> nf_conntrack_attach // nskb ct->status
> = 256
> 3. then,  icmp_push_reply --> ip_push_pending_frames --> ... -->
> iptable_nat_ipv4_local_fn (OUTPUT) --> nf_nat_ipv4_fn -->
> nf_nat_icmp_reply_translation
> 
> Here's the relevant code:
> 
> 
> int nf_nat_icmp_reply_translation(...)
> {
>     // ...
>     inside = (void *)skb->data + hdrlen;
>     if (inside->icmp.type == ICMP_REDIRECT) {
>         // ct->status is 256, but IPS_NAT_DONE_MASK is 384
>         if ((ct->status & IPS_NAT_DONE_MASK) != IPS_NAT_DONE_MASK)
>             return 0;
>         // ...
>     }
>     // ...
> }
> 
> unsigned int nf_nat_ipv4_fn(...)
> {
>     // ...
>     switch (ctinfo) {
>     case IP_CT_RELATED:
>     case IP_CT_RELATED_REPLY:
>         if (ip_hdr(skb)->protocol == IPPROTO_ICMP) {
>             if (!nf_nat_icmp_reply_translation(skb, ct, ctinfo, ops->hooknum))
>                 return NF_DROP;   //  <------- DROP
>             else
>                 return NF_ACCEPT;
>         }
>         // ...
>     }
>     // ...
> }
> 
> 
> Because PREROUTING and OUTPUT have the same maniptype
> (NF_NAT_MANIP_DST), ct->status can't equal IPS_NAT_DONE_MASK.

Yes.
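
To make the arithmetic concrete, here is a small userspace sketch of why
the status stays at 256. The bit values are local copies of what I believe
the conntrack status enum uses (they match the 256 and 384 quoted above).
The HOOK_* names and the helpers done_bit_for_hook() and replay_trace()
are made up for this illustration and are not kernel functions.

```c
#include <assert.h>

/* Local copies for illustration; not taken from kernel headers at build time. */
enum { HOOK_PRE_ROUTING = 0, HOOK_LOCAL_IN = 1,
       HOOK_LOCAL_OUT = 3, HOOK_POST_ROUTING = 4 };

#define IPS_SRC_NAT_DONE  (1UL << 7)                              /* 128 */
#define IPS_DST_NAT_DONE  (1UL << 8)                              /* 256 */
#define IPS_NAT_DONE_MASK (IPS_DST_NAT_DONE | IPS_SRC_NAT_DONE)   /* 384 */

/* Hypothetical helper: the "done" bit a NAT hook would set in ct->status.
 * POSTROUTING and INPUT manipulate the source, PREROUTING and OUTPUT the
 * destination. */
static unsigned long done_bit_for_hook(int hooknum)
{
    if (hooknum == HOOK_POST_ROUTING || hooknum == HOOK_LOCAL_IN)
        return IPS_SRC_NAT_DONE;
    return IPS_DST_NAT_DONE;
}

/* Replays the reported trace: the original packet has only passed
 * PREROUTING when the redirect is generated and reaches OUTPUT. */
static void replay_trace(void)
{
    unsigned long status = 0;

    status |= done_bit_for_hook(HOOK_PRE_ROUTING);
    assert(status == 256);

    /* OUTPUT maps to the same bit, so it could not add anything new. */
    status |= done_bit_for_hook(HOOK_LOCAL_OUT);
    assert(status == 256);

    /* The source-side bit is still missing, so the mask test fails. */
    assert((status & IPS_NAT_DONE_MASK) != IPS_NAT_DONE_MASK);
}
```

Only a source-manip hook (POSTROUTING or INPUT) could contribute the
missing 128, and the original packet has not reached one at that point.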

> I've tested this on multiple kernel versions, including 3.10, 5.15,
> and 6.5, and I can reproduce the issue.
> 
> Currently, I'm unsure if this is a bug or an intentional feature. If
> it's a feature, where can I find reference documentation that explains
> this behavior?

For some reason the comment that explains this got dropped when
code was moved around.  The original comment was:

/* Redirects on non-null nats must be dropped, else they'll
   start talking to each other without our translation, and be
   confused... --RR */

... which is exactly what this does, it drops the redirect
if it can't verify that no nat is in place in either direction.
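
As a rough sketch of that decision, assuming the redirect check has two
steps: first both NAT "done" bits must be set, then neither actual NAT bit
may be set (i.e. only null bindings exist). The first step is the code
quoted above. The second step is my assumption about the elided part of
the function, based on the comment. redirect_may_pass() and
check_examples() are hypothetical names, and the bit values are local
copies for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Local copies for illustration only. */
#define IPS_SRC_NAT       (1UL << 4)   /* source actually rewritten */
#define IPS_DST_NAT       (1UL << 5)   /* destination actually rewritten */
#define IPS_SRC_NAT_DONE  (1UL << 7)   /* source-side NAT setup finished */
#define IPS_DST_NAT_DONE  (1UL << 8)   /* destination-side NAT setup finished */
#define IPS_NAT_MASK      (IPS_DST_NAT | IPS_SRC_NAT)
#define IPS_NAT_DONE_MASK (IPS_DST_NAT_DONE | IPS_SRC_NAT_DONE)

/* Hypothetical helper: true if a redirect related to a connection with
 * this status may pass, false if it has to be dropped. */
static bool redirect_may_pass(unsigned long status)
{
    /* NAT not set up in both directions: cannot rule out NAT, so drop. */
    if ((status & IPS_NAT_DONE_MASK) != IPS_NAT_DONE_MASK)
        return false;

    /* Assumed second step: a non-null NAT is in place, so drop. */
    if (status & IPS_NAT_MASK)
        return false;

    return true;
}

static void check_examples(void)
{
    assert(!redirect_may_pass(256));                 /* the reported case */
    assert(redirect_may_pass(384));                  /* null bindings both ways */
    assert(!redirect_may_pass(384 | IPS_DST_NAT));   /* real DNAT in place */
}
```

With status 256 the first test already fails, which is the drop seen in
the report.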


