Tue, Nov 05, 2013 at 07:16:33PM CET, kaber@xxxxxxxxx wrote:
>On Tue, Nov 05, 2013 at 04:01:15PM +0100, Jiri Pirko wrote:
>> Tue, Nov 05, 2013 at 02:41:19PM CET, kaber@xxxxxxxxx wrote:
>> >On Tue, Nov 05, 2013 at 02:32:05PM +0100, Florian Westphal wrote:
>> >> Jiri Pirko <jiri@xxxxxxxxxxx> wrote:
>> >> > This patch fixes, for example, the following situation:
>> >> > On HOSTA do:
>> >> > ip6tables -I INPUT -p icmpv6 -j DROP
>> >> > ip6tables -I INPUT -p icmpv6 -m icmp6 --icmpv6-type 128 -j ACCEPT
>> >>
>> >> untested:
>> >>
>> >> -A INPUT -p icmpv6 -m icmp6 --icmpv6-type 128 -j ACCEPT
>> >> -A INPUT -p icmpv6 -m conntrack --ctstatus CONFIRMED -j ACCEPT
>> >> -A INPUT -p icmpv6 -j DROP
>> >>
>> >> > and on HOSTB you do:
>> >> > ping6 HOSTA -s2000 (MTU is 1500)
>> >> >
>> >> > Incoming echo requests will be filtered out on HOSTA. This issue does
>> >> > not occur with packets smaller than the MTU (where fragmentation does
>> >> > not happen).
>> >>
>> >> Patrick, any reason not to kill the special-casing (ct has assigned helper or
>> >> unconfirmed conntrack) in __ipv6_conntrack_in() ?
>> >>
>> >> This should make ipv6 frag behaviour consistent; right now it's rather
>> >> confusing from a ruleset point of view, especially since the first packet
>> >> of a connection is always seen as reassembled.
>> >>
>> >> So with Jiri's rules
>> >>
>> >> -A INPUT -p icmpv6 -m icmp6 --icmpv6-type 128 -j ACCEPT
>> >> -A INPUT -p icmpv6 -j DROP
>> >>
>> >> ping6 -s $bignum works for the first packet but not for subsequent ones,
>> >> which is quite irritating.
>> >
>> >Well, the reason was to avoid unnecessary work doing refragmentation
>> >unless really required. I know it's rather complicated, but IPv6 has
>> >always required treating fragments manually or using conntrack state.
>> >
>> >I'm not objecting to changing this, but the patches as they are are
>> >not the way to go. First, moving nfct_frag to struct sk_buff seems
>>
>> I'm a bit lost. What "nfct_frag" are you referring to here?
>
>I meant nfct_reasm of course.

The patch is not moving this to struct sk_buff. It is already there.

>
>> >like a real waste of space for this quite rare case. Also, we can't
>> >just use the reassembled packet in ip6tables; when modifying it we
>> >will still output the unchanged fragments. And last of all, we'll be
>> >executing the rules on the reassembled packet multiple times, once
>> >for each fragment.
>>
>> The reassembled skb would only be used for matching, where no changes take
>> place.
>
>That still doesn't work, our matches are not purely passive.
>
>> And even so, the matching is now done for each fragment skb anyway. The
>> change is only to do it on a different skb. I see no performance or any
>> other problem in that.
>
>Accounting, quota, statistic, limit, ... come to mind. Basically any
>match that keeps state.

Ok. Makes sense.

>
>> >So if someone wants to change this, simply *only* pass the reassembled
>> >packet through the netfilter hooks and drop the fragments, as in IPv4.
>>
>> This is unfortunately not possible because in the forwarding use case, the
>> fragments have to be sent out as they come in.
>
>No, the IPv6 NAT patches fixed that, we still do proper refragmentation
>and we still respect the original fragment sizes, thus are not responsible
>for potentially exceeding the PMTU on the following path.

Ok. So the plan is to remove the net/ipv6/netfilter/nf_conntrack_reasm.c
code entirely and use the net/ipv6/reassembly.c code directly from
nf_defrag_ipv6. This would result in code very similar to what IPv4
currently has.
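
To illustrate the point about stateful matches for anyone reading the
archive: below is a rough Python sketch, not kernel code. QuotaMatch and
bytes_accounted are hypothetical names, and the sizes (a 2048-byte
reassembled packet arriving as 2 fragments) are illustrative assumptions.
It only shows why running a byte-counting match on the reassembled packet
once per fragment over-accounts, compared with running it once.

```python
from dataclasses import dataclass


@dataclass
class QuotaMatch:
    """Toy stand-in for a stateful match such as quota (hypothetical)."""
    remaining: int

    def matches(self, packet_len: int) -> bool:
        # Each evaluation consumes quota, so the match is not purely passive.
        if self.remaining < packet_len:
            return False
        self.remaining -= packet_len
        return True


def bytes_accounted(n_fragments: int, reassembled_len: int,
                    once_per_fragment: bool) -> int:
    """Return how many bytes the toy quota charges for one datagram."""
    quota = QuotaMatch(remaining=1_000_000)
    start = quota.remaining
    # Either evaluate the rule on the reassembled packet for every fragment,
    # or evaluate it a single time and drop the fragments (the IPv4 model).
    evaluations = n_fragments if once_per_fragment else 1
    for _ in range(evaluations):
        quota.matches(reassembled_len)
    return start - quota.remaining


if __name__ == "__main__":
    print(bytes_accounted(2, 2048, once_per_fragment=True))
    print(bytes_accounted(2, 2048, once_per_fragment=False))
```

With these assumed numbers the per-fragment variant charges the datagram
twice, which is the accounting problem raised above for quota, limit and
similar matches.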