Pablo Neira Ayuso wrote on 4/05/2014 1:39:
I think we may still see IP packets larger than the MTU in that path. It would be a rare case, since it requires the bridge to have a different (smaller) MTU than the sender, but it is still possible. The is_skb_forwardable() check in the current tree snapshot comes just a bit later, so if we remove that skb->nfct check, the bridge will fragment large packets. In general, I believe bridges should silently drop packets that are larger than the MTU and should perform no fragmentation handling: no gathering and no [re]fragmentation. They are transparent devices that operate at layer 2.
I agree. I don't think it's a good idea to commit code that would do fragmentation of IP packets that weren't defragmented first.
The conntrack case is a special case that forces us to enable fragmentation handling, since we get a sort of bridge that inspects layer 3 and 4 packet information. So we have, let's call it, a mutant bridge. We also have the tproxy target and the socket match, which seem to require defragmentation as well; I'm afraid the skb->nfct check will not help for those cases. I think we need some counter to know how many clients we have that require the gathering + fragmentation code, so that if we have at least one, we enable it.
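The counter idea could be sketched roughly as follows. This is a userspace illustration only: br_defrag_users, br_defrag_get(), br_defrag_put() and br_defrag_enabled() are hypothetical names that do not exist in the tree, and a real kernel version would presumably need an atomic counter or similar rather than a plain integer.

```c
#include <stdbool.h>

/* Hypothetical: number of registered users (conntrack, tproxy, socket
 * match, ...) that need the gathering + fragmentation code in the bridge. */
static unsigned int br_defrag_users;

/* Called when a user that needs reassembled packets is registered. */
static void br_defrag_get(void)
{
	br_defrag_users++;
}

/* Called when such a user goes away; never drops below zero. */
static void br_defrag_put(void)
{
	if (br_defrag_users > 0)
		br_defrag_users--;
}

/* The bridge path would gather and refragment only while this is true. */
static bool br_defrag_enabled(void)
{
	return br_defrag_users > 0;
}
```

With no registered users the bridge would stay a plain layer 2 device and simply drop oversized packets.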
If I understood Vasily correctly, in his setup ip_defrag is being called from code that isn't connection tracking. Glancing at the code, at least IP Virtual Server (IPVS) and the code that handles the Router Alert IP option also call ip_defrag.
Isn't there an easy way to see that the skb contains a defragmented IP packet? If there were, then it seems that replacing the "skb->nfct != NULL" check with "is_defragmented(skb)" would suffice, no? I see no reason to artificially restrict defrag/refrag to connection tracking.
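To make the question concrete, here is a rough userspace sketch of the decision I have in mind. Everything in it is an assumption for illustration: struct mock_skb stands in for the real sk_buff, the "defragmented" flag is hypothetical (I don't know whether such a marker exists today), and the length check is a simplification of what is_skb_forwardable() does.

```c
#include <stdbool.h>
#include <stddef.h>

/* Mock of the few skb fields this sketch cares about. */
struct mock_skb {
	size_t len;		/* packet length in bytes */
	void *nfct;		/* conntrack attachment, NULL if none */
	bool defragmented;	/* hypothetical: set by whoever reassembled it */
};

enum br_verdict { BR_FORWARD, BR_REFRAGMENT, BR_DROP };

/* Hypothetical helper: was this packet reassembled from fragments? */
static bool is_defragmented(const struct mock_skb *skb)
{
	return skb->defragmented;
}

/* Refragment only what was defragmented earlier, whoever did it;
 * any other oversized packet is silently dropped. */
static enum br_verdict br_decide(const struct mock_skb *skb, size_t mtu)
{
	if (skb->len <= mtu)
		return BR_FORWARD;
	if (is_defragmented(skb))
		return BR_REFRAGMENT;
	return BR_DROP;
}
```

Note that a packet reassembled by something other than conntrack (nfct is NULL) would still be refragmented here, while an oversized packet that was never defragmented would be dropped.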
cheers, Bart