On 30.08.2011 17:27, Florian Westphal wrote:
> Patrick McHardy <kaber@xxxxxxxxx> wrote:
>> Yes, when using your patch, otherwise (when handling this case in
>> nf_nat_setup_info()) we might invoke it multiple times simultaneously
>> though.
>>
>>> In case nf_ct_ext_add() we already return NF_ACCEPT, so I think this
>>> part is OK.
>>>
>>>> I also fear this is not
>>>> going to be the only problem caused by breaking the "unconfirmed means
>>>> non-shared nfct" assumption.
>>>
>>> Agreed. Perhaps we can solve the module dependency issue of the "unshare"
>>> approach. In fact, if invalid state for the clones would be acceptable
>>> then the dependency should go away; AFAICS nf_conntrack_untracked is the
>>> only nf-related symbol required by br_netfilter.o not in netfilter/core.c.
>>
>> I don't think the clones should have invalid state, even untracked is
>> very questionable since all packets should have NAT applied to them in
>> the same way, connmarks might be used etc.
>
> Right, but this is probably only going to be fixable in a "try to do the
> best without crashing", because even without userspace queueing
> there are cases where this is not deterministic:
>
> -m physdev --physdev-out eth1 -j SNAT ...
> -m physdev --physdev-out eth2 -j SNAT ...
>
> ... will match whatever bridge port the packet will be sent out on
> first.

Yes, but setting up the rules properly is the responsibility of the user.
Usually you'd just have a regular NAT rule, in which case you normally
want flooded packets to be treated similarly.

> Also, before 87557c18ac36241b596984589a0889c5c4bf916c
> forward ran after pass_frame_up() in which case post_routing is
> not involved.
>
> I am afraid we might first need to find out what should happen in
> the "delivered locally and forwarded" case before we can figure
> out what a sane fix might look like.

I don't really see the problem, the user has to set up his rules properly.

>> We probably need to restore the above mentioned assumption somehow.
>> One way would be to serialize reinjection of packets belonging to
>> unconfirmed conntracks in nf_reinject or the queueing modules. Conntrack
>> related stuff doesn't really belong there, but it seems like the easiest
>> and safest fix to me.
>
> Only serializing reinject may not be enough, since some packets might not be
> queued (e.g. when queueing only in forward, or only when dealing with
> a particular bridge port); in which case we'd still race.

True, that case has also always been broken. I don't see a way to properly
fix this right now, need to think about it some more.