Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> wrote:
> 70e9942f17a6 ("netfilter: nf_conntrack: make event callback registration
> per-netns") introduced a per-netns callback for events to workaround a
> crash when delivering conntrack events on a stale per-netns nfnetlink
> kernel socket.
>
> This patch adds a new flag to the nf_ct_iter_data object to skip event
> delivery from the netns cleanup path to address this issue.
>
> Signed-off-by: Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx>
> ---
> compiled tested only.
> @Florian: Maybe this helps to remove the per-netns nf_conntrack_event_cb
> callback without having to update nfnetlink to deal with this corner case?

Old crash recipe is (from your changelog of the 'make it pernet' change):

 0) make sure nf_conntrack_netlink and nf_conntrack_ipv4 are loaded.
 1) container is started.
 2) connect to it via lxc-console.
 3) generate some traffic with the container to create some conntrack
    entries in its table.
 4) stop the container: you hit one oops because the conntrack table
    cleanup tries to report the destroy event to user-space but the
    per-netns nfnetlink socket has already gone (as the nfnetlink
    socket is per-netns but event callback registration is global).

Pernet exit handlers are called in reverse order of the module load
order, so normally this means:

 ctnetlink exit handlers
 nfnetlink_net_exit_batch, removes nfnl socket
 nf_conntrack_pernet_exit(), removes entries

Because the callback is pernet atm, this prevents a crash after the
nfnetlink sk has been closed. If that's no longer the case, we need
some other way to suppress calls with a stale nfnl sk.

With the proposed patch series it's still possible that we end up in
nfnetlink via the ctnl event handler. E.g. the gc worker could evict
an entry at the right time, or some kfree_skb call ends up dropping
the last reference.

If you really dislike the nfnl changes I will respin without this and
will keep the pernet ctnetlink callback.
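
To make the ordering argument concrete, here is a toy userspace model of the teardown sequence. It is only a sketch, not kernel code: every toy_* name and struct field is made up for illustration, and it models only the pernet exit path. It does not cover the gc worker or the kfree_skb cases mentioned above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model of one netns; all names here are hypothetical. */
struct toy_net {
	bool pernet_cb;        /* true: event callback is registered per netns */
	bool cb_registered;    /* stands in for the ctnetlink event callback */
	bool nfnl_sk_alive;    /* stands in for the per-netns nfnetlink socket */
	int  entries;          /* conntrack entries still in the table */
	int  stale_deliveries; /* destroy events sent after the socket is gone */
};

typedef void (*toy_exit_fn)(struct toy_net *);

/* Report a destroy event; count it as stale if the socket is closed. */
static void toy_deliver_destroy(struct toy_net *net)
{
	if (!net->cb_registered)
		return;                  /* no callback: event is dropped */
	if (!net->nfnl_sk_alive)
		net->stale_deliveries++; /* the real kernel would oops here */
}

/* A pernet callback is removed with the netns; a global one stays. */
static void toy_ctnetlink_exit(struct toy_net *net)
{
	if (net->pernet_cb)
		net->cb_registered = false;
}

static void toy_nfnetlink_exit(struct toy_net *net)
{
	net->nfnl_sk_alive = false;
}

static void toy_conntrack_exit(struct toy_net *net)
{
	while (net->entries > 0) {
		net->entries--;
		toy_deliver_destroy(net);
	}
}

/* Run the exit handlers in reverse registration order and return the
 * number of events delivered on the stale socket. */
int toy_netns_teardown(bool pernet_cb, int entries)
{
	/* Registration (module load) order. */
	toy_exit_fn handlers[] = {
		toy_conntrack_exit,
		toy_nfnetlink_exit,
		toy_ctnetlink_exit,
	};
	struct toy_net net = {
		.pernet_cb = pernet_cb,
		.cb_registered = true,
		.nfnl_sk_alive = true,
		.entries = entries,
	};
	size_t i = sizeof(handlers) / sizeof(handlers[0]);

	while (i-- > 0)
		handlers[i](&net);

	return net.stale_deliveries;
}
```

In this model, toy_netns_teardown(true, 3) reports no stale deliveries, while toy_netns_teardown(false, 3) reports one per remaining entry, which is the global-callback situation from the old crash recipe.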