On Wed, Jan 20, 2016 at 11:16:43AM +0100, Florian Westphal wrote:
> Ulrich reports soft lockup with following (shortened) callchain:
>
> NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s!
> __netif_receive_skb_core+0x6e4/0x774
> process_backlog+0x94/0x160
> net_rx_action+0x88/0x178
> call_do_softirq+0x24/0x3c
> do_softirq+0x54/0x6c
> __local_bh_enable_ip+0x7c/0xbc
> nf_ct_iterate_cleanup+0x11c/0x22c [nf_conntrack]
> masq_inet_event+0x20/0x30 [nf_nat_masquerade_ipv6]
> atomic_notifier_call_chain+0x1c/0x2c
> ipv6_del_addr+0x1bc/0x220 [ipv6]
>
> Problem is that nf_ct_iterate_cleanup can run for a very long time
> since it can be interrupted by softirq processing.
> Moreover, atomic_notifier_call_chain runs with rcu readlock held.
>
> So lets call cond_resched() in nf_ct_iterate_cleanup and defer
> the call to a work queue for the atomic_notifier_call_chain case.
>
> We also need another cond_resched in get_next_corpse, since we
> have to deal with iter() always returning false, in that case
> get_next_corpse will walk entire conntrack table.

Applied, thanks.

> Reported-by: Ulrich Weber <uw@xxxxxxxxx>
> Tested-by: Ulrich Weber <uw@xxxxxxxxx>
> Signed-off-by: Florian Westphal <fw@xxxxxxxxx>
> ---
> I had a look at converting the ipv6 notifier to a blocking one
> but I found this too difficult (RTNL held? How to defer notifier calls
> from packet path)? Just doing it for masquerade is a lot simpler:
> - we only care about NETDEV_DOWN, so no extra work needed in most cases
> - can just ignore the notification if too much work is already queued

It is probably worth adding a deferred notifier chain variant that allows
blocking, i.e. moving this code to core infrastructure. Just an idea.
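
For readers following the thread, the approach in the quoted patch notes (enqueue from the atomic notifier, ignore the event if too much work is already queued, then do the table walk later with a reschedule point per entry) can be modelled in plain userspace C. This is only a rough sketch and not the applied patch: `MAX_PENDING`, `masq_defer()`, `iterate_cleanup()`, `run_deferred()` and the flat `int` table are all hypothetical stand-ins, and the `resched` callback merely marks where the kernel code would call cond_resched().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING 16 /* hypothetical cap on queued work items */

/* One deferred cleanup request (hypothetical layout). */
struct deferred_work {
    int ifindex;
};

/* Fixed-size ring buffer standing in for a kernel work queue. */
struct work_queue {
    struct deferred_work items[MAX_PENDING];
    size_t head;           /* index of the oldest queued item */
    size_t count;          /* number of queued items */
    unsigned long dropped; /* notifications ignored because the queue was full */
};

/* Model of the atomic notifier path: never blocks, only enqueues.
 * Returns false when the notification is ignored due to a full queue. */
static bool masq_defer(struct work_queue *q, int ifindex)
{
    if (q->count >= MAX_PENDING) {
        q->dropped++;
        return false;
    }
    size_t tail = (q->head + q->count) % MAX_PENDING;
    q->items[tail].ifindex = ifindex;
    q->count++;
    return true;
}

typedef void (*resched_fn)(void);

/* Model of the table walk: marks matching entries as dead (-1) and
 * offers a reschedule point after every entry, match or not. */
static size_t iterate_cleanup(int *table, size_t n, int ifindex,
                              resched_fn resched)
{
    size_t killed = 0;
    for (size_t i = 0; i < n; i++) {
        if (table[i] == ifindex) {
            table[i] = -1;
            killed++;
        }
        if (resched)
            resched(); /* stand-in for cond_resched() */
    }
    return killed;
}

/* Model of the worker: runs later, in a context that may sleep,
 * and drains every queued request. Returns total entries removed. */
static size_t run_deferred(struct work_queue *q, int *table, size_t n,
                           resched_fn resched)
{
    size_t killed = 0;
    while (q->count > 0) {
        assert(q->head < MAX_PENDING);
        struct deferred_work w = q->items[q->head];
        q->head = (q->head + 1) % MAX_PENDING;
        q->count--;
        killed += iterate_cleanup(table, n, w.ifindex, resched);
    }
    return killed;
}
```

The reschedule point sits outside the match test on purpose: as the quoted notes say for get_next_corpse, the walk may find nothing to remove and still cover the whole table.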