Re: [RFC] net: ipv4 -- Introduce ifa limit per net

Cyrill Gorcunov <gorcunov@xxxxxxxxx> · Wed, 9 Mar 2016 23:57:47 +0300

On Wed, Mar 09, 2016 at 03:47:25PM -0500, David Miller wrote:
> From: Cyrill Gorcunov <gorcunov@xxxxxxxxx>
> Date: Wed, 9 Mar 2016 23:41:58 +0300
> 
> > On Wed, Mar 09, 2016 at 03:27:30PM -0500, David Miller wrote:
> >> > 
> >> > Yes. I can drop it off for a while and run tests without it,
> >> > then turn it back and try again. Would you like to see such
> >> > numbers?
> >> 
> >> That would be very helpful, yes.
> > 
> > Just sent out. Take a look please. Indeed it sits inside get_next_corpse
> > a lot. And now I think I've to figure out where we can optimize it.
> > Continue tomorrow.
> 
> The problem is that the masquerading code flushes the entire conntrack
> table once for _every_ address removed.
> 
> The code path is:
> 
> masq_device_event()
> 	if (event == NETDEV_DOWN) {
> 		/* Device was downed.  Search entire table for
> 		 * conntracks which were associated with that device,
> 		 * and forget them.
> 		 */
> 		NF_CT_ASSERT(dev->ifindex != 0);
> 
> 		nf_ct_iterate_cleanup(net, device_cmp,
> 				      (void *)(long)dev->ifindex, 0, 0);
> 
> So if you have a million IP addresses, this flush happens a million times
> on inetdev destroy.
> 
> Part of the problem is that we emit NETDEV_DOWN inetdev notifiers per
> address removed, instead of once per inetdev destroy.
> 
> Maybe if we put some boolean state into the inetdev, we could make sure
> we did this flush only once while inetdev->dead = 1.

Aha! So with your patch __inet_del_ifa bypasses the first blocking_notifier_call_chain:

__inet_del_ifa
	...
	if (in_dev->dead)
		goto no_promotions;

	// First call to NETDEV_DOWN
...
no_promotions:
	rtmsg_ifa(RTM_DELADDR, ifa1, nlh, portid);
	blocking_notifier_call_chain(&inetaddr_chain, NETDEV_DOWN, ifa1);

and here we still raise NETDEV_DOWN, which then hits masq_device_event
and goes on into the conntrack code.