I think we have a potential fix for this issue.
Martin and I found that when addrconf_dst_alloc() creates a rt6, it is
possible that rt6->dst.dev points to the loopback device while
rt6->rt6i_idev->dev points to a real device.
When the real device goes down, the current fib6 cleanup code only
checks rt6->dst.dev and assumes rt6->rt6i_idev->dev is the same.
That leaves an unreleased refcnt on the real device if rt6->dst.dev
points to the loopback dev.

Martin has tested the attached potential fix and confirmed that it
fixes his issue.
John,
It would be great if you could also give it a try and see if it fixes
the issue on your side before I submit an official patch.

Thanks very much for the help from everyone.
Wei

On Fri, Aug 11, 2017 at 10:25 AM, Wei Wang <weiwan@xxxxxxxxxx> wrote:
> On Fri, Aug 11, 2017 at 9:48 AM, Cong Wang <xiyou.wangcong@xxxxxxxxx> wrote:
>> Hi,
>>
>> On Thu, Aug 10, 2017 at 11:12 AM, John Stultz <john.stultz@xxxxxxxxxx> wrote:
>>> On Wed, Aug 9, 2017 at 10:41 PM, Wei Wang <weiwan@xxxxxxxxxx> wrote:
>>>> Hi John,
>>>>
>>>> Is it possible to try the attached patch?
>>>
>>> Thanks so much for the quick turn around!
>>>
>>> So I dropped all the reverts you suggested, and applied this one
>>> against 4.13-rc4, but I'm still seeing the problematic behavior.
>>
>> Does the following one-line fix make a difference?
>>
>> diff --git a/net/ipv6/route.c b/net/ipv6/route.c
>> index a640fbcba15d..c145a35763a0 100644
>> --- a/net/ipv6/route.c
>> +++ b/net/ipv6/route.c
>> @@ -141,7 +141,7 @@ static void rt6_uncached_list_del(struct rt6_info *rt)
>>                 struct uncached_list *ul = rt->rt6i_uncached_list;
>>
>>                 spin_lock_bh(&ul->lock);
>> -               list_del(&rt->rt6i_uncached);
>> +               list_del_init(&rt->rt6i_uncached);
>>                 spin_unlock_bh(&ul->lock);
>>         }
>>  }
>
>
> Thanks a lot Cong for proposing this fix.
>
> For the last few days, John has been helping me running debug image
> and we found out that the leaked dst is probably in addrconf.c.
> Martin and I are looking through the code and trying to put more debugs.
>
> John,
>
> If after Cong's fix, the issue still happens, could you help try the
> patch attached and collect all logs when you try the reproduce the
> issue? It would be great to have logs for both success case and the
> failure case.
>
> Thanks so much for your help.
>
> Wei
From 2d8861808c2029013f6b6e86120ba6902329145b Mon Sep 17 00:00:00 2001
From: Wei Wang <weiwan@xxxxxxxxxx>
Date: Fri, 11 Aug 2017 16:36:04 -0700
Subject: [PATCH 1/2] potential fix for unregister_netdevice()

Change-Id: I5d5f6f7a7ad0f5dd769f33487db17ff2570d52ea
---
 net/ipv6/route.c | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 4d30c96a819d..105922903932 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -417,14 +417,12 @@ static void ip6_dst_ifdown(struct dst_entry *dst, struct net_device *dev,
 	struct net_device *loopback_dev =
 		dev_net(dev)->loopback_dev;
 
-	if (dev != loopback_dev) {
-		if (idev && idev->dev == dev) {
-			struct inet6_dev *loopback_idev =
-				in6_dev_get(loopback_dev);
-			if (loopback_idev) {
-				rt->rt6i_idev = loopback_idev;
-				in6_dev_put(idev);
-			}
+	if (idev && idev->dev != loopback_dev) {
+		struct inet6_dev *loopback_idev =
+			in6_dev_get(loopback_dev);
+		if (loopback_idev) {
+			rt->rt6i_idev = loopback_idev;
+			in6_dev_put(idev);
 		}
 	}
 }
@@ -2789,7 +2787,8 @@ static int fib6_ifdown(struct rt6_info *rt, void *arg)
 	const struct arg_dev_net *adn = arg;
 	const struct net_device *dev = adn->dev;
 
-	if ((rt->dst.dev == dev || !dev) &&
+	if ((rt->dst.dev == dev || !dev ||
+	     rt->rt6i_idev->dev == dev) &&
 	    rt != adn->net->ipv6.ip6_null_entry &&
 	    (rt->rt6i_nsiblings == 0 ||
 	     (dev && netdev_unregistering(dev)) ||
-- 
2.14.0.434.g98096fd7a8-goog
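
For readers following the thread, the device test changed in fib6_ifdown() can be illustrated with a small user-space model. This is only a sketch under stated assumptions: the structs below are simplified stand-ins that are not the kernel definitions, the names matches_before(), matches_after() and check_model() are hypothetical, and the remaining parts of the kernel condition (ip6_null_entry, rt6i_nsiblings, netdev_unregistering()) are left out. It models a route whose dst.dev is loopback while rt6i_idev->dev is a real device, as described in the mail above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (assumption: only the
 * fields relevant to the device comparison are modeled). */
struct net_device { const char *name; };
struct inet6_dev  { struct net_device *dev; };
struct rt6_info {
	struct net_device *dst_dev;   /* models rt->dst.dev */
	struct inet6_dev  *rt6i_idev; /* models rt->rt6i_idev */
};

/* Device part of the fib6_ifdown() test before the patch. */
bool matches_before(const struct rt6_info *rt, const struct net_device *dev)
{
	return rt->dst_dev == dev || !dev;
}

/* Device part of the test with the added rt6i_idev->dev comparison. */
bool matches_after(const struct rt6_info *rt, const struct net_device *dev)
{
	return rt->dst_dev == dev || !dev || rt->rt6i_idev->dev == dev;
}

/* Hypothetical scenario: dst.dev is "lo", rt6i_idev->dev is "eth0". */
void check_model(void)
{
	struct net_device lo = { "lo" }, eth0 = { "eth0" };
	struct inet6_dev eth0_idev = { &eth0 };
	struct rt6_info rt = { &lo, &eth0_idev };

	/* eth0 goes down: the old test skips the route, so the idev
	 * reference on eth0 would stay held in the real code. */
	assert(!matches_before(&rt, &eth0));
	/* The new test selects the route for cleanup. */
	assert(matches_after(&rt, &eth0));
	/* Both tests still match on dst.dev itself. */
	assert(matches_before(&rt, &lo));
	assert(matches_after(&rt, &lo));
}
```

Calling check_model() from a test program passes all four assertions, which matches the description above: only the extended test picks up a route whose idev belongs to the device going down.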