Re: Repeating "unregister_netdevice: waiting for lo to become free" caused by upstream 76da0704507bb ("ipv6: only call ip6_route_dev_notify() once for NETDEV_UNREGISTER")

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 25.04.2018 16:30, Konstantin Khlebnikov wrote:
On 25.04.2018 17:16, Rafał Miłecki wrote:
On 23.04.2018 15:08, Rafał Miłecki wrote:
I've just updated my kernel 4.4.x and noticed a regression. Bisecting
pointed me to the commit 2417da3f4d6bc ("ipv6: only call
ip6_route_dev_notify() once for NETDEV_UNREGISTER") [0] which is
backport of upstream 76da0704507bb. That backported commit has
appeared in a 4.4.103.

I use OpenWrt/LEDE [1] distribution and LXC [2] 1.1.5. After stopping
a container I start getting these messages:
[  229.419188] unregister_netdevice: waiting for lo to become free. Usage count = 1
[  239.660408] unregister_netdevice: waiting for lo to become free. Usage count = 1
[  249.839189] unregister_netdevice: waiting for lo to become free. Usage count = 1
(...)

Trying to start LXC nevertheless results in lxc-start command hang
around network configuration. Trying to query LXC state afterwards
results in a lxc-info command hang too.

I tried Googling for this issue and found similar reports:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1729637
https://github.com/fnproject/fn/issues/686
https://lime-technology.com/forums/topic/66863-kernelunregister_netdevice-waiting-for-lo-to-become-free-usage-count-1/
all of them related to the Docker, which is probably a similar use
case to the LXC.

I couldn't find any reference to commit 76da0704507bb that could
suggest fixing the problem I'm seeing.

Does anyone have an idea what is the issue I'm seeing about? Or even
better, how to fix it? Can I provide any additional info that would
help?


[0] https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit/?h=linux-4.4.y&id=2417da3f4d6bc4fc6c77f613f0e2264090892aa5
[1] https://openwrt.org/
[2] https://linuxcontainers.org/

Today I tried 4.14.34 to see if that helps. Unfortunately it doesn't. I
still experience the same problem.

 From reading various reports regarding that "unregister_netdevice:
waiting for lo to become free" message it appears the problem is caused
by a leaking dst refcnt somewhere in the kernel code.

I found links to few commit fixing leaks at various places:
4a31a6b19f9dd ("sctp: fix dst refcnt leak in sctp_v4_get_dst")
957d761cf91cd ("sctp: fix dst refcnt leak in sctp_v6_get_dst()")
4ee806d51176b ("net: tcp: close sock if net namespace is exiting")
d747a7a51b009 ("tcp: reset sk_rx_dst in tcp_disconnect()")
751eb6b6042a5 ("ipv6: addrconf: fix dev refcont leak when DAD failed")

All above patches are present in the linux-v4.4.y and are part of kernel
4.4.124 I use. So it seems I'm facing yet another dst refcnt leak.

Could commit 2417da3f4d6bc ("ipv6: only call ip6_route_dev_notify() once
for NETDEV_UNREGISTER") introduce a new dst refcnt leak? Or does it only
expost existing one?

Mathias Tillman reported this as "4.4.103 linux kernel regression".
Last message in that thread (which I couldn't find in mailing list archives) had:
| As it turns out, it's due to a patch in the Turris Omnia/OpenWRT code that adds a in6_dev_get call without calling in6_dev_put.

Wow, this is very helpful, thank you!

Somehow I didn't even think about OpenWrt downstream patches. Too bad
this wasn't reported to the OpenWrt community, I spent 2 days on this.
There is indeed:
target/linux/generic/patches-4.4/670-ipv6-allow-rejecting-with-source-address-failed-policy.patch
[PATCH 1/2] ipv6: allow rejecting with "source address failed policy"

I'll move this issue discussion to the OpenWrt/LEDE now, I hope we can
sort it out.



[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux