> On Mon, Jul 17, 2023 at 9:27 AM Marc Kleine-Budde <mkl@xxxxxxxxxxxxxx> wrote: >> >> On 11.07.2023 09:17:37, Ziyang Xuan wrote: >>> Got kmemleak errors with the following ltp can_filter testcase: >>> >>> for ((i=1; i<=100; i++)) >>> do >>> ./can_filter & >>> sleep 0.1 >>> done >>> >>> ============================================================== >>> [<00000000db4a4943>] can_rx_register+0x147/0x360 [can] >>> [<00000000a289549d>] raw_setsockopt+0x5ef/0x853 [can_raw] >>> [<000000006d3d9ebd>] __sys_setsockopt+0x173/0x2c0 >>> [<00000000407dbfec>] __x64_sys_setsockopt+0x61/0x70 >>> [<00000000fd468496>] do_syscall_64+0x33/0x40 >>> [<00000000b7e47d51>] entry_SYSCALL_64_after_hwframe+0x61/0xc6 >>> >>> It's a bug in the concurrent scenario of unregister_netdevice_many() >>> and raw_release() as following: >>> >>> cpu0 cpu1 >>> unregister_netdevice_many(can_dev) >>> unlist_netdevice(can_dev) // dev_get_by_index() return NULL after this >>> net_set_todo(can_dev) >>> raw_release(can_socket) >>> dev = dev_get_by_index(, ro->ifindex); // dev == NULL >>> if (dev) { // receivers in dev_rcv_lists not free because dev is NULL >>> raw_disable_allfilters(, dev, ); >>> dev_put(dev); >>> } >>> ... >>> ro->bound = 0; >>> ... >>> >>> call_netdevice_notifiers(NETDEV_UNREGISTER, ) >>> raw_notify(, NETDEV_UNREGISTER, ) >>> if (ro->bound) // invalid because ro->bound has been set 0 >>> raw_disable_allfilters(, dev, ); // receivers in dev_rcv_lists will never be freed >>> >>> Add a net_device pointer member in struct raw_sock to record bound can_dev, >>> and use rtnl_lock to serialize raw_socket members between raw_bind(), raw_release(), >>> raw_setsockopt() and raw_notify(). Use ro->dev to decide whether to free receivers in >>> dev_rcv_lists. >>> >>> Fixes: 8d0caedb7596 ("can: bcm/raw/isotp: use per module netdevice notifier") >>> Signed-off-by: Ziyang Xuan <william.xuanziyang@xxxxxxxxxx> >>> Reviewed-by: Oliver Hartkopp <socketcan@xxxxxxxxxxxx> >>> Acked-by: Oliver Hartkopp <socketcan@xxxxxxxxxxxx> >> >> Added to linux-can/testing. >> > > This patch causes three syzbot LOCKDEP reports so far. Hello Eric, Is there reproducer? I want to understand the specific root cause. Thanks, William Xuan > > I suspect we need something like the following patch. > > If nobody objects, I will submit this formally soon. > > diff --git a/net/can/raw.c b/net/can/raw.c > index 2302e48829677334f8b2d74a479e5a9cbb5ce03c..ba6b52b1d7767fdd7b57d1b8e5519495340c572c > 100644 > --- a/net/can/raw.c > +++ b/net/can/raw.c > @@ -386,9 +386,9 @@ static int raw_release(struct socket *sock) > list_del(&ro->notifier); > spin_unlock(&raw_notifier_lock); > > + rtnl_lock(); > lock_sock(sk); > > - rtnl_lock(); > /* remove current filters & unregister */ > if (ro->bound) { > if (ro->dev) > @@ -405,12 +405,13 @@ static int raw_release(struct socket *sock) > ro->dev = NULL; > ro->count = 0; > free_percpu(ro->uniq); > - rtnl_unlock(); > > sock_orphan(sk); > sock->sk = NULL; > > release_sock(sk); > + rtnl_unlock(); > + > sock_put(sk); > > return 0; > . >