On Mon, Jul 17, 2023 at 9:27 AM Marc Kleine-Budde <mkl@xxxxxxxxxxxxxx> wrote:
>
> On 11.07.2023 09:17:37, Ziyang Xuan wrote:
> > Got kmemleak errors with the following ltp can_filter testcase:
> >
> > for ((i=1; i<=100; i++))
> > do
> >         ./can_filter &
> >         sleep 0.1
> > done
> >
> > ==============================================================
> > [<00000000db4a4943>] can_rx_register+0x147/0x360 [can]
> > [<00000000a289549d>] raw_setsockopt+0x5ef/0x853 [can_raw]
> > [<000000006d3d9ebd>] __sys_setsockopt+0x173/0x2c0
> > [<00000000407dbfec>] __x64_sys_setsockopt+0x61/0x70
> > [<00000000fd468496>] do_syscall_64+0x33/0x40
> > [<00000000b7e47d51>] entry_SYSCALL_64_after_hwframe+0x61/0xc6
> >
> > It's a bug in the concurrent scenario of unregister_netdevice_many()
> > and raw_release() as following:
> >
> >              cpu0                                        cpu1
> > unregister_netdevice_many(can_dev)
> >   unlist_netdevice(can_dev) // dev_get_by_index() return NULL after this
> >   net_set_todo(can_dev)
> >                                              raw_release(can_socket)
> >                                                dev = dev_get_by_index(, ro->ifindex); // dev == NULL
> >                                                if (dev) { // receivers in dev_rcv_lists not free because dev is NULL
> >                                                  raw_disable_allfilters(, dev, );
> >                                                  dev_put(dev);
> >                                                }
> >                                                ...
> >                                                ro->bound = 0;
> >                                                ...
> >
> > call_netdevice_notifiers(NETDEV_UNREGISTER, )
> >   raw_notify(, NETDEV_UNREGISTER, )
> >     if (ro->bound) // invalid because ro->bound has been set 0
> >       raw_disable_allfilters(, dev, ); // receivers in dev_rcv_lists will never be freed
> >
> > Add a net_device pointer member in struct raw_sock to record bound can_dev,
> > and use rtnl_lock to serialize raw_socket members between raw_bind(), raw_release(),
> > raw_setsockopt() and raw_notify(). Use ro->dev to decide whether to free receivers in
> > dev_rcv_lists.
> >
> > Fixes: 8d0caedb7596 ("can: bcm/raw/isotp: use per module netdevice notifier")
> > Signed-off-by: Ziyang Xuan <william.xuanziyang@xxxxxxxxxx>
> > Reviewed-by: Oliver Hartkopp <socketcan@xxxxxxxxxxxx>
> > Acked-by: Oliver Hartkopp <socketcan@xxxxxxxxxxxx>
>
> Added to linux-can/testing.
>

This patch causes three syzbot LOCKDEP reports so far.

I suspect we need something like the following patch.
If nobody objects, I will submit this formally soon.

diff --git a/net/can/raw.c b/net/can/raw.c
index 2302e48829677334f8b2d74a479e5a9cbb5ce03c..ba6b52b1d7767fdd7b57d1b8e5519495340c572c 100644
--- a/net/can/raw.c
+++ b/net/can/raw.c
@@ -386,9 +386,9 @@ static int raw_release(struct socket *sock)
 	list_del(&ro->notifier);
 	spin_unlock(&raw_notifier_lock);
 
+	rtnl_lock();
 	lock_sock(sk);
 
-	rtnl_lock();
 	/* remove current filters & unregister */
 	if (ro->bound) {
 		if (ro->dev)
@@ -405,12 +405,13 @@ static int raw_release(struct socket *sock)
 	ro->dev = NULL;
 	ro->count = 0;
 	free_percpu(ro->uniq);
-	rtnl_unlock();
 
 	sock_orphan(sk);
 	sock->sk = NULL;
 
 	release_sock(sk);
+	rtnl_unlock();
+
 	sock_put(sk);
 
 	return 0;
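
For anyone who wants to see the ordering rule outside the kernel, here is a rough userspace sketch. It is only an illustration under assumptions: outer_lock and inner_lock are hypothetical pthread mutexes standing in for rtnl_lock() and lock_sock(sk), and the two thread functions are hypothetical stand-ins for raw_release() and another path that is assumed to take rtnl first. Both paths take the outer lock before the inner one and drop them in reverse, which is the order the diff above gives raw_release().

```c
#include <pthread.h>

/* Hypothetical stand-ins, not kernel APIs:
 * outer_lock plays the role of rtnl_lock(),
 * inner_lock plays the role of lock_sock(sk).
 */
static pthread_mutex_t outer_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t inner_lock = PTHREAD_MUTEX_INITIALIZER;
static long shared_counter;

#define ITERATIONS 100000

/* Stand-in for raw_release() after the change: outer first, then inner. */
static void *release_path(void *arg)
{
	(void)arg;
	for (int i = 0; i < ITERATIONS; i++) {
		pthread_mutex_lock(&outer_lock);
		pthread_mutex_lock(&inner_lock);
		shared_counter++;
		pthread_mutex_unlock(&inner_lock);
		pthread_mutex_unlock(&outer_lock);
	}
	return NULL;
}

/* Stand-in for a second path (assumed to take the outer lock first too). */
static void *other_path(void *arg)
{
	(void)arg;
	for (int i = 0; i < ITERATIONS; i++) {
		pthread_mutex_lock(&outer_lock);
		pthread_mutex_lock(&inner_lock);
		shared_counter++;
		pthread_mutex_unlock(&inner_lock);
		pthread_mutex_unlock(&outer_lock);
	}
	return NULL;
}

/* Runs both paths concurrently; returns the final counter, or -1 on error. */
static long run_lock_order_demo(void)
{
	pthread_t a, b;

	shared_counter = 0;
	if (pthread_create(&a, NULL, release_path, NULL) != 0)
		return -1;
	if (pthread_create(&b, NULL, other_path, NULL) != 0) {
		pthread_join(a, NULL);
		return -1;
	}
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	return shared_counter;
}
```

Calling run_lock_order_demo() from a main() built with -pthread should always return, because no thread ever holds the inner lock while waiting for the outer one. Swapping the two lock calls in only one of the paths recreates the inverted ordering and can hang.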