Thu, May 23, 2024 at 09:46:51AM CEST, hengqi@xxxxxxxxxxxxxxxxx wrote: >This reverts commit 4d4ac2ececd3c42a08dd32a6e3a4aaf25f7efe44. > >When the following snippet is run, lockdep will report a deadlock[1]. > > /* Acquire all queues dim_locks */ > for (i = 0; i < vi->max_queue_pairs; i++) > mutex_lock(&vi->rq[i].dim_lock); > >There's no deadlock here because the vq locks are always taken >in the same order, but lockdep can not figure it out, and we >can not make each lock a separate class because there can be more >than MAX_LOCKDEP_SUBCLASSES of vqs. > >However, dropping the lock is harmless: > 1. If dim is enabled, modifications made by dim worker to coalescing > params may cause the user's query results to be dirty data. > 2. In scenarios (a) and (b), a spurious dim worker is scheduled, > but this can be handled correctly: > (a) > 1. dim is on > 2. net_dim call schedules a worker > 3. dim is turning off > 4. The worker checks that dim is off and then exits after > restoring dim's state. > 5. The worker will not be scheduled until the next time dim is on. > > (b) > 1. dim is on > 2. net_dim call schedules a worker > 3. The worker checks that dim is on and keeps going > 4. dim is turning off > 5. The worker successfully configure this parameter to the device. > 6. The worker will not be scheduled until the next time dim is on. > >[1] >======================================================== >WARNING: possible recursive locking detected >6.9.0-rc7+ #319 Not tainted >-------------------------------------------- >ethtool/962 is trying to acquire lock: > >but task is already holding lock: > >other info that might help us debug this: >Possible unsafe locking scenario: > > CPU0 > ---- > lock(&vi->rq[i].dim_lock); > lock(&vi->rq[i].dim_lock); > >*** DEADLOCK *** > > May be due to missing lock nesting notation > >3 locks held by ethtool/962: > #0: ffffffff82dbaab0 (cb_lock){++++}-{3:3}, at: genl_rcv+0x19/0x40 > #1: ffffffff82dad0a8 (rtnl_mutex){+.+.}-{3:3}, at: > ethnl_default_set_doit+0xbe/0x1e0 > >stack backtrace: >CPU: 6 PID: 962 Comm: ethtool Not tainted 6.9.0-rc7+ #319 >Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 >Call Trace: > <TASK> > dump_stack_lvl+0x79/0xb0 > check_deadlock+0x130/0x220 > __lock_acquire+0x861/0x990 > lock_acquire.part.0+0x72/0x1d0 > ? lock_acquire+0xf8/0x130 > __mutex_lock+0x71/0xd50 > virtnet_set_coalesce+0x151/0x190 > __ethnl_set_coalesce.isra.0+0x3f8/0x4d0 > ethnl_set_coalesce+0x34/0x90 > ethnl_default_set_doit+0xdd/0x1e0 > genl_family_rcv_msg_doit+0xdc/0x130 > genl_family_rcv_msg+0x154/0x230 > ? __pfx_ethnl_default_set_doit+0x10/0x10 > genl_rcv_msg+0x4b/0xa0 > ? __pfx_genl_rcv_msg+0x10/0x10 > netlink_rcv_skb+0x5a/0x110 > genl_rcv+0x28/0x40 > netlink_unicast+0x1af/0x280 > netlink_sendmsg+0x20e/0x460 > __sys_sendto+0x1fe/0x210 > ? find_held_lock+0x2b/0x80 > ? do_user_addr_fault+0x3a2/0x8a0 > ? __lock_release+0x5e/0x160 > ? do_user_addr_fault+0x3a2/0x8a0 > ? lock_release+0x72/0x140 > ? do_user_addr_fault+0x3a7/0x8a0 > __x64_sys_sendto+0x29/0x30 > do_syscall_64+0x78/0x180 > entry_SYSCALL_64_after_hwframe+0x76/0x7e > >Fixes: 4d4ac2ececd3 ("virtio_net: Add a lock for per queue RX coalesce") >Signed-off-by: Heng Qi <hengqi@xxxxxxxxxxxxxxxxx> Reviewed-by: Jiri Pirko <jiri@xxxxxxxxxx>