Yeah, I saw this as well; will try to bisect soon.

On Tue, 2022-05-17 at 13:10 +0200, Hans de Goede wrote:
> Hi All,
>
> I just noticed the below lockdep possible deadlock report with a 5.18-rc6
> kernel on a Dell Latitude E6430 laptop with the following nvidia GPU:
>
> 01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GF108GLM [NVS 5200M] [10de:0dfc] (rev a1)
> 01:00.1 Audio device [0403]: NVIDIA Corporation GF108 High Definition Audio Controller [10de:0bea] (rev a1)
>
> This is with the laptop in Optimus mode, so with the Intel integrated
> gfx from the i5-3320M CPU driving the LCD panel and with nothing connected
> to the HDMI connector, which is always routed to the NVIDIA GPU on this
> laptop.
>
> The lockdep possible deadlock warning seems to happen when the NVIDIA GPU
> is runtime suspended shortly after gdm has loaded:
>
> [ 24.859171] ======================================================
> [ 24.859173] WARNING: possible circular locking dependency detected
> [ 24.859175] 5.18.0-rc6+ #34 Tainted: G E
> [ 24.859178] ------------------------------------------------------
> [ 24.859179] kworker/1:1/46 is trying to acquire lock:
> [ 24.859181] ffff92b0c0ee0518 (&cli->mutex){+.+.}-{3:3}, at: nouveau_vga_lastclose+0x910/0x1030 [nouveau]
> [ 24.859231]
> but task is already holding lock:
> [ 24.859233] ffff92b0c4bf35a0 (reservation_ww_class_mutex){+.+.}-{3:3}, at: ttm_bo_wait+0x7d/0x140 [ttm]
> [ 24.859243]
> which lock already depends on the new lock.
>
> [ 24.859244]
> the existing dependency chain (in reverse order) is:
> [ 24.859246]
> -> #1 (reservation_ww_class_mutex){+.+.}-{3:3}:
> [ 24.859249]        __ww_mutex_lock.constprop.0+0xb3/0xfb0
> [ 24.859256]        ww_mutex_lock+0x38/0xa0
> [ 24.859259]        nouveau_bo_pin+0x30/0x380 [nouveau]
> [ 24.859297]        nouveau_channel_del+0x1d7/0x3e0 [nouveau]
> [ 24.859328]        nouveau_channel_new+0x48/0x730 [nouveau]
> [ 24.859358]        nouveau_abi16_ioctl_channel_alloc+0x113/0x360 [nouveau]
> [ 24.859389]        drm_ioctl_kernel+0xa1/0x150
> [ 24.859392]        drm_ioctl+0x21c/0x410
> [ 24.859395]        nouveau_drm_ioctl+0x56/0x1820 [nouveau]
> [ 24.859431]        __x64_sys_ioctl+0x8d/0xc0
> [ 24.859436]        do_syscall_64+0x5b/0x80
> [ 24.859440]        entry_SYSCALL_64_after_hwframe+0x44/0xae
> [ 24.859443]
> -> #0 (&cli->mutex){+.+.}-{3:3}:
> [ 24.859446]        __lock_acquire+0x12e2/0x1f90
> [ 24.859450]        lock_acquire+0xad/0x290
> [ 24.859453]        __mutex_lock+0x90/0x830
> [ 24.859456]        nouveau_vga_lastclose+0x910/0x1030 [nouveau]
> [ 24.859493]        ttm_bo_move_to_lru_tail+0x32c/0x980 [ttm]
> [ 24.859498]        ttm_mem_evict_first+0x25c/0x4b0 [ttm]
> [ 24.859503]        ttm_resource_manager_evict_all+0x93/0x1b0 [ttm]
> [ 24.859509]        nouveau_debugfs_fini+0x161/0x260 [nouveau]
> [ 24.859545]        nouveau_drm_ioctl+0xa4a/0x1820 [nouveau]
> [ 24.859582]        pci_pm_runtime_suspend+0x5c/0x180
> [ 24.859585]        __rpm_callback+0x48/0x1b0
> [ 24.859589]        rpm_callback+0x5a/0x70
> [ 24.859591]        rpm_suspend+0x10a/0x6f0
> [ 24.859594]        pm_runtime_work+0xa0/0xb0
> [ 24.859596]        process_one_work+0x254/0x560
> [ 24.859601]        worker_thread+0x4f/0x390
> [ 24.859604]        kthread+0xe6/0x110
> [ 24.859607]        ret_from_fork+0x22/0x30
> [ 24.859611]
> other info that might help us debug this:
>
> [ 24.859612]  Possible unsafe locking scenario:
>
> [ 24.859613]        CPU0                    CPU1
> [ 24.859615]        ----                    ----
> [ 24.859616]   lock(reservation_ww_class_mutex);
> [ 24.859618]                                lock(&cli->mutex);
> [ 24.859620]                                lock(reservation_ww_class_mutex);
> [ 24.859622]   lock(&cli->mutex);
> [ 24.859624]
>  *** DEADLOCK ***
>
> [ 24.859625] 3 locks held by kworker/1:1/46:
> [ 24.859627]  #0: ffff92b0c0bb4338 ((wq_completion)pm){+.+.}-{0:0}, at: process_one_work+0x1d0/0x560
> [ 24.859634]  #1: ffffa8ffc02dfe80 ((work_completion)(&dev->power.work)){+.+.}-{0:0}, at: process_one_work+0x1d0/0x560
> [ 24.859641]  #2: ffff92b0c4bf35a0 (reservation_ww_class_mutex){+.+.}-{3:3}, at: ttm_bo_wait+0x7d/0x140 [ttm]
> [ 24.859649]
> stack backtrace:
> [ 24.859651] CPU: 1 PID: 46 Comm: kworker/1:1 Tainted: G E 5.18.0-rc6+ #34
> [ 24.859654] Hardware name: Dell Inc. Latitude E6430/0H3MT5, BIOS A21 05/08/2017
> [ 24.859656] Workqueue: pm pm_runtime_work
> [ 24.859660] Call Trace:
> [ 24.859662]  <TASK>
> [ 24.859665]  dump_stack_lvl+0x5b/0x74
> [ 24.859669]  check_noncircular+0xdf/0x100
> [ 24.859672]  ? register_lock_class+0x38/0x470
> [ 24.859678]  __lock_acquire+0x12e2/0x1f90
> [ 24.859683]  lock_acquire+0xad/0x290
> [ 24.859686]  ? nouveau_vga_lastclose+0x910/0x1030 [nouveau]
> [ 24.859724]  ? lock_is_held_type+0xa6/0x120
> [ 24.859730]  __mutex_lock+0x90/0x830
> [ 24.859733]  ? nouveau_vga_lastclose+0x910/0x1030 [nouveau]
> [ 24.859770]  ? nvif_vmm_map+0x114/0x130 [nouveau]
> [ 24.859791]  ? nouveau_vga_lastclose+0x910/0x1030 [nouveau]
> [ 24.859829]  ? nouveau_vga_lastclose+0x910/0x1030 [nouveau]
> [ 24.859866]  nouveau_vga_lastclose+0x910/0x1030 [nouveau]
> [ 24.859905]  ttm_bo_move_to_lru_tail+0x32c/0x980 [ttm]
> [ 24.859912]  ttm_mem_evict_first+0x25c/0x4b0 [ttm]
> [ 24.859919]  ? lock_release+0x20/0x2a0
> [ 24.859923]  ttm_resource_manager_evict_all+0x93/0x1b0 [ttm]
> [ 24.859930]  nouveau_debugfs_fini+0x161/0x260 [nouveau]
> [ 24.859968]  nouveau_drm_ioctl+0xa4a/0x1820 [nouveau]
> [ 24.860005]  pci_pm_runtime_suspend+0x5c/0x180
> [ 24.860008]  ? pci_dev_put+0x20/0x20
> [ 24.860011]  __rpm_callback+0x48/0x1b0
> [ 24.860014]  ? pci_dev_put+0x20/0x20
> [ 24.860018]  rpm_callback+0x5a/0x70
> [ 24.860020]  ? pci_dev_put+0x20/0x20
> [ 24.860023]  rpm_suspend+0x10a/0x6f0
> [ 24.860025]  ? process_one_work+0x1d0/0x560
> [ 24.860031]  pm_runtime_work+0xa0/0xb0
> [ 24.860034]  process_one_work+0x254/0x560
> [ 24.860039]  worker_thread+0x4f/0x390
> [ 24.860043]  ? process_one_work+0x560/0x560
> [ 24.860046]  kthread+0xe6/0x110
> [ 24.860049]  ? kthread_complete_and_exit+0x20/0x20
> [ 24.860053]  ret_from_fork+0x22/0x30
> [ 24.860059]  </TASK>
>
> Regards,
>
> Hans
>
>

--
Cheers,
 Lyude Paul (she/her)
 Software Engineer at Red Hat