On 1/15/25 6:10 AM, lizetao wrote:
> While performing fault injection testing, a bug report was triggered:
>
> FAULT_INJECTION: forcing a failure.
> name fail_usercopy, interval 1, probability 0, space 0, times 0
> CPU: 12 UID: 0 PID: 18795 Comm: defer.t Tainted: G O 6.13.0-rc6-gf2a0a37b174b #17
> Tainted: [O]=OOT_MODULE
> Hardware name: linux,dummy-virt (DT)
> Call trace:
>  show_stack+0x20/0x38 (C)
>  dump_stack_lvl+0x78/0x90
>  dump_stack+0x1c/0x28
>  should_fail_ex+0x544/0x648
>  should_fail+0x14/0x20
>  should_fail_usercopy+0x1c/0x28
>  get_timespec64+0x7c/0x258
>  __io_timeout_prep+0x31c/0x798
>  io_link_timeout_prep+0x1c/0x30
>  io_submit_sqes+0x59c/0x1d50
>  __arm64_sys_io_uring_enter+0x8dc/0xfa0
>  invoke_syscall+0x74/0x270
>  el0_svc_common.constprop.0+0xb4/0x240
>  do_el0_svc+0x48/0x68
>  el0_svc+0x38/0x78
>  el0t_64_sync_handler+0xc8/0xd0
>  el0t_64_sync+0x198/0x1a0
>
> The deadlock stack is as follows:
>
>  io_cqring_wait+0xa64/0x1060
>  __arm64_sys_io_uring_enter+0x46c/0xfa0
>  invoke_syscall+0x74/0x270
>  el0_svc_common.constprop.0+0xb4/0x240
>  do_el0_svc+0x48/0x68
>  el0_svc+0x38/0x78
>  el0t_64_sync_handler+0xc8/0xd0
>  el0t_64_sync+0x198/0x1a0
>
> This is because after the submission fails, the defer.t testcase is
> still waiting to submit the failed request, resulting in an eventual
> deadlock.
> Solve the problem by telling wait_cqes the number of requests to wait
> for.

I suspect this would be fixed by setting IORING_SETUP_SUBMIT_ALL for
ring init, something probably all/most tests should set.

-- 
Jens Axboe
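
For reference, a minimal sketch of what setting the flag at ring init
could look like. It is not taken from defer.t or from any posted patch.
The helper names test_ring_flags() and test_ring_setup() are
hypothetical, and the sketch uses the raw io_uring_setup(2) syscall so
that it depends only on the kernel headers. With
IORING_SETUP_SUBMIT_ALL set, io_uring_enter(2) keeps submitting the
rest of a batch after one request fails at submission, instead of
stopping at the failed request.

```c
#include <linux/io_uring.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fallback for older kernel headers; the flag was added in Linux 5.18. */
#ifndef IORING_SETUP_SUBMIT_ALL
#define IORING_SETUP_SUBMIT_ALL (1U << 7)
#endif

/* Hypothetical helper: setup flags a test would pass at ring init. */
static unsigned int test_ring_flags(unsigned int extra)
{
	return extra | IORING_SETUP_SUBMIT_ALL;
}

/*
 * Hypothetical helper: create a ring through the raw syscall.
 * Returns the ring fd on success, or -1 with errno set (for example
 * EINVAL on kernels that do not know the flag).
 */
static int test_ring_setup(unsigned int entries, unsigned int extra)
{
	struct io_uring_params p;

	memset(&p, 0, sizeof(p));
	p.flags = test_ring_flags(extra);
	return (int)syscall(__NR_io_uring_setup, entries, &p);
}
```

With liburing, the equivalent would be passing the flag as the third
argument of io_uring_queue_init(), e.g.
io_uring_queue_init(8, &ring, IORING_SETUP_SUBMIT_ALL).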