Re: [PATCH] test/defer: fix deadlock when io_uring_submit fail

On 1/18/25 2:42 AM, lizetao wrote:
> Hi,
> 
>> -----Original Message-----
>> From: Jens Axboe <axboe@xxxxxxxxx>
>> Sent: Thursday, January 16, 2025 10:51 PM
>> To: lizetao <lizetao1@xxxxxxxxxx>; Pavel Begunkov <asml.silence@xxxxxxxxx>
>> Cc: io-uring@xxxxxxxxxxxxxxx
>> Subject: Re: [PATCH] test/defer: fix deadlock when io_uring_submit fail
>>
>> On 1/15/25 6:10 AM, lizetao wrote:
>>> While performing fault injection testing, a bug report was triggered:
>>>
>>>   FAULT_INJECTION: forcing a failure.
>>>   name fail_usercopy, interval 1, probability 0, space 0, times 0
>>>   CPU: 12 UID: 0 PID: 18795 Comm: defer.t Tainted: G           O 6.13.0-rc6-gf2a0a37b174b #17
>>>   Tainted: [O]=OOT_MODULE
>>>   Hardware name: linux,dummy-virt (DT)
>>>   Call trace:
>>>    show_stack+0x20/0x38 (C)
>>>    dump_stack_lvl+0x78/0x90
>>>    dump_stack+0x1c/0x28
>>>    should_fail_ex+0x544/0x648
>>>    should_fail+0x14/0x20
>>>    should_fail_usercopy+0x1c/0x28
>>>    get_timespec64+0x7c/0x258
>>>    __io_timeout_prep+0x31c/0x798
>>>    io_link_timeout_prep+0x1c/0x30
>>>    io_submit_sqes+0x59c/0x1d50
>>>    __arm64_sys_io_uring_enter+0x8dc/0xfa0
>>>    invoke_syscall+0x74/0x270
>>>    el0_svc_common.constprop.0+0xb4/0x240
>>>    do_el0_svc+0x48/0x68
>>>    el0_svc+0x38/0x78
>>>    el0t_64_sync_handler+0xc8/0xd0
>>>    el0t_64_sync+0x198/0x1a0
>>>
>>> The deadlock stack is as follows:
>>>
>>>   io_cqring_wait+0xa64/0x1060
>>>   __arm64_sys_io_uring_enter+0x46c/0xfa0
>>>   invoke_syscall+0x74/0x270
>>>   el0_svc_common.constprop.0+0xb4/0x240
>>>   do_el0_svc+0x48/0x68
>>>   el0_svc+0x38/0x78
>>>   el0t_64_sync_handler+0xc8/0xd0
>>>   el0t_64_sync+0x198/0x1a0
>>>
>>> This is because, after the submission fails, the defer.t testcase still
>>> waits for completions of requests that were never submitted, which
>>> eventually deadlocks.
>>> Solve the problem by telling wait_cqes the number of requests to wait for.
>>
>> I suspect this would be fixed by setting IORING_SETUP_SUBMIT_ALL for ring init,
>> something probably all/most tests should set.
> 
> 
> I tested it and found that IORING_SETUP_SUBMIT_ALL does indeed solve
> this problem. Should I fix only this test, or add
> IORING_SETUP_SUBMIT_ALL to the common ring setup path so that most
> similar problems are covered as well?

I think just fix up this one. We really should have all the tests use
t_create_ring*() first, and those helpers should just set SUBMIT_ALL.
But that's a separate change.
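
For illustration only, the accounting behind the hang can be sketched as
below. This is a rough sketch, not the actual test/defer.c code:
cqes_to_wait_for() is a hypothetical helper name, and it assumes that
every SQE the kernel consumed posts exactly one CQE (so no CQE-skipping
flags are in play). Without SUBMIT_ALL, submission stops at the first
request that fails prep, so io_uring_submit() can return fewer than the
number of prepared SQEs, and waiting for the full prepared count never
finishes.

```c
#include <assert.h>

/*
 * Hypothetical helper, not part of liburing or the test suite.
 *
 * submit_ret: return value of io_uring_submit(), i.e. the number of
 *             SQEs consumed, or a negative errno.
 * prepared:   number of SQEs the test queued before submitting.
 *
 * Returns how many CQEs it is safe to wait for, under the assumption
 * that each consumed SQE produces one CQE.
 */
static unsigned int cqes_to_wait_for(int submit_ret, unsigned int prepared)
{
	assert(prepared > 0);

	/* Submit failed outright: nothing was consumed, nothing to wait for. */
	if (submit_ret <= 0)
		return 0;

	/* Short submit: only wait for what the kernel actually took. */
	if ((unsigned int)submit_ret < prepared)
		return (unsigned int)submit_ret;

	return prepared;
}
```

With SUBMIT_ALL set at ring init, the kernel keeps going past a failed
request, so the short-submit case above should not arise from a prep
failure and the test can wait for everything it queued.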

-- 
Jens Axboe



