Re: [regression, v6.0-rc0, io-uring?] filesystem freeze hangs on sb_wait_write()

On 10/11/22 8:39 AM, Pavel Begunkov wrote:
> On 10/11/22 15:18, Jens Axboe wrote:
>> On 10/10/22 8:54 PM, Jens Axboe wrote:
>>> On 10/10/22 8:10 PM, Pavel Begunkov wrote:
>>>> On 10/11/22 03:01, Jens Axboe wrote:
>>>>> On 10/10/22 7:10 PM, Pavel Begunkov wrote:
>>>>>> On 10/11/22 01:40, Dave Chinner wrote:
>>>>>> [...]
>>>>>>> I note that there are changes to the io_uring IO path and write
>>>>>>> IO end accounting in the io_uring stack that was merged, and there
>>>>>>> was no doubt about the success/failure of the reproducer at each
>>>>>>> step. Hence I think the bisect is good, and the problem is somewhere
>>>>>>> in the io-uring changes.
>>>>>>>
>>>>>>> Jens, over to you.
>>>>>>>
>>>>>>> The reproducer - generic/068 - is 100% reliable here, io_uring is
>>>>>>> being exercised by fsstress in the background whilst the filesystem
>>>>>>> is being frozen and thawed repeatedly. Some path in the io-uring
>>>>>>> code has an unbalanced sb_start_write()/sb_end_write() pair by the
>>>>>>> look of it....
>>>>>>
>>>>>> A quick guess, it's probably
>>>>>>
>>>>>> b000145e99078 ("io_uring/rw: defer fsnotify calls to task context")
>>>>>>
>>>>>> From a quick look, it removes kiocb_end_write() -> sb_end_write()
>>>>>> from kiocb_done(), which is a kind of buffered rw completion path.
>>>>>
>>>>> Yeah, I'll take a look.
>>>>> Didn't get the original email, only Pavel's reply?
>>>>
>>>> Forwarded.
>>>
>>> Looks like the email did get delivered, it just ended up in the
>>> fsdevel inbox.
>>
>> Nope, it was marked as spam by gmail...
>>
>>>> Not tested, but it should be something like below. Apart from obvious
>>>> cases like __io_complete_rw_common(), we should also keep in mind the
>>>> case where we don't complete the request but ask for reissue with
>>>> REQ_F_REISSUE; that's what the first hunk is for.
>>>
>>> Can we move this into a helper?
>>
>> Something like this? Not super happy with it, but...
> 
> Sounds good. It would be great to add a comment on why it's ok to move
> io_req_io_end() back into __io_complete_rw_common() under the
> io_rw_should_reissue() "if".

Agree, I'll add a comment and post this.

-- 
Jens Axboe
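
For readers following the thread, the imbalance discussed above can be sketched as a toy userspace model. This is not the kernel code or the posted patch: struct toy_sb and every toy_* function are hypothetical stand-ins, and the model assumes that each issue attempt (including a reissue) takes a write reference that one shared end-of-IO helper must drop before a freeze can proceed.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a superblock's write-freeze accounting. */
struct toy_sb {
    int writers; /* outstanding start-write references */
};

static void toy_start_write(struct toy_sb *sb) { sb->writers++; }
static void toy_end_write(struct toy_sb *sb)   { sb->writers--; }

/* In this model a freeze can only proceed once no writer is outstanding. */
static bool toy_can_freeze(const struct toy_sb *sb)
{
    return sb->writers == 0;
}

/* One shared helper for every path that finishes an issue attempt. */
static void toy_io_end(struct toy_sb *sb, bool is_write)
{
    if (is_write)
        toy_end_write(sb);
}

/*
 * Issue a write that is retried `reissues` times before it completes.
 * Assumption of this toy: every attempt takes a new write reference.
 * With call_io_end == false the end-of-write step is skipped on every
 * path, which leaves the references unbalanced.
 */
static void toy_write(struct toy_sb *sb, int reissues, bool call_io_end)
{
    for (int attempt = 0; attempt <= reissues; attempt++) {
        toy_start_write(sb);
        if (call_io_end)
            toy_io_end(sb, true);
    }
}
```

Calling toy_write(&sb, 2, true) on a zeroed toy_sb leaves toy_can_freeze() true, while the same call with call_io_end set to false leaves three references outstanding, which is the kind of state a freezer would wait on indefinitely.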
