On 10/10/22 8:10 PM, Pavel Begunkov wrote:
> On 10/11/22 03:01, Jens Axboe wrote:
>> On 10/10/22 7:10 PM, Pavel Begunkov wrote:
>>> On 10/11/22 01:40, Dave Chinner wrote:
>>> [...]
>>>> I note that there are changes to the io_uring IO path and write
>>>> IO end accounting in the io_uring stack that was merged, and there
>>>> was no doubt about the success/failure of the reproducer at each
>>>> step. Hence I think the bisect is good, and the problem is someone
>>>> in the io-uring changes.
>>>>
>>>> Jens, over to you.
>>>>
>>>> The reproducer - generic/068 - is 100% reliable here, io_uring is
>>>> being exercised by fsstress in the background whilst the filesystem
>>>> is being frozen and thawed repeatedly. Some path in the io-uring
>>>> code has an unbalanced sb_start_write()/sb_end_write() pair by the
>>>> look of it....
>>>
>>> A quick guess, it's probably
>>>
>>> b000145e99078 ("io_uring/rw: defer fsnotify calls to task context")
>>>
>>> From a quick look, it removes kiocb_end_write() -> sb_end_write()
>>> from kiocb_done(), which is a kind of buffered rw completion path.
>>
>> Yeah, I'll take a look.
>> Didn't get the original email, only Pavel's reply?
>
> Forwarded.

Looks like the email did get delivered, it just ended up in the fsdevel
inbox.

> Not tested, but should be sth like below. Apart of obvious cases
> like __io_complete_rw_common() we should also keep in mind
> when we don't complete the request but ask for reissue with
> REQ_F_REISSUE, that's for the first hunk

Can we move this into a helper?

-- 
Jens Axboe
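
For readers following the thread, the accounting problem and the helper
idea can be sketched in a small userspace model. This is not the kernel
code and not the patch discussed above: every type and function name
below (model_sb, model_req, model_req_end_write, and so on) is
hypothetical, and the assumption that a reissued request takes the write
reference again on resubmission is a simplification made for the sketch.
It only illustrates why every exit from the write path, including the
reissue case, needs to drop the reference exactly once.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: none of these names are kernel APIs. */

/* Stand-in for a superblock's freeze-protection writer count. */
struct model_sb {
    int writers;
};

/* Stand-in for a write request that may hold one write reference. */
struct model_req {
    struct model_sb *sb;
    bool write_started; /* true while this request holds a reference */
};

/* Models the start side: take a reference before the write is issued. */
static void model_start_write(struct model_req *req)
{
    req->sb->writers++;
    req->write_started = true;
}

/*
 * Hypothetical helper: drop the reference if, and only if, this request
 * still holds one. Calling it twice is harmless.
 */
static void model_req_end_write(struct model_req *req)
{
    if (!req->write_started)
        return;
    assert(req->sb->writers > 0);
    req->sb->writers--;
    req->write_started = false;
}

/* Completion path: the request is done, so release the reference. */
static void model_complete(struct model_req *req)
{
    model_req_end_write(req);
}

/*
 * Reissue path: the request is not completed. In this model the
 * resubmission takes the reference again, so the old one is dropped
 * first to keep the count balanced (an assumption of the sketch).
 */
static void model_reissue(struct model_req *req)
{
    model_req_end_write(req);
    model_start_write(req);
}

/* Runs one request's lifetime; a return value of 0 means balanced. */
int model_run(bool reissue_first)
{
    struct model_sb sb = { .writers = 0 };
    struct model_req req = { .sb = &sb, .write_started = false };

    model_start_write(&req);
    if (reissue_first)
        model_reissue(&req);
    model_complete(&req);
    model_complete(&req); /* second call is a no-op via the helper */
    return sb.writers;
}
```

Calling model_run() with either argument leaves the writer count at
zero. Removing the model_req_end_write() call from either path leaves it
above zero, which is the kind of imbalance a freeze/thaw loop would trip
over.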