On 10/10/22 8:54 PM, Jens Axboe wrote:
> On 10/10/22 8:10 PM, Pavel Begunkov wrote:
>> On 10/11/22 03:01, Jens Axboe wrote:
>>> On 10/10/22 7:10 PM, Pavel Begunkov wrote:
>>>> On 10/11/22 01:40, Dave Chinner wrote:
>>>> [...]
>>>>> I note that there are changes to the io_uring IO path and write
>>>>> IO end accounting in the io_uring stack that was merged, and there
>>>>> was no doubt about the success/failure of the reproducer at each
>>>>> step. Hence I think the bisect is good, and the problem is somewhere
>>>>> in the io_uring changes.
>>>>>
>>>>> Jens, over to you.
>>>>>
>>>>> The reproducer - generic/068 - is 100% reliable here, io_uring is
>>>>> being exercised by fsstress in the background whilst the filesystem
>>>>> is being frozen and thawed repeatedly. Some path in the io_uring
>>>>> code has an unbalanced sb_start_write()/sb_end_write() pair by the
>>>>> look of it....
>>>>
>>>> A quick guess, it's probably
>>>>
>>>> b000145e99078 ("io_uring/rw: defer fsnotify calls to task context")
>>>>
>>>> From a quick look, it removes kiocb_end_write() -> sb_end_write()
>>>> from kiocb_done(), which is a kind of buffered rw completion path.
>>>
>>> Yeah, I'll take a look.
>>> Didn't get the original email, only Pavel's reply?
>>
>> Forwarded.
>
> Looks like the email did get delivered, it just ended up in the
> fsdevel inbox.

Nope, it was marked as spam by gmail...

>> Not tested, but it should be something like below. Apart from obvious
>> cases like __io_complete_rw_common(), we should also keep in mind the
>> case where we don't complete the request but ask for a reissue with
>> REQ_F_REISSUE; that's what the first hunk is for.
>
> Can we move this into a helper?

Something like this? Not super happy with it, but...

diff --git a/io_uring/rw.c b/io_uring/rw.c
index 453e0ae92160..1c8d00f9af9f 100644
--- a/io_uring/rw.c
+++ b/io_uring/rw.c
@@ -234,11 +234,32 @@ static void kiocb_end_write(struct io_kiocb *req)
 	}
 }
 
+/*
+ * Trigger the notifications after having done some IO, and finish the write
+ * accounting, if any.
+ */
+static void io_req_io_end(struct io_kiocb *req)
+{
+	struct io_rw *rw = io_kiocb_to_cmd(req, struct io_rw);
+
+	if (rw->kiocb.ki_flags & IOCB_WRITE) {
+		kiocb_end_write(req);
+		fsnotify_modify(req->file);
+	} else {
+		fsnotify_access(req->file);
+	}
+}
+
 static bool __io_complete_rw_common(struct io_kiocb *req, long res)
 {
 	if (unlikely(res != req->cqe.res)) {
 		if ((res == -EAGAIN || res == -EOPNOTSUPP) &&
 		    io_rw_should_reissue(req)) {
+			/*
+			 * Reissue will start accounting again, finish the
+			 * current cycle.
+			 */
+			io_req_io_end(req);
 			req->flags |= REQ_F_REISSUE | REQ_F_PARTIAL_IO;
 			return true;
 		}
@@ -264,15 +285,7 @@ static inline int io_fixup_rw_res(struct io_kiocb *req, long res)
 
 static void io_req_rw_complete(struct io_kiocb *req, bool *locked)
 {
-	struct io_rw *rw = io_kiocb_to_cmd(req, struct io_rw);
-
-	if (rw->kiocb.ki_flags & IOCB_WRITE) {
-		kiocb_end_write(req);
-		fsnotify_modify(req->file);
-	} else {
-		fsnotify_access(req->file);
-	}
-
+	io_req_io_end(req);
 	io_req_task_complete(req, locked);
 }
 
@@ -317,6 +330,7 @@ static int kiocb_done(struct io_kiocb *req, ssize_t ret,
 		req->file->f_pos = rw->kiocb.ki_pos;
 	if (ret >= 0 && (rw->kiocb.ki_complete == io_complete_rw)) {
 		if (!__io_complete_rw_common(req, ret)) {
+			io_req_io_end(req);
 			io_req_set_res(req, final_ret,
 				       io_put_kbuf(req, issue_flags));
 			return IOU_OK;

-- 
Jens Axboe
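
For readers following along without the kernel tree handy, the failure
mode is easy to model in plain userspace C. The sketch below is not
kernel code: sb_writers, issue_write(), complete_write(), and the
write_account_* helpers are made-up stand-ins for the superblock write
counter and the io_uring issue/completion paths under discussion. It
only illustrates why one completion path that skips the "end write"
half leaves the counter nonzero, which is what makes a later freeze
(generic/068's freeze/thaw loop) wait forever.

#include <assert.h>
#include <stdio.h>

/* Models the per-superblock writer count that freezing waits on. */
static int sb_writers;

/* Stand-in for sb_start_write(): taken when a write is issued. */
static void write_account_start(void)
{
	sb_writers++;
}

/* Stand-in for sb_end_write(): must run exactly once per write. */
static void write_account_end(void)
{
	assert(sb_writers > 0);
	sb_writers--;
}

static void issue_write(void)
{
	write_account_start();
}

/*
 * The regression dropped the "end" call from one completion path;
 * passing balanced == 0 models a request going down that path.
 */
static void complete_write(int balanced)
{
	if (balanced)
		write_account_end();
}

int main(void)
{
	issue_write();
	complete_write(1);
	printf("balanced completion:   sb_writers = %d (freeze can proceed)\n",
	       sb_writers);

	issue_write();
	complete_write(0);
	printf("unbalanced completion: sb_writers = %d (freeze would block)\n",
	       sb_writers);
	return 0;
}

The patch above closes that hole by funneling every completion flavor
(task-context completion, the inline completion in kiocb_done(), and the
REQ_F_REISSUE path) through the single io_req_io_end() helper, so no
path can skip the kiocb_end_write() -> sb_end_write() half again.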