On 4/24/24 14:36, Jens Axboe wrote:
On 4/23/24 8:00 AM, Pavel Begunkov wrote:
On 4/22/24 14:35, Anuj Gupta wrote:
In case of write, the iov_iter gets updated before retry kicks in.
Restore the iov_iter before retrying. It can be reproduced by issuing
a write greater than device limit.
Fixes: df604d2ad480 (io_uring/rw: ensure retry condition isn't lost)
Signed-off-by: Anuj Gupta <anuj20.g@xxxxxxxxxxx>
---
io_uring/rw.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/io_uring/rw.c b/io_uring/rw.c
index 4fed829fe97c..9fadb29ec34f 100644
--- a/io_uring/rw.c
+++ b/io_uring/rw.c
@@ -1035,8 +1035,10 @@ int io_write(struct io_kiocb *req, unsigned int issue_flags)
 	else
 		ret2 = -EINVAL;
-	if (req->flags & REQ_F_REISSUE)
+	if (req->flags & REQ_F_REISSUE) {
+		iov_iter_restore(&io->iter, &io->iter_state);
 		return IOU_ISSUE_SKIP_COMPLETE;
That races with resubmission of the request; if that can happen from
io-wq, it would corrupt the iter. Nor do I believe that the fix this
patch is fixing is correct, see
https://lore.kernel.org/linux-block/Zh505790%2FoufXqMn@fedora/T/#mb24d3dca84eb2d83878ea218cb0efaae34c9f026
Jens, I'd suggest reverting "io_uring/rw: ensure retry condition
isn't lost". I don't think we can sanely reissue from the callback
unless there are better ownership rules over the kiocb and iter, e.g.
never touch the iter after calling the kiocb's callback.
It is a problem, but I don't believe it's a new one. If we revert the
existing fix, then we'll have to deal with the failure to end the IO due
to the (now) missing same thread group check, though. Which should be
My bad, I meant reverting the patch that removed thread group checks
together with its fixes.
doable, but would be nice to get this cleaned and cleared up once and
for all.
It's not like I'm in love with that chunk of code; if anything, the
group check was quite feeble, but replacing it with something
clean but buggy is questionable...
Do you think it was broken before? Because I don't see any simple
way to fix it without propagating reissue back to io_read/write.
--
Pavel Begunkov
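
For readers following the thread, the effect the patch description talks
about (the iter being advanced before the retry kicks in) can be sketched
in plain userspace C. This is only a toy model under stated assumptions:
toy_iter, toy_iter_state, toy_write() and toy_retry() are hypothetical
names made up for illustration, not the kernel's struct iov_iter or its
helpers, and the model does not cover the resubmission race discussed
above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an I/O iterator; NOT the kernel's iov_iter. */
struct toy_iter {
	const char *buf;	/* start of the data */
	size_t pos;		/* bytes consumed so far */
	size_t count;		/* bytes still to be submitted */
};

/* Hypothetical snapshot of the fields that an attempt may change. */
struct toy_iter_state {
	size_t pos;
	size_t count;
};

static void toy_iter_save(const struct toy_iter *it, struct toy_iter_state *st)
{
	st->pos = it->pos;
	st->count = it->count;
}

static void toy_iter_restore(struct toy_iter *it, const struct toy_iter_state *st)
{
	it->pos = st->pos;
	it->count = st->count;
}

/* Toy write: consumes at most 'limit' bytes and advances the iterator. */
static size_t toy_write(struct toy_iter *it, char *dst, size_t limit)
{
	size_t n = it->count < limit ? it->count : limit;

	memcpy(dst, it->buf + it->pos, n);
	it->pos += n;
	it->count -= n;
	return n;
}

/*
 * Issue one attempt that advances the iterator, then retry the request.
 * Returns the number of bytes the retry submits.
 */
static size_t toy_retry(const char *data, size_t len, size_t limit, int restore)
{
	char sink[64];
	struct toy_iter it = { data, 0, len };
	struct toy_iter_state st;

	assert(len <= sizeof(sink));
	toy_iter_save(&it, &st);		/* snapshot before issuing */
	toy_write(&it, sink, limit);		/* first attempt moves the iter */
	if (restore)
		toy_iter_restore(&it, &st);	/* rewind before the retry */
	return toy_write(&it, sink, len);	/* retry the request */
}
```

In this model, toy_retry() with restore set resubmits all len bytes,
while without the restore the retry only covers whatever the first
attempt left behind. Whether restoring at that point is safe in the
kernel depends on who owns the iter at the time, which is the ownership
question raised in the thread.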