On 3/16/24 16:51, Jens Axboe wrote:
On 3/16/24 10:46 AM, Pavel Begunkov wrote:
On 3/16/24 16:42, Jens Axboe wrote:
On 3/16/24 10:36 AM, Pavel Begunkov wrote:
On 3/16/24 16:36, Jens Axboe wrote:
On 3/16/24 10:32 AM, Pavel Begunkov wrote:
On 3/16/24 16:31, Jens Axboe wrote:
On 3/16/24 10:28 AM, Pavel Begunkov wrote:
On 3/16/24 16:14, Jens Axboe wrote:
On 3/15/24 5:28 PM, Pavel Begunkov wrote:
On 3/15/24 23:25, Jens Axboe wrote:
On 3/15/24 5:19 PM, Pavel Begunkov wrote:
On 3/15/24 23:13, Pavel Begunkov wrote:
On 3/15/24 23:09, Pavel Begunkov wrote:
On 3/15/24 22:48, Jens Axboe wrote:
If we get a request with IOSQE_ASYNC set, then we first run the prep
async handlers. But if we then fail setting it up and want to post
a CQE with -EINVAL, we use ->done_io. This was previously guarded with
REQ_F_PARTIAL_IO, and the normal setup handlers do set it up before any
potential errors, but we need to cover the async setup too.
You can hit io_req_defer_failed() { opdef->fail(); }
off of an early submission failure path where def->prep has
not yet been called, I don't think the patch will fix the
problem.
->fail() handlers are fragile, maybe we should skip them
if def->prep() wasn't called. Not even compile tested:
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 846d67a9c72e..56eed1490571 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
[...]
def->fail(req);
io_req_complete_defer(req);
}
@@ -2201,8 +2201,7 @@ static int io_init_req(struct io_ring_ctx *ctx, struct io_kiocb *req,
}
req->flags |= REQ_F_CREDS;
}
-
- return def->prep(req, sqe);
+ return 0;
}
static __cold int io_submit_fail_init(const struct io_uring_sqe *sqe,
@@ -2250,8 +2249,15 @@ static inline int io_submit_sqe(struct io_ring_ctx *ctx, struct io_kiocb *req,
int ret;
ret = io_init_req(ctx, req, sqe);
- if (unlikely(ret))
+ if (unlikely(ret)) {
+fail:
Obvious the diff is crap, but still bugging me enough to write
that the label should've been one line below, otherwise we'd
flag after ->prep as well.
It certainly needs testing :-)
We can go either way - patch up the net thing, or do a proper EARLY_FAIL
and hopefully not have to worry about it again. Do you want to clean it
up, test it, and send it out?
I'd rather leave it to you, I suspect it wouldn't fix the syzbot
report w/o fiddling with done_io as in your patch.
I gave this a shot, but some fail handlers do want to get called. But
Which one and/or which part of it?
send zc
I don't think so. If prep wasn't called there wouldn't be
a notif allocated, and so no F_MORE required. If you take
at the code path it's under REQ_F_NEED_CLEANUP, which is only
set by opcode handlers
I'm not making this up, your test case will literally fail as it doesn't
get to flag MORE for that case. FWIW, this was done with EARLY_FAIL
being flagged, and failing if we fail during or before prep.
Maybe the test is too strict, but your approach is different
from what I mentioned yesterday
- return def->prep(req, sqe);
+ ret = def->prep(req, sqe);
+ if (unlikely(ret)) {
+ req->flags |= REQ_F_EARLY_FAIL;
+ return ret;
+ }
+
+ return 0;
It should only set REQ_F_EARLY_FAIL if we fail
_before_ prep is called
I did try both ways, fails if we just have:
Ok, but the point is that the sendzc's ->fail doesn't
need to be called unless you've done ->prep first.
But it fails, not sure how else to say it.
liburing tests? Which test case? If so, it should be another
bug. REQ_F_NEED_CLEANUP is only set by opcodes, if a request is
terminated before ->prep is called, it means it never entered
any of the opdef callbacks and have never seen any of net.c
code, so there should be no REQ_F_NEED_CLEANUP, and so
io_sendrecv_fail() wouldn't try to set F_MORE. I don't know
what's wrong.
FWIW, the current io_uring-6.9 branch has two patches on top, looks fine
for me so far. We'll see if syzbot agrees. I'll send them out later
today, unless I change my mind and try a different approach.
--
Pavel Begunkov