On 3/16/24 11:42 AM, Pavel Begunkov wrote: > On 3/16/24 17:01, Jens Axboe wrote: >> On 3/16/24 10:57 AM, Pavel Begunkov wrote: >>> On 3/16/24 16:51, Jens Axboe wrote: >>>> On 3/16/24 10:46 AM, Pavel Begunkov wrote: >>>>> On 3/16/24 16:42, Jens Axboe wrote: >>>>>> On 3/16/24 10:36 AM, Pavel Begunkov wrote: >>>>>>> On 3/16/24 16:36, Jens Axboe wrote: >>>>>>>> On 3/16/24 10:32 AM, Pavel Begunkov wrote: >>>>>>>>> On 3/16/24 16:31, Jens Axboe wrote: >>>>>>>>>> On 3/16/24 10:28 AM, Pavel Begunkov wrote: >>>>>>>>>>> On 3/16/24 16:14, Jens Axboe wrote: >>>>>>>>>>>> On 3/15/24 5:28 PM, Pavel Begunkov wrote: >>>>>>>>>>>>> On 3/15/24 23:25, Jens Axboe wrote: >>>>>>>>>>>>>> On 3/15/24 5:19 PM, Pavel Begunkov wrote: >>>>>>>>>>>>>>> On 3/15/24 23:13, Pavel Begunkov wrote: >>>>>>>>>>>>>>>> On 3/15/24 23:09, Pavel Begunkov wrote: >>>>>>>>>>>>>>>>> On 3/15/24 22:48, Jens Axboe wrote: >>>>>>>>>>>>>>>>>> If we get a request with IOSQE_ASYNC set, then we first run the prep >>>>>>>>>>>>>>>>>> async handlers. But if we then fail setting it up and want to post >>>>>>>>>>>>>>>>>> a CQE with -EINVAL, we use ->done_io. This was previously guarded with >>>>>>>>>>>>>>>>>> REQ_F_PARTIAL_IO, and the normal setup handlers do set it up before any >>>>>>>>>>>>>>>>>> potential errors, but we need to cover the async setup too. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> You can hit io_req_defer_failed() { opdef->fail(); } >>>>>>>>>>>>>>>>> off of an early submission failure path where def->prep has >>>>>>>>>>>>>>>>> not yet been called, I don't think the patch will fix the >>>>>>>>>>>>>>>>> problem. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> ->fail() handlers are fragile, maybe we should skip them >>>>>>>>>>>>>>>>> if def->prep() wasn't called. Not even compile tested: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c >>>>>>>>>>>>>>>>> index 846d67a9c72e..56eed1490571 100644 >>>>>>>>>>>>>>>>> --- a/io_uring/io_uring.c >>>>>>>>>>>>>>>>> +++ b/io_uring/io_uring.c >>>>>>>>>>>>>>> [...] >>>>>>>>>>>>>>>>> def->fail(req); >>>>>>>>>>>>>>>>> io_req_complete_defer(req); >>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>> @@ -2201,8 +2201,7 @@ static int io_init_req(struct io_ring_ctx *ctx, struct io_kiocb *req, >>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>> req->flags |= REQ_F_CREDS; >>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>> - >>>>>>>>>>>>>>>>> - return def->prep(req, sqe); >>>>>>>>>>>>>>>>> + return 0; >>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> static __cold int io_submit_fail_init(const struct io_uring_sqe *sqe, >>>>>>>>>>>>>>>>> @@ -2250,8 +2249,15 @@ static inline int io_submit_sqe(struct io_ring_ctx *ctx, struct io_kiocb *req, >>>>>>>>>>>>>>>>> int ret; >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> ret = io_init_req(ctx, req, sqe); >>>>>>>>>>>>>>>>> - if (unlikely(ret)) >>>>>>>>>>>>>>>>> + if (unlikely(ret)) { >>>>>>>>>>>>>>>>> +fail: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Obvious the diff is crap, but still bugging me enough to write >>>>>>>>>>>>>>> that the label should've been one line below, otherwise we'd >>>>>>>>>>>>>>> flag after ->prep as well. >>>>>>>>>>>>>> >>>>>>>>>>>>>> It certainly needs testing :-) >>>>>>>>>>>>>> >>>>>>>>>>>>>> We can go either way - patch up the net thing, or do a proper EARLY_FAIL >>>>>>>>>>>>>> and hopefully not have to worry about it again. Do you want to clean it >>>>>>>>>>>>>> up, test it, and send it out? >>>>>>>>>>>>> >>>>>>>>>>>>> I'd rather leave it to you, I suspect it wouldn't fix the syzbot >>>>>>>>>>>>> report w/o fiddling with done_io as in your patch. >>>>>>>>>>>> >>>>>>>>>>>> I gave this a shot, but some fail handlers do want to get called. But >>>>>>>>>>> >>>>>>>>>>> Which one and/or which part of it? >>>>>>>>>> >>>>>>>>>> send zc >>>>>>>>> >>>>>>>>> I don't think so. If prep wasn't called there wouldn't be >>>>>>>>> a notif allocated, and so no F_MORE required. If you take >>>>>>>>> at the code path it's under REQ_F_NEED_CLEANUP, which is only >>>>>>>>> set by opcode handlers >>>>>>>> >>>>>>>> I'm not making this up, your test case will literally fail as it doesn't >>>>>>>> get to flag MORE for that case. FWIW, this was done with EARLY_FAIL >>>>>>>> being flagged, and failing if we fail during or before prep. >>>>>>> >>>>>>> Maybe the test is too strict, but your approach is different >>>>>>> from what I mentioned yesterday >>>>>>> >>>>>>> - return def->prep(req, sqe); >>>>>>> + ret = def->prep(req, sqe); >>>>>>> + if (unlikely(ret)) { >>>>>>> + req->flags |= REQ_F_EARLY_FAIL; >>>>>>> + return ret; >>>>>>> + } >>>>>>> + >>>>>>> + return 0; >>>>>>> >>>>>>> It should only set REQ_F_EARLY_FAIL if we fail >>>>>>> _before_ prep is called >>>>>> >>>>>> I did try both ways, fails if we just have: >>>>> >>>>> Ok, but the point is that the sendzc's ->fail doesn't >>>>> need to be called unless you've done ->prep first. >>>> >>>> But it fails, not sure how else to say it. >>> >>> liburing tests? Which test case? If so, it should be another >> >> Like I mentioned earlier, it's send zc and it's failing the test case >> for that. test/send-zerocopy.t. >> >>> bug. REQ_F_NEED_CLEANUP is only set by opcodes, if a request is >>> terminated before ->prep is called, it means it never entered >>> any of the opdef callbacks and have never seen any of net.c >>> code, so there should be no REQ_F_NEED_CLEANUP, and so >>> io_sendrecv_fail() wouldn't try to set F_MORE. I don't know >>> what's wrong. >> >> Feel free to take a look! I do like the simplicity of the early error >> flag. > > ./send-zerocopy.t works fine Huh, I wonder what I messed up. But: > @@ -2250,7 +2249,13 @@ static inline int io_submit_sqe(struct io_ring_ctx *ctx, struct io_kiocb *req, > int ret; > > ret = io_init_req(ctx, req, sqe); > - if (unlikely(ret)) > + if (unlikely(ret)) { > + req->flags |= REQ_F_UNPREPPED_FAIL; > + return io_submit_fail_init(sqe, req, ret); > + } > + > + ret = def->prep(req, sqe); > + if (ret) > return io_submit_fail_init(sqe, req, ret); this obviously won't compile, assuming this is not the one you ran. In any case, I do like the one I sent out for review. It moves all the slow path out of line and shrinks things nicely too. And clearing the cmd.data area seems like a good idea for that case. -- Jens Axboe