On 2/17/25 15:37, Jens Axboe wrote:
On 2/17/25 7:08 AM, Pavel Begunkov wrote:
On 2/17/25 13:58, Jens Axboe wrote:
On 2/17/25 6:37 AM, Pavel Begunkov wrote:
The kiocb semantics of ki_complete == NULL -> sync kiocb is also odd,
That's what is_sync_kiocb() does. It would be cleaner to use
init_sync_kiocb(), but there is a larger chance of doing something
wrong as it reinitialises the kiocb entirely.
Sorry if that wasn't clear; yeah, I do realize this is what
is_sync_kiocb() checks. I do agree that manually clearing it is saner.
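For reference, both helpers are tiny; roughly, from my reading of
include/linux/fs.h (paraphrased, not copied verbatim):

    static inline bool is_sync_kiocb(struct kiocb *kiocb)
    {
            return kiocb->ki_complete == NULL;
    }

    static inline void init_sync_kiocb(struct kiocb *kiocb, struct file *filp)
    {
            /* rewrites the whole kiocb, which is why reusing it here is risky */
            *kiocb = (struct kiocb) {
                    .ki_filp   = filp,
                    .ki_flags  = filp->f_iocb_flags,
                    .ki_ioprio = get_current_ioprio(),
            };
    }

So ki_complete == NULL is the canonical "this is a sync kiocb" marker,
and clearing just that one field is the minimal way to get there.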
but probably fine for this case as read mshot strictly deals with
pollable files. Otherwise you'd just be blocking off this issue,
regardless of whether or not IOCB_NOWAIT is set.
In any case, it'd be much nicer to contain this in io_read_mshot(),
where it arguably belongs, rather than sprinkle it in __io_read().
Possible?
That's what I tried first, but __io_read() -> io_rw_init_file()
reinitialises it, so I don't see any good way without some
broader refactoring.
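To spell out the ordering problem: __io_read() calls io_rw_init_file(),
which currently does, abridged from the hunk further down:

    if (ctx->flags & IORING_SETUP_IOPOLL) {
            ...
            kiocb->ki_complete = io_complete_rw_iopoll;
    } else {
            ...
            kiocb->ki_complete = io_complete_rw;
    }

so whatever ki_complete value io_read_mshot() would set up before
calling __io_read() gets overwritten again.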
Would it be bad? The only reason the kiocb init part is in there is
the ->iopoll() check, and that could still be there with the rest of
the init going into normal prep (as it arguably should).
Something like the below, totally untested...
fwiw, it turns out we can't move all of it like in the diff below, as
the ki_flags setup and checks depend on having the file set.
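To make that concrete, io_rw_init_file() seeds the flags from the file,
roughly like this (sketch from memory of io_uring/rw.c, so double-check
the details):

    /* the file is only guaranteed to be assigned at issue time,
     * e.g. fixed files are resolved then, not at prep */
    kiocb->ki_flags = file->f_iocb_flags;   /* IOCB_DIRECT comes from O_DIRECT here */
    ret = kiocb_set_rw_flags(kiocb, rw->flags, rw_type);

With ki_flags zeroed at prep time, the IOCB_DIRECT test in the diff
below has nothing to look at and would always fail for IOPOLL rings.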
diff --git a/io_uring/rw.c b/io_uring/rw.c
index 16f12f94943f..f8dd9a9fe9ca 100644
--- a/io_uring/rw.c
+++ b/io_uring/rw.c
@@ -264,6 +264,9 @@ static int io_prep_rw_pi(struct io_kiocb *req, struct io_rw *rw, int ddir,
return ret;
}
+static void io_complete_rw(struct kiocb *kiocb, long res);
+static void io_complete_rw_iopoll(struct kiocb *kiocb, long res);
+
static int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe,
int ddir, bool do_import)
{
@@ -288,6 +291,18 @@ static int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe,
}
rw->kiocb.dio_complete = NULL;
rw->kiocb.ki_flags = 0;
+ if (req->ctx->flags & IORING_SETUP_IOPOLL) {
+ if (!(rw->kiocb.ki_flags & IOCB_DIRECT))
+ return -EOPNOTSUPP;
+
+ rw->kiocb.private = NULL;
+ rw->kiocb.ki_flags |= IOCB_HIPRI;
+ rw->kiocb.ki_complete = io_complete_rw_iopoll;
+ } else {
+ if (rw->kiocb.ki_flags & IOCB_HIPRI)
+ return -EINVAL;
+ rw->kiocb.ki_complete = io_complete_rw;
+ }
rw->addr = READ_ONCE(sqe->addr);
rw->len = READ_ONCE(sqe->len);
@@ -810,23 +825,15 @@ static int io_rw_init_file(struct io_kiocb *req, fmode_t mode, int rw_type)
((file->f_flags & O_NONBLOCK && !(req->flags & REQ_F_SUPPORT_NOWAIT))))
req->flags |= REQ_F_NOWAIT;
- if (ctx->flags & IORING_SETUP_IOPOLL) {
- if (!(kiocb->ki_flags & IOCB_DIRECT) || !file->f_op->iopoll)
+ if (kiocb->ki_flags & IOCB_HIPRI) {
+ if (!file->f_op->iopoll)
return -EOPNOTSUPP;
-
- kiocb->private = NULL;
- kiocb->ki_flags |= IOCB_HIPRI;
- kiocb->ki_complete = io_complete_rw_iopoll;
req->iopoll_completed = 0;
if (ctx->flags & IORING_SETUP_HYBRID_IOPOLL) {
/* make sure every req only blocks once*/
req->flags &= ~REQ_F_IOPOLL_STATE;
req->iopoll_start = ktime_get_ns();
}
- } else {
- if (kiocb->ki_flags & IOCB_HIPRI)
- return -EINVAL;
- kiocb->ki_complete = io_complete_rw;
}
if (req->flags & REQ_F_HAS_METADATA) {
--
Pavel Begunkov