On 5/10/22 8:23 AM, Kanchan Joshi wrote: > On Thu, May 5, 2022 at 9:47 PM Jens Axboe <axboe@xxxxxxxxx> wrote: >> >> On 5/5/22 12:06 AM, Kanchan Joshi wrote: >>> +static int io_uring_cmd_prep(struct io_kiocb *req, >>> + const struct io_uring_sqe *sqe) >>> +{ >>> + struct io_uring_cmd *ioucmd = &req->uring_cmd; >>> + struct io_ring_ctx *ctx = req->ctx; >>> + >>> + if (ctx->flags & IORING_SETUP_IOPOLL) >>> + return -EOPNOTSUPP; >>> + /* do not support uring-cmd without big SQE/CQE */ >>> + if (!(ctx->flags & IORING_SETUP_SQE128)) >>> + return -EOPNOTSUPP; >>> + if (!(ctx->flags & IORING_SETUP_CQE32)) >>> + return -EOPNOTSUPP; >>> + if (sqe->ioprio || sqe->rw_flags) >>> + return -EINVAL; >>> + ioucmd->cmd = sqe->cmd; >>> + ioucmd->cmd_op = READ_ONCE(sqe->cmd_op); >>> + return 0; >>> +} >> >> While looking at the other suggested changes, I noticed a more >> fundamental issue with the passthrough support. For any other command, >> SQE contents are stable once prep has been done. The above does do that >> for the basic items, but this case is special as the lower level command >> itself resides in the SQE. >> >> For cases where the command needs deferral, it's problematic. There are >> two main cases where this can happen: >> >> - The issue attempt yields -EAGAIN (we ran out of requests, etc). If you >> look at other commands, if they have data that doesn't fit in the >> io_kiocb itself, then they need to allocate room for that data and have >> it be persistent > > While we have io-wq retrying for this case, async_data is not allocated. > We need to do that explicitly inside io_uring_cmd(). Something like this - > > if (ret == -EAGAIN) { > if (!req_has_async_data(req)) { > if (io_alloc_async_data(req)) return -ENOMEM; > io_uring_cmd_prep_async(req); > } > return ret; > } > >> - Deferral is specified by the application, using eg IOSQE_IO_LINK or >> IOSQE_ASYNC. > For this to work, we are missing ".needs_async_setup = 1" for > IORING_OP_URING_CMD. Agree on both, the op handler itself should alloc the async_data for this case and that flag does need to be set. -- Jens Axboe