On 5/5/22 7:48 AM, Ming Lei wrote:
> On Thu, May 05, 2022 at 06:52:25AM -0600, Jens Axboe wrote:
>> On 5/5/22 12:06 AM, Kanchan Joshi wrote:
>>> From: Jens Axboe <axboe@xxxxxxxxx>
>>>
>>> file_operations->uring_cmd is a file private handler.
>>> This is somewhat similar to ioctl but hopefully a lot more sane and
>>> useful as it can be used to enable many io_uring capabilities for the
>>> underlying operation.
>>>
>>> IORING_OP_URING_CMD is a file private kind of request. io_uring doesn't
>>> know what is in this command type, it's for the provider of ->uring_cmd()
>>> to deal with. This operation can be issued only on the ring that is
>>> setup with both IORING_SETUP_SQE128 and IORING_SETUP_CQE32 flags.
>>
>> One thing that occurred to me that I think we need to change is what you
>> mention above, code here:
>>
>>> +static int io_uring_cmd_prep(struct io_kiocb *req,
>>> +			     const struct io_uring_sqe *sqe)
>>> +{
>>> +	struct io_uring_cmd *ioucmd = &req->uring_cmd;
>>> +	struct io_ring_ctx *ctx = req->ctx;
>>> +
>>> +	if (ctx->flags & IORING_SETUP_IOPOLL)
>>> +		return -EOPNOTSUPP;
>>> +	/* do not support uring-cmd without big SQE/CQE */
>>> +	if (!(ctx->flags & IORING_SETUP_SQE128))
>>> +		return -EOPNOTSUPP;
>>> +	if (!(ctx->flags & IORING_SETUP_CQE32))
>>> +		return -EOPNOTSUPP;
>>> +	if (sqe->ioprio || sqe->rw_flags)
>>> +		return -EINVAL;
>>> +	ioucmd->cmd = sqe->cmd;
>>> +	ioucmd->cmd_op = READ_ONCE(sqe->cmd_op);
>>> +	return 0;
>>> +}
>>
>> I've been thinking of this mostly in the context of passthrough for
>> nvme, but it originally started as a generic feature to be able to wire
>> up anything for these types of commands. The SQE128/CQE32 requirement is
>> really an nvme passthrough restriction, we don't necessarily need this
>> for any kind of URING_CMD. Ditto IOPOLL as well. These are all things
>> that should be validated further down, but there's no way to do that
>> currently.
>>
>> Let's not have that hold up merging this, but we do need it fixed up for
>> 5.19-final so we don't have this restriction. Suggestions welcome...
>
> The validation has to be done in consumer of SQE128/CQE32(nvme). One
> way is to add SQE128/CQE32 io_uring_cmd_flags and pass them via
> ->uring_cmd(issue_flags).

Right, that's what I tried to say, it needs to be validated further down
as we can (and will) have URING_CMD users that don't care about any of
those 3 things and can work fine with whatever sqe/cqe size we have.
IOPOLL also only applies if the handler potentially can block.

Using the issue_flags makes sense to me, it's probably the easiest
approach. Doesn't take space in the command itself, and there's plenty
of room in that flag space to pass in the ring sqe/cqe/iopoll state.

-- 
Jens Axboe
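
For illustration only, here is a rough sketch of the issue_flags idea
discussed above, written as plain user-space C so it compiles on its own.
Every SKETCH_* constant and sketch_* function is made up for this example
and is not an existing kernel symbol, and the bit values are arbitrary.
The assumption is that the core only forwards the ring's sqe/cqe/iopoll
state, and that each ->uring_cmd() provider decides what it can live with.

```c
#include <errno.h>

/* Stand-ins for the ring setup flags (illustrative values, not the uapi ones). */
enum {
	SKETCH_SETUP_IOPOLL = 1U << 0,
	SKETCH_SETUP_SQE128 = 1U << 1,
	SKETCH_SETUP_CQE32  = 1U << 2,
};

/* Hypothetical issue_flags bits that carry the ring state to the provider. */
enum {
	SKETCH_F_SQE128 = 1U << 8,
	SKETCH_F_CQE32  = 1U << 9,
	SKETCH_F_IOPOLL = 1U << 10,
};

/* Core side: translate ring state into issue_flags, without validating it. */
static unsigned int sketch_ring_issue_flags(unsigned int ctx_flags)
{
	unsigned int issue_flags = 0;

	if (ctx_flags & SKETCH_SETUP_SQE128)
		issue_flags |= SKETCH_F_SQE128;
	if (ctx_flags & SKETCH_SETUP_CQE32)
		issue_flags |= SKETCH_F_CQE32;
	if (ctx_flags & SKETCH_SETUP_IOPOLL)
		issue_flags |= SKETCH_F_IOPOLL;
	return issue_flags;
}

/* Provider that needs big SQE/CQE and no IOPOLL (nvme passthrough stand-in). */
static int sketch_nvme_uring_cmd(unsigned int issue_flags)
{
	const unsigned int big = SKETCH_F_SQE128 | SKETCH_F_CQE32;

	if (issue_flags & SKETCH_F_IOPOLL)
		return -EOPNOTSUPP;
	if ((issue_flags & big) != big)
		return -EOPNOTSUPP;
	return 0;
}

/* Provider that works with any sqe/cqe size and never blocks. */
static int sketch_simple_uring_cmd(unsigned int issue_flags)
{
	(void)issue_flags;	/* nothing to validate for this handler */
	return 0;
}
```

With this split, the generic prep path would no longer need to reject
rings based on SQE128/CQE32/IOPOLL. Only the handler that actually
depends on them returns -EOPNOTSUPP.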