On Tue, Mar 08, 2022 at 08:50:53PM +0530, Kanchan Joshi wrote:
> +/*
> + * This overlays struct io_uring_cmd pdu.
> + * Expect build errors if this grows larger than that.
> + */
> +struct nvme_uring_cmd_pdu {
> +	u32 meta_len;
> +	union {
> +		struct bio *bio;
> +		struct request *req;
> +	};
> +	void *meta; /* kernel-resident buffer */
> +	void __user *meta_buffer;
> +} __packed;

Why is this marked __packed?

In general I'd be much happier if the meta elements were an
io_uring-level feature handled outside the driver and typesafe in
struct io_uring_cmd, with just a normal private data pointer for the
actual user, which would remove all the crazy casting.

> +static void nvme_end_async_pt(struct request *req, blk_status_t err)
> +{
> +	struct io_uring_cmd *ioucmd = req->end_io_data;
> +	struct nvme_uring_cmd_pdu *pdu = nvme_uring_cmd_pdu(ioucmd);
> +	/* extract bio before reusing the same field for request */
> +	struct bio *bio = pdu->bio;
> +
> +	pdu->req = req;
> +	req->bio = bio;
> +	/* this takes care of setting up task-work */
> +	io_uring_cmd_complete_in_task(ioucmd, nvme_pt_task_cb);

This is a bit silly.  First we defer the actual request I/O completion
from the block layer to a different CPU or softirq, and then we have
another callback here.  I think it would be much more useful if we
could find a way to enhance blk_mq_complete_request so that it could
directly complete in a given task.  That would also be really nice for,
say, normal synchronous direct I/O.

> +	if (ioucmd) { /* async dispatch */
> +		if (cmd->common.opcode == nvme_cmd_write ||
> +			cmd->common.opcode == nvme_cmd_read) {

No, we can't just check for specific commands in the passthrough
handler.

> +			nvme_setup_uring_cmd_data(req, ioucmd, meta, meta_buffer,
> +					meta_len);
> +			blk_execute_rq_nowait(req, 0, nvme_end_async_pt);
> +			return 0;
> +		} else {
> +			/* support only read and write for now. */
> +			ret = -EINVAL;
> +			goto out_meta;
> +		}

Please always handle errors in the first branch and don't bother with
an else after a goto or return.

> +static int nvme_ns_async_ioctl(struct nvme_ns *ns, struct io_uring_cmd *ioucmd)
> +{
> +	int ret;
> +
> +	BUILD_BUG_ON(sizeof(struct nvme_uring_cmd_pdu) > sizeof(ioucmd->pdu));
> +
> +	switch (ioucmd->cmd_op) {
> +	case NVME_IOCTL_IO64_CMD:
> +		ret = nvme_user_cmd64(ns->ctrl, ns, NULL, ioucmd);
> +		break;
> +	default:
> +		ret = -ENOTTY;
> +	}
> +
> +	if (ret >= 0)
> +		ret = -EIOCBQUEUED;

That's a weird way to handle the returns.  Just return -EIOCBQUEUED
directly from the handler (which, as said before, should be split from
the ioctl handler anyway).
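
To illustrate the last two control-flow comments above, here is a rough
userspace mock, not kernel code and not a proposed patch.  Everything
prefixed mock_ or MOCK_ is a hypothetical stand-in invented for the
sketch, and EIOCBQUEUED is defined locally only because that value is
kernel-internal and not exported to userspace.  It shows the shape only:
errors handled first with no else after a return, and the handler
returning -EIOCBQUEUED directly instead of rewriting ret afterwards.

```c
#include <errno.h>
#include <stddef.h>

/* Kernel-internal errno; defined here only so the mock builds in userspace. */
#ifndef EIOCBQUEUED
#define EIOCBQUEUED 529
#endif

/* Hypothetical opcode value, used only by this mock. */
#define MOCK_IO64_CMD 0x48

/* Hypothetical stand-in for struct io_uring_cmd; not the real layout. */
struct mock_uring_cmd {
	unsigned int cmd_op;
	int queued;	/* set once the mock request has been dispatched */
};

/* Hypothetical submit helper: 0 on successful dispatch, negative errno otherwise. */
static int mock_submit(struct mock_uring_cmd *ioucmd, const void *meta_buffer,
		       unsigned int meta_len)
{
	/* Error case first; no else is needed after the return. */
	if (meta_len && !meta_buffer)
		return -EINVAL;

	ioucmd->queued = 1;
	return 0;
}

/* Dedicated handler, separate from any ioctl path in this mock. */
static int mock_uring_cmd_handler(struct mock_uring_cmd *ioucmd,
				  const void *meta_buffer,
				  unsigned int meta_len)
{
	int ret;

	if (ioucmd->cmd_op != MOCK_IO64_CMD)
		return -ENOTTY;

	ret = mock_submit(ioucmd, meta_buffer, meta_len);
	if (ret)
		return ret;

	/* Dispatched: report that directly rather than patching up ret later. */
	return -EIOCBQUEUED;
}
```

With this shape the success path reads straight down the function, and
the only place that knows about -EIOCBQUEUED is the handler that
actually queued the command.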