On Thu, Mar 10, 2022 at 2:06 PM Christoph Hellwig <hch@xxxxxx> wrote: > > On Tue, Mar 08, 2022 at 08:51:05PM +0530, Kanchan Joshi wrote: > > From: Anuj Gupta <anuj20.g@xxxxxxxxxxx> > > > > On submission,just fetch the commmand from userspace pointer and reuse > > everything else. On completion, update the result field inside the > > passthru command. > > What is that supposed to mean? What is the reason to do it. Remember > to always document the why in commit logs. I covered some of it in patch 6, but yes I need to expand the reasoning here. Felt that retrospectively too. So there are two ways/modes of submitting commands: Mode 1: inline into sqe. This is the default way when passthru command is placed inside a big sqe which has 80 bytes of space. The only problem is - passthru command has this 'result' field (structure below for quick reference) which is statically embedded and not a pointer (like addr and metadata field). struct nvme_passthru_cmd64 { __u8 opcode; __u8 flags; __u16 rsvd1; __u32 nsid; __u32 cdw2; __u32 cdw3; __u64 metadata; __u64 addr; __u32 metadata_len; __u32 data_len; __u32 cdw10; __u32 cdw11; __u32 cdw12; __u32 cdw13; __u32 cdw14; __u32 cdw15; __u32 timeout_ms; __u32 rsvd2; __u64 result; }; In sync ioctl, we always update this result field by doing put_user on completion. For async ioctl, since command is inside the the sqe, its lifetime is only upto submission. SQE may get reused post submission, leaving no way to update the "result" field on completion. Had this field been a pointer, we could have saved this on submission and updated on completion. But that would require redesigning this structure and adding newer ioctl in nvme. Coming back, even though sync-ioctl alway updates this result to user-space, only a few nvme io commands (e.g. zone-append, copy, zone-mgmt-send) can return this additional result (spec-wise). Therefore in nvme, when we are dealing with inline-sqe commands from io_uring, we never attempt to update the result. And since we don't update the result, we limit support to only read/write passthru commands. And fail any other command during submission itself (Patch 2). Mode 2: Non-inline/indirect (pointer of command into sqe) submission. User-space places a pointer of passthru command, and a flag in io_uring saying that this is not inline. For this, in nvme (this patch) we always update the 'result' on completion and therefore can support all passthru commands. Hope this makes the reasoning clear? Plumbing wise, non-inline support does not create churn (almost all the infra of inline-command handling is reused). Extra is copy_from_user , and put_user. > > +static inline bool is_inline_rw(struct io_uring_cmd *ioucmd, struct nvme_command *cmd) > > Overly long line. Under 100, but sure, can fold it under 80. -- Kanchan