On Mon, Mar 14, 2022 at 09:53:56PM +0530, Kanchan Joshi wrote: >>> +struct nvme_uring_cmd_pdu { >>> + u32 meta_len; >>> + union { >>> + struct bio *bio; >>> + struct request *req; >>> + }; >>> + void *meta; /* kernel-resident buffer */ >>> + void __user *meta_buffer; >>> +} __packed; >> >> Why is this marked __packed? > Did not like doing it, but had to. > If not packed, this takes 32 bytes of space. While driver-pdu in struct > io_uring_cmd can take max 30 bytes. Packing nvme-pdu brought it down to > 28 bytes, which fits and gives 2 bytes back. What if you move meta_len to the end? Even if we need the __packed that will avoid all the unaligned access to pointers, which on some architectures will crash the kernel. > And on moving meta elements outside the driver, my worry is that it > reduces scope of uring-cmd infra and makes it nvme passthru specific. > At this point uring-cmd is still generic async ioctl/fsctl facility > which may find other users (than nvme-passthru) down the line. Organization > of fields within "struct io_uring_cmd" is around the rule > that a field is kept out (of 28 bytes pdu) only if is accessed by both > io_uring and driver. We have plenty of other interfaces of that kind. Sockets are one case already, and regular read/write with protection information will be another one. So having some core infrastrucure for "secondary data" seems very useful. > I see, so there is room for adding some efficiency. > Hope it will be ok if I carry this out as a separate effort. > Since this is about touching blk_mq_complete_request at its heart, and > improving sync-direct-io, this does not seem best-fit and slow this > series down. I really rather to this properly. Especially as the current effort adds new exported interfaces. > Deferring by ipi or softirq never occured. Neither for block nor for > char. Softirq is obvious since I was not running against scsi (or nvme with > single queue). I could not spot whether this is really a overhead, at > least for nvme. This tends to kick in if you have less queues than cpu cores. Quite command with either a high core cound or a not very high end nvme controller.