On Tue, Oct 12, 2021 at 12:13:52PM -0600, Jens Axboe wrote:
> This memset in the fast path costs a lot of cycles on my setup. Here's a
> top-of-profile of doing ~6.7M IOPS:
>
> +    5.90%  io_uring  [nvme]            [k] nvme_queue_rq
> +    5.32%  io_uring  [nvme_core]       [k] nvme_setup_cmd
> +    5.17%  io_uring  [kernel.vmlinux]  [k] io_submit_sqes
> +    4.97%  io_uring  [kernel.vmlinux]  [k] blkdev_direct_IO
>
> and a perf diff with this patch:
>
>      0.92%     +4.40%  [nvme_core]       [k] nvme_setup_cmd
>
> reducing it from 5.3% to only 0.9%. This takes it from the 2nd most
> cycle consumer to something that's mostly irrelevant.
>
> Retain the full clear for the other commands to avoid doing any audits
> there, and just clear the fields in the rw command manually that we
> don't already fill.

Oo, we knew about this optimization *years* ago, yet didn't do anything
about it! Better late than never.

http://lists.infradead.org/pipermail/linux-nvme/2014-May/000837.html

Acked-by: Keith Busch <kbusch@xxxxxxxxxx>
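
For anyone reading the archive without the patch at hand, here is a rough
userspace sketch of the idea in the quoted commit message: skip the
whole-struct memset on the read/write path and instead assign every field,
writing zero to the ones that would otherwise be left unfilled. The struct
layout, field names and function names below are hypothetical and simplified
for illustration; this is not the kernel's struct nvme_rw_command and not the
code from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical, simplified command layout (40 bytes, no padding).
 * It only loosely resembles an NVMe read/write command.
 */
struct rw_cmd {
	uint8_t  opcode;
	uint8_t  flags;
	uint16_t command_id;
	uint32_t nsid;
	uint64_t rsvd2;
	uint64_t metadata;
	uint64_t slba;
	uint16_t length;
	uint16_t control;
	uint32_t dsmgmt;
};

/* Baseline: clear the whole command, then fill in the fields we need. */
static void setup_rw_memset(struct rw_cmd *cmd, uint8_t opcode,
			    uint32_t nsid, uint64_t slba, uint16_t nblocks)
{
	assert(cmd != NULL && nblocks > 0);
	memset(cmd, 0, sizeof(*cmd));
	cmd->opcode = opcode;
	cmd->nsid = nsid;
	cmd->slba = slba;
	cmd->length = (uint16_t)(nblocks - 1);	/* zero-based block count */
}

/*
 * Alternative: no memset. Every field is written exactly once, and the
 * fields the caller does not supply are zeroed by hand. This is only
 * correct as long as no field of the struct is forgotten.
 */
static void setup_rw_fields(struct rw_cmd *cmd, uint8_t opcode,
			    uint32_t nsid, uint64_t slba, uint16_t nblocks)
{
	assert(cmd != NULL && nblocks > 0);
	cmd->opcode = opcode;
	cmd->flags = 0;
	cmd->command_id = 0;
	cmd->nsid = nsid;
	cmd->rsvd2 = 0;
	cmd->metadata = 0;
	cmd->slba = slba;
	cmd->length = (uint16_t)(nblocks - 1);
	cmd->control = 0;
	cmd->dsmgmt = 0;
}
```

Both helpers should leave the command in the same state even when the buffer
held stale data beforehand. The trade-off is the one the commit message
mentions: the field-by-field version needs an audit of every field, which is
presumably why the full clear is kept for the other command types.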