On 5/18/21 10:55 AM, James Bottomley wrote:
> On Tue, 2021-05-18 at 10:44 -0700, Bart Van Assche wrote:
>> Hi Martin,
>>
>> This patch series implements the following two changes for all SCSI
>> drivers:
>> - Use blk_mq_rq_from_pdu() instead of the request member of struct
>>   scsi_cmnd since adding an offset to a pointer is faster than pointer
>>   indirection.
>
> Are there any performance results to back up this assertion? It's
> quite a lot of churn so it would be nice to know it's worth it.

I have not yet run any performance measurements because I expect that it
will be challenging to measure the performance impact of a change like
this one accurately. The performance measurement tool itself (e.g. fio)
might introduce more variation between runs than the performance
improvement of this patch series.

Another reason I have not yet run any performance measurements is because
I was assuming that everyone would be happy with a patch series that
makes code faster and that reduces the size of a key SCSI data structure.

Anyway, I have run 'make drivers/scsi/scsi_lib.lst' with and without
this patch series applied. What I see is that without this patch series
the assembly code for converting a SCSI command pointer into a request
pointer looks like this:

    48 8b bb 10 01 00 00    mov    0x110(%rbx),%rdi

With this patch series applied that conversion code changes into the
following:

    48 8d bb f0 fe ff ff    lea    -0x110(%rbx),%rdi

The above shows that struct request has a size of 0x110 = 272 bytes with
my kernel configuration. This illustrates that this patch series
realizes an improvement since "mov" instructions used for converting
SCSI command pointers into struct request pointers are converted into
"lea" instructions. "mov" fetches data from memory while "lea" does not.

Bart.
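
For readers who want to reproduce the mov versus lea difference outside
the kernel, the following is a minimal userspace sketch and not the
kernel code itself. All demo_* names are hypothetical stand-ins:
demo_request plays the role of struct request (padded to 0x110 bytes to
match the listing above), demo_cmnd plays the role of struct scsi_cmnd,
and demo_rq_from_pdu() mirrors what blk_mq_rq_from_pdu() does, namely
subtracting sizeof(struct request) from the payload pointer.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct request; 0x110 bytes as in the listing. */
struct demo_request {
	char opaque[0x110];
};

/* Hypothetical stand-in for struct scsi_cmnd, stored right after the request. */
struct demo_cmnd {
	struct demo_request *request;	/* back pointer the series makes redundant */
	int tag;
};

/* The payload ("pdu") starts immediately after the request. */
static void *demo_rq_to_pdu(struct demo_request *rq)
{
	return rq + 1;
}

/* Pure pointer arithmetic: no memory access, typically compiles to lea. */
static struct demo_request *demo_rq_from_pdu(void *pdu)
{
	return (struct demo_request *)((char *)pdu - sizeof(struct demo_request));
}

/* Old style: load the back pointer from memory, typically compiles to mov. */
static struct demo_request *demo_via_member(const struct demo_cmnd *cmd)
{
	return cmd->request;
}

/* New style: derive the request address from the command address. */
static struct demo_request *demo_via_offset(struct demo_cmnd *cmd)
{
	return demo_rq_from_pdu(cmd);
}

/* Returns 1 if both conversions yield the same request pointer, 0 otherwise. */
static int demo_roundtrip_ok(void)
{
	struct demo_request *rq;
	struct demo_cmnd *cmd;
	int ok;

	/* One allocation holds the request followed by its payload. */
	rq = calloc(1, sizeof(*rq) + sizeof(*cmd));
	if (!rq)
		return 0;
	cmd = demo_rq_to_pdu(rq);
	cmd->request = rq;
	ok = demo_via_member(cmd) == rq && demo_via_offset(cmd) == rq;
	free(rq);
	return ok;
}
```

Compiling this with optimization enabled and inspecting the output of
'objdump -d' should show a load for demo_via_member() and an address
computation for demo_via_offset(), although the exact instructions
depend on the compiler and its flags.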