On 12/16/21 2:08 AM, Christoph Hellwig wrote:
> On Wed, Dec 15, 2021 at 09:24:21AM -0700, Jens Axboe wrote:
>> +	spin_lock(&nvmeq->sq_lock);
>> +	while (!rq_list_empty(*rqlist)) {
>> +		struct request *req = rq_list_pop(rqlist);
>> +		struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
>> +
>> +		memcpy(nvmeq->sq_cmds + (nvmeq->sq_tail << nvmeq->sqes),
>> +				absolute_pointer(&iod->cmd), sizeof(iod->cmd));
>> +		if (++nvmeq->sq_tail == nvmeq->q_depth)
>> +			nvmeq->sq_tail = 0;
>
> So this doesn't even use the new helper added in patch 2? I think this
> should call nvme_sq_copy_cmd().

But you NAK'ed that one? It definitely should use that helper, so I take
it you're fine with it if we use it here too? That would make three call
sites, and I still think the helper makes sense...

> The rest looks identical to the incremental patch I posted, so I guess
> the performance degradation measured on the first try was a measurement
> error?

It may have been a measurement error; I'm honestly not quite sure. I
reshuffled and modified a few bits here and there, and verified the end
result. Wish I had a better answer, but...

--
Jens Axboe