On 12/16/21 6:06 AM, Max Gurtovoy wrote:
>
> On 12/16/2021 11:08 AM, Christoph Hellwig wrote:
>> On Wed, Dec 15, 2021 at 09:24:21AM -0700, Jens Axboe wrote:
>>> +	spin_lock(&nvmeq->sq_lock);
>>> +	while (!rq_list_empty(*rqlist)) {
>>> +		struct request *req = rq_list_pop(rqlist);
>>> +		struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
>>> +
>>> +		memcpy(nvmeq->sq_cmds + (nvmeq->sq_tail << nvmeq->sqes),
>>> +		       absolute_pointer(&iod->cmd), sizeof(iod->cmd));
>>> +		if (++nvmeq->sq_tail == nvmeq->q_depth)
>>> +			nvmeq->sq_tail = 0;
>>
>> So this doesn't even use the new helper added in patch 2? I think this
>> should call nvme_sq_copy_cmd().
>
> I also noticed that.
>
> So we need to decide whether to open-code it or use the helper function.
>
> An inline helper sounds reasonable if you have 3 places that will use it.

Yes, agreed, that's been my stance too :-)

>> The rest looks identical to the incremental patch I posted, so I guess
>> the performance degradation measured on the first try was a measurement
>> error?
>
> Giving 1 dbr for a batch of N commands sounds like a good idea. Also for
> the RDMA host.
>
> But how do you moderate it? What is the batch_sz <--> time_to_wait
> algorithm?

The batching is naturally limited at BLK_MAX_REQUEST_COUNT, which is 32
in total. I do agree that if we ever made it much larger, then we might
want to cap it differently. But 32 seems like a pretty reasonable number
to get enough gain from the batching done in various areas, while still
not making it so large that we have a potential latency issue. That batch
count is already used consistently for other items too (like tag
allocation), so it's not specific to just this one case.

-- 
Jens Axboe
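
For the record, here's roughly what I'd expect the submission path to look
like once it goes through the helper, with a single doorbell write for the
whole batch. Treat this as a sketch: the exact helper name and signature are
whatever patch 2 ends up with, and nvme_write_sq_db() is assumed to be the
existing doorbell helper in pci.c.

/*
 * Sketch of the assumed patch 2 helper: copy one command into the SQ
 * and advance the tail, without touching the doorbell.
 */
static inline void nvme_sq_copy_cmd(struct nvme_queue *nvmeq,
				    struct nvme_command *cmd)
{
	memcpy(nvmeq->sq_cmds + (nvmeq->sq_tail << nvmeq->sqes),
	       absolute_pointer(cmd), sizeof(*cmd));
	if (++nvmeq->sq_tail == nvmeq->q_depth)
		nvmeq->sq_tail = 0;
}

/*
 * Sketch of the batched submission: copy every command on the rqlist,
 * then ring the SQ doorbell once (1 dbr per batch of N commands).
 */
static void nvme_submit_cmds(struct nvme_queue *nvmeq, struct request **rqlist)
{
	spin_lock(&nvmeq->sq_lock);
	while (!rq_list_empty(*rqlist)) {
		struct request *req = rq_list_pop(rqlist);
		struct nvme_iod *iod = blk_mq_rq_to_pdu(req);

		nvme_sq_copy_cmd(nvmeq, &iod->cmd);
	}
	nvme_write_sq_db(nvmeq, true);
	spin_unlock(&nvmeq->sq_lock);
}

The names above aren't meant to be authoritative, the point is just that
each caller only does the copy and tail update, and the doorbell is written
once at the end of the batch rather than per command.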