On Thu, 2019-02-21 at 16:53 +0800, Jason Yan wrote:
> On 2019/2/20 23:18, Christoph Hellwig wrote:
> > [fullquote removed, please follow proper mail etiquette]
> >
> > On Tue, Feb 19, 2019 at 08:56:28AM -0800, Bart Van Assche wrote:
> > > regression in the SCSI sd driver due to the switch from the legacy block
> > > layer to scsi-mq. The above patch introduces two atomic operations in the
> > > hot path and hence would introduce a performance regression. I think this
> > > can be avoided by making sure that sd_uninit_command() gets called before
> > > the request tag is freed. What changes would be required to make the block
> > > layer core call sd_uninit_command() before the request tag is freed? Would
> > > introducing prep_rq_fn and unprep_rq_fn callbacks in struct blk_mq_ops and
> > > making sure that the SCSI core sets these callback function pointers
> > > appropriately be sufficient? Would such a change allow to simplify the NVMe
> > > initiator driver? Are there any alternatives to this approach that are more
> > > elegant?
> >
> > Additional indirect calls in the I/O fast path is something I'd rather
> > avoid. But I don't fully understand the problem yet - where do
> > we release a disk reference from blk_update_request?
>
> When userspace close the fd after blk_update_request() and before
> scsi_mq_uninit_cmd(), a disk reference will be released. It is not the
> blk_update_request() directly released it.
>
> close
> ->sd_release
> ->scsi_disk_put
> ->scsi_disk_release
> ->disk->private_data = NULL;
>
> The userspace can close the fd because blk_update_request() returned the
> last IO , the userspace application does not have to stuck on read() or
> write(). The window is very small, but it can be reproduce every day
> in our testcases. So I'm very curious why. One possible explanation is
> that we enabled kernel preempt(CONFIG_PREEMPT).
> > And why can't we move that release to __blk_mq_end_request?

Hi Jason,

What is the current status of this issue?

Thanks,

Bart.