On Tue, 2019-04-02 at 19:05 +0800, Ming Lei wrote:
> On Tue, Apr 02, 2019 at 04:07:04PM +0800, jianchao.wang wrote:
> > percpu_ref is born for fast path.
> > There are some drivers use it in completion path, such as scsi, does it really
> > matter for this kind of device ? If yes, I guess we should remove blk_mq_run_hw_queues
> > which is the really bulk and depend on hctx restart mechanism.
>
> Yes, it is designed for fast path, but it doesn't mean percpu_ref
> hasn't any cost. blk_mq_run_hw_queues() is called for all blk-mq devices,
> includes the fast NVMe.

I think the overhead of adding a percpu_ref_get/put pair is acceptable for
SCSI drivers. The NVMe driver doesn't call blk_mq_run_hw_queues() directly.
Additionally, I don't think that any of the blk_mq_run_hw_queues() calls from
the block layer matter for the fast path code in the NVMe driver. In other
words, adding a percpu_ref_get/put pair in blk_mq_run_hw_queues() shouldn't
affect the performance of the NVMe driver.

> Also:
>
> It may not be enough to just grab the percpu_ref for blk_mq_run_hw_queues
> only, given the idea is to use the percpu_ref to protect hctx's resources.
>
> There are lots of uses on 'hctx', such as other exported blk-mq APIs.
> If this approach were chosen, we may have to audit other blk-mq APIs,
> cause they might be called after queue is frozen too.

The only blk_mq_hw_ctx user I have found so far that needs additional
protection is the q->mq_ops->poll() call in blk_poll(). However, that is not
a new issue. Functions like nvme_poll() access data structures (NVMe
completion queue) that shouldn't be accessed while blk_cleanup_queue() is in
progress. If blk_poll() is modified such that it becomes safe to call that
function while blk_cleanup_queue() is in progress then blk_poll() won't access
any hardware queue that it shouldn't access.

Bart.
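
For illustration only, the get/put pair discussed above can be modelled in
userspace roughly as follows. This is a sketch under stated assumptions, not
the patch under discussion and not kernel code: every name below (model_ref,
model_queue, model_run_hw_queues and so on) is hypothetical, a single atomic
counter stands in for the kernel's per-CPU counters, and the "tryget" style
of acquire (refuse once teardown has started) is an assumption about how the
pair might be used.

```c
/* Userspace model only. The kernel's percpu_ref uses per-CPU counters and
 * RCU; none of that is reproduced here. All names are hypothetical. */
#include <stdatomic.h>
#include <stdbool.h>

struct model_ref {
	atomic_long count;	/* current holders, plus 1 for the initial ref */
	atomic_bool dying;	/* set once teardown has started */
};

struct model_queue {
	struct model_ref usage;	/* stands in for a queue usage counter */
	int nr_hw_queues;
	int runs;		/* how many hardware-queue runs were performed */
};

static void model_queue_init(struct model_queue *q, int nr_hw_queues)
{
	atomic_init(&q->usage.count, 1);
	atomic_init(&q->usage.dying, false);
	q->nr_hw_queues = nr_hw_queues;
	q->runs = 0;
}

/* Take a reference unless teardown has started. The counter is bumped
 * before the flag is checked so that a concurrent teardown either sees
 * this holder or this holder sees the teardown. */
static bool model_ref_tryget_live(struct model_ref *ref)
{
	atomic_fetch_add(&ref->count, 1);
	if (atomic_load(&ref->dying)) {
		atomic_fetch_sub(&ref->count, 1);
		return false;
	}
	return true;
}

static void model_ref_put(struct model_ref *ref)
{
	atomic_fetch_sub(&ref->count, 1);
}

/* Begin teardown: refuse new holders and drop the initial reference.
 * Must be called at most once per reference in this model. */
static void model_ref_kill(struct model_ref *ref)
{
	atomic_store(&ref->dying, true);
	atomic_fetch_sub(&ref->count, 1);
}

/* Teardown may free per-queue resources only once this returns true. */
static bool model_ref_is_zero(struct model_ref *ref)
{
	return atomic_load(&ref->count) == 0;
}

/* Model of running all hardware queues under the get/put pair.
 * Returns false, touching nothing, if teardown has already started. */
static bool model_run_hw_queues(struct model_queue *q)
{
	if (!model_ref_tryget_live(&q->usage))
		return false;
	for (int i = 0; i < q->nr_hw_queues; i++)
		q->runs++;	/* placeholder for per-hctx work */
	model_ref_put(&q->usage);
	return true;
}
```

In this model the cost added to each model_run_hw_queues() call is one
increment, one flag load and one decrement, and a call made after
model_ref_kill() returns without touching any per-queue state.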