On 6/29/21 9:49 AM, Ming Lei wrote:
Hi,
blk_mq_alloc_request_hctx() is used by NVMe fc/rdma/tcp/loop to connect
io queues, and the sw ctx is chosen as the 1st online CPU in hctx->cpumask.
However, all CPUs in hctx->cpumask may be offline.

This usage model isn't well supported by blk-mq, which assumes that
allocation is always done on an online CPU in hctx->cpumask. This
assumption is tied to managed irqs, which also require blk-mq to drain
inflight requests in this hctx when the last CPU in hctx->cpumask goes
offline.

However, NVMe fc/rdma/tcp/loop don't use managed irqs, so we should allow
them to allocate requests even when the specified hctx is inactive
(all CPUs in hctx->cpumask are offline).

Fix blk_mq_alloc_request_hctx() by adding and passing a new flag,
BLK_MQ_F_NOT_USE_MANAGED_IRQ.
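For illustration, a minimal sketch of how the allocation-CPU choice could
honour such a flag (not the actual patch; the helper name and the fallback
policy are assumptions based on the description above):

#include <linux/blk-mq.h>
#include <linux/cpumask.h>

/* BLK_MQ_F_NOT_USE_MANAGED_IRQ is the flag proposed by this series. */
static int blk_mq_hctx_alloc_cpu(struct blk_mq_hw_ctx *hctx)
{
        int cpu = cpumask_first_and(hctx->cpumask, cpu_online_mask);

        if (cpu < nr_cpu_ids)
                return cpu;     /* normal case: an online CPU exists */

        /*
         * No online CPU in hctx->cpumask: without a managed irq there is
         * no drain requirement, so any CPU in the mask can host the sw ctx.
         */
        if (hctx->flags & BLK_MQ_F_NOT_USE_MANAGED_IRQ)
                return cpumask_first(hctx->cpumask);

        return -EINVAL;         /* managed irq case: allocation must fail */
}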
Ming Lei (2):
blk-mq: not deactivate hctx if the device doesn't use managed irq
nvme: pass BLK_MQ_F_NOT_USE_MANAGED_IRQ for fc/rdma/tcp/loop
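As an illustration of what patch 2 amounts to on the driver side (a sketch
only; queue depth, queue count and the helper name are placeholders, not
the actual nvme hunks):

#include <linux/blk-mq.h>
#include <linux/numa.h>
#include <linux/string.h>

static int example_setup_io_tagset(struct blk_mq_tag_set *set,
                                   const struct blk_mq_ops *ops)
{
        memset(set, 0, sizeof(*set));
        set->ops          = ops;
        set->nr_hw_queues = 4;          /* placeholder */
        set->queue_depth  = 128;        /* placeholder */
        set->numa_node    = NUMA_NO_NODE;
        /* Non-managed-irq transports (fc/rdma/tcp/loop) OR in the new flag: */
        set->flags        = BLK_MQ_F_SHOULD_MERGE |
                            BLK_MQ_F_NOT_USE_MANAGED_IRQ;
        return blk_mq_alloc_tag_set(set);
}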
 block/blk-mq.c             | 6 +++++-
 drivers/nvme/host/fc.c     | 3 ++-
 drivers/nvme/host/rdma.c   | 3 ++-
 drivers/nvme/host/tcp.c    | 3 ++-
 drivers/nvme/target/loop.c | 3 ++-
 include/linux/blk-mq.h     | 1 +
 6 files changed, 14 insertions(+), 5 deletions(-)
Cc: Sagi Grimberg <sagi@xxxxxxxxxxx>
Cc: Daniel Wagner <dwagner@suse.de>
Cc: Wen Xiong <wenxiong@xxxxxxxxxx>
Cc: John Garry <john.garry@xxxxxxxxxx>
I have my misgivings about this patchset.

To my understanding, only CPUs present in the hctx cpumask are eligible
to submit I/O to that hctx.
Consequently, if all CPUs in that mask are offline, what is the point of
even transmitting a 'connect' request?
Shouldn't we rather modify the tagset to refer to the currently online
CPUs _only_, and thereby never submit a connect request for an hctx whose
CPUs are all offline?
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@xxxxxxx +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer