On Sun, Dec 16, 2018 at 12:12:17PM -0700, Jens Axboe wrote:
> On 12/16/18 11:39 AM, Mike Snitzer wrote:
> > On Sun, Dec 16 2018 at 11:16am -0500,
> > Christoph Hellwig <hch@xxxxxx> wrote:
> >
> >> On Sun, Dec 16, 2018 at 10:25:16AM +0800, Ming Lei wrote:
> >>> This patch sets map->nr_queues as zero explicitly if there are zero
> >>> queues for such queue type, then blk_mq_map_swqueue() can become
> >>> more robust to deal with shared mappings.
> >>
> >> This looks a lot more clumsy than what we had before, can you explain
> >> what additional robustness it buys us?
> >
> > It enables nvme IO to complete on my testbed with for-4.21/block
> > changes, this NUMA layout is what triggered Ming's work:
> >
> > # numactl --hardware
> > available: 2 nodes (0-1)
> > node 0 cpus: 0 2 4 6 8 10 12 14
> > node 0 size: 128605 MB
> > node 0 free: 128092 MB
> > node 1 cpus: 1 3 5 7 9 11 13 15
> > node 1 size: 128997 MB
> > node 1 free: 128529 MB
> > node distances:
> > node   0   1
> >   0:  10  21
> >   1:  21  10
> >
> > Without the aggregate changes from this patchset (1-3 anyway) I get IO
> > hangs in blkdev_fsync().
>
> Still puzzled. I'm not against the change, but the commit message has
> NOTHING in the way of justification. "Make X more robust" doesn't
> mean anything. Your followup brings no extra info to the table in
> terms of what the bug is here. What exact bug is it fixing? Why is
> fsync currently hanging?

As I explained in the previous mail, the poll queue gets a new mapping
via blk_mq_map_queues(), then blk_mq_map_swqueue() may over-write
hctx->type with this new mapping, and finally the queue mapping is
totally broken.

Thanks,
Ming