On 03/22/2019 06:09 PM, luferry wrote:
> under virtual machine environment, cpu topology may differ from normal
> physical server.

Would you mind sharing the name of the virtual machine monitor, the command
line if available, and which device to use to reproduce?

For instance, I am not able to reproduce with qemu nvme or virtio-blk, as I
assume they use a pci- or virtio-specific mapper to establish the mapping.

E.g., with qemu and nvme:

    -smp 8,sockets=1,cores=4,threads=2

Indeed, I use three queues instead of two, as one is reserved for admin.

# ls /sys/block/nvme0n1/mq/*
/sys/block/nvme0n1/mq/0:
cpu0  cpu1  cpu2  cpu3  cpu_list  nr_reserved_tags  nr_tags

/sys/block/nvme0n1/mq/1:
cpu4  cpu5  cpu6  cpu7  cpu_list  nr_reserved_tags  nr_tags

Thank you very much!

Dongli Zhang

> for example (machine with 4 cores, 2 threads per core):
>
> normal physical server:
> core-id   thread-0-id   thread-1-id
> 0         0             4
> 1         1             5
> 2         2             6
> 3         3             7
>
> virtual machine:
> core-id   thread-0-id   thread-1-id
> 0         0             1
> 1         2             3
> 2         4             5
> 3         6             7
>
> When attach disk with two queues, all the even numbered cpus will be
> mapped to queue 0. Under virtual machine, all the cpus is followed by
> its sibling cpu. Before this patch, all the odd numbered cpus will also
> be mapped to queue 0, can cause serious imbalance. this will lead to
> performance impact on system IO
>
> So suggest to allocate cpu map by core id, this can be more currency
>
> Signed-off-by: luferry <luferry@xxxxxxx>
> ---
>  block/blk-mq-cpumap.c | 9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/block/blk-mq-cpumap.c b/block/blk-mq-cpumap.c
> index 03a534820271..4125e8e77679 100644
> --- a/block/blk-mq-cpumap.c
> +++ b/block/blk-mq-cpumap.c
> @@ -35,7 +35,7 @@ int blk_mq_map_queues(struct blk_mq_queue_map *qmap)
>  {
>  	unsigned int *map = qmap->mq_map;
>  	unsigned int nr_queues = qmap->nr_queues;
> -	unsigned int cpu, first_sibling;
> +	unsigned int cpu, first_sibling, core = 0;
>
>  	for_each_possible_cpu(cpu) {
>  		/*
> @@ -48,9 +48,10 @@ int blk_mq_map_queues(struct blk_mq_queue_map *qmap)
>  			map[cpu] = cpu_to_queue_index(qmap, nr_queues, cpu);
>  		} else {
>  			first_sibling = get_first_sibling(cpu);
> -			if (first_sibling == cpu)
> -				map[cpu] = cpu_to_queue_index(qmap, nr_queues, cpu);
> -			else
> +			if (first_sibling == cpu) {
> +				map[cpu] = cpu_to_queue_index(qmap, nr_queues, core);
> +				core++;
> +			} else
>  				map[cpu] = map[first_sibling];
>  		}
>  	}
>