I only bought one VM, so I cannot access the hypervisor. I will try to build the
environment on my desktop. What I am sure about: the hypervisor is KVM and the
disk driver is virtio-blk.

[root@blk-mq ~]# dmesg | grep KVM
[    0.000000] Hypervisor detected: KVM
[    0.186330] Booting paravirtualized kernel on KVM
[    0.279106] KVM setup async PF for cpu 0
[    0.420819] KVM setup async PF for cpu 1
[    0.421682] KVM setup async PF for cpu 2
[    0.422113] KVM setup async PF for cpu 3
[    0.422434] KVM setup async PF for cpu 4
[    0.422434] KVM setup async PF for cpu 5
[    0.423312] KVM setup async PF for cpu 6
[    0.423394] KVM setup async PF for cpu 7
[    0.424125] KVM setup async PF for cpu 8
[    0.424414] KVM setup async PF for cpu 9
[    0.424415] KVM setup async PF for cpu 10
[    0.425329] KVM setup async PF for cpu 11
[    0.425420] KVM setup async PF for cpu 12
[    0.426156] KVM setup async PF for cpu 13
[    0.426431] KVM setup async PF for cpu 14
[    0.426431] KVM setup async PF for cpu 15

[root@blk-mq ~]# lspci | grep block
00:05.0 SCSI storage controller: Red Hat, Inc. Virtio block device
00:06.0 SCSI storage controller: Red Hat, Inc. Virtio block device

[root@blk-mq ~]# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                16
On-line CPU(s) list:   0-15
Thread(s) per core:    2
Core(s) per socket:    8

[root@blk-mq ~]# ls /sys/block/vdb/mq/
0  1  2  3

[root@blk-mq ~]# cat /proc/cpuinfo | egrep 'processor|core id'
processor : 0
core id   : 0
processor : 1
core id   : 0
processor : 2
core id   : 1
processor : 3
core id   : 1
processor : 4
core id   : 2
processor : 5
core id   : 2
processor : 6
core id   : 3
processor : 7
core id   : 3
processor : 8
core id   : 4
processor : 9
core id   : 4
processor : 10
core id   : 5
processor : 11
core id   : 5
processor : 12
core id   : 6
processor : 13
core id   : 6
processor : 14
core id   : 7
processor : 15
core id   : 7

--before this patch--
[root@blk-mq ~]# cat /sys/block/vdb/mq/*/cpu_list
0, 4, 5, 8, 9, 12, 13
1
2, 6, 7, 10, 11, 14, 15
3

--after this patch--
[root@blk-mq ~]# cat /sys/block/vdb/mq/*/cpu_list
0, 4, 5, 12, 13
1, 6, 7, 14, 15
2, 8, 9
3, 10, 11

I also added a dump_stack() inside blk_mq_map_queues():

[    1.378680] Call Trace:
[    1.378690]  dump_stack+0x5a/0x73
[    1.378695]  blk_mq_map_queues+0x29/0xb0
[    1.378700]  blk_mq_alloc_tag_set+0x1bd/0x2d0
[    1.378705]  virtblk_probe+0x1ae/0x8e4 [virtio_blk]
[    1.378709]  virtio_dev_probe+0x18a/0x230 [virtio]
[    1.378713]  really_probe+0x215/0x3f0
[    1.378716]  driver_probe_device+0x115/0x130
[    1.378718]  device_driver_attach+0x50/0x60
[    1.378720]  __driver_attach+0xbd/0x140
[    1.378722]  ? device_driver_attach+0x60/0x60
[    1.378724]  bus_for_each_dev+0x67/0xc0
[    1.378727]  ? klist_add_tail+0x57/0x70

At 2019-03-22 19:58:08, "Dongli Zhang" <dongli.zhang@xxxxxxxxxx> wrote:
>
>
>On 03/22/2019 06:09 PM, luferry wrote:
>> under virtual machine environment, cpu topology may differ from normal
>> physical server.
>
>Would you mind sharing the name of the virtual machine monitor, the command
>line if available, and which device to use to reproduce?
>
>For instance, I am not able to reproduce with qemu nvme or virtio-blk as I
>assume they use pci or virtio specific mappers to establish the mapping.
>
>E.g., with qemu and nvme: -smp 8,sockets=1,cores=4,threads=2
>
>Indeed I use three queues instead of two as one is reserved for admin.
>
># ls /sys/block/nvme0n1/mq/*
>/sys/block/nvme0n1/mq/0:
>cpu0  cpu1  cpu2  cpu3  cpu_list  nr_reserved_tags  nr_tags
>
>/sys/block/nvme0n1/mq/1:
>cpu4  cpu5  cpu6  cpu7  cpu_list  nr_reserved_tags  nr_tags
>
>
>Thank you very much!
>
>Dongli Zhang
>
>> for example (machine with 4 cores, 2 threads per core):
>>
>> normal physical server:
>> core-id    thread-0-id    thread-1-id
>> 0          0              4
>> 1          1              5
>> 2          2              6
>> 3          3              7
>>
>> virtual machine:
>> core-id    thread-0-id    thread-1-id
>> 0          0              1
>> 1          2              3
>> 2          4              5
>> 3          6              7
>>
>> When a disk with two queues is attached, all the even-numbered cpus are
>> mapped to queue 0. Under a virtual machine, every cpu is immediately
>> followed by its sibling cpu. Before this patch, all the odd-numbered cpus
>> were also mapped to queue 0, which causes a serious imbalance and hurts
>> system IO performance.
>>
>> So allocate the cpu map by core id instead; this can provide more
>> concurrency.
>>
>> Signed-off-by: luferry <luferry@xxxxxxx>
>> ---
>>  block/blk-mq-cpumap.c | 9 +++++----
>>  1 file changed, 5 insertions(+), 4 deletions(-)
>>
>> diff --git a/block/blk-mq-cpumap.c b/block/blk-mq-cpumap.c
>> index 03a534820271..4125e8e77679 100644
>> --- a/block/blk-mq-cpumap.c
>> +++ b/block/blk-mq-cpumap.c
>> @@ -35,7 +35,7 @@ int blk_mq_map_queues(struct blk_mq_queue_map *qmap)
>>  {
>>  	unsigned int *map = qmap->mq_map;
>>  	unsigned int nr_queues = qmap->nr_queues;
>> -	unsigned int cpu, first_sibling;
>> +	unsigned int cpu, first_sibling, core = 0;
>>
>>  	for_each_possible_cpu(cpu) {
>>  		/*
>> @@ -48,9 +48,10 @@ int blk_mq_map_queues(struct blk_mq_queue_map *qmap)
>>  			map[cpu] = cpu_to_queue_index(qmap, nr_queues, cpu);
>>  		} else {
>>  			first_sibling = get_first_sibling(cpu);
>> -			if (first_sibling == cpu)
>> -				map[cpu] = cpu_to_queue_index(qmap, nr_queues, cpu);
>> -			else
>> +			if (first_sibling == cpu) {
>> +				map[cpu] = cpu_to_queue_index(qmap, nr_queues, core);
>> +				core++;
>> +			} else
>>  				map[cpu] = map[first_sibling];
>>  		}
>>  	}
>>
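
For completeness, here is a minimal userspace sketch (my own simplification, not
kernel code) that reproduces the two spreads above. It hard-codes the 16-cpu /
4-queue layout from this VM, assumes qmap->queue_offset is 0, and models
get_first_sibling() as "cpu 2k and 2k+1 are siblings", which is exactly the
guest numbering shown by /proc/cpuinfo above:

#include <stdio.h>

#define NR_CPUS   16
#define NR_QUEUES  4

/* guest layout above: cpu 2k and 2k+1 are hyperthread siblings */
static int first_sibling(int cpu)
{
	return cpu & ~1;
}

/*
 * by_core = 0: mapping before the patch (first sibling indexed by cpu id)
 * by_core = 1: mapping after the patch  (first sibling indexed by a core counter)
 */
static void map_queues(int *map, int by_core)
{
	int cpu, core = 0;

	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		if (cpu < NR_QUEUES)
			map[cpu] = cpu % NR_QUEUES;
		else if (first_sibling(cpu) == cpu)
			map[cpu] = (by_core ? core++ : cpu) % NR_QUEUES;
		else
			map[cpu] = map[first_sibling(cpu)];
	}
}

int main(void)
{
	int map[NR_CPUS], by_core, q, cpu;

	for (by_core = 0; by_core <= 1; by_core++) {
		printf("--%s this patch--\n", by_core ? "after" : "before");
		map_queues(map, by_core);
		for (q = 0; q < NR_QUEUES; q++) {
			printf("queue %d:", q);
			for (cpu = 0; cpu < NR_CPUS; cpu++)
				if (map[cpu] == q)
					printf(" %d", cpu);
			printf("\n");
		}
	}
	return 0;
}

Built with a plain "gcc -o mqmap mqmap.c", this prints the same before/after
queue distributions as the cpu_list files above, which is consistent with the
dump_stack() trace showing virtio-blk going through the generic
blk_mq_map_queues() mapper on this setup.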