On Fri, Sep 12, 2014 at 11:21:37AM +0800, Zhang Haoyu wrote:
> >>> > > If virtio-blk and virtio-serial share an IRQ, the guest operating system has to check each virtqueue for activity. Maybe there is some inefficiency doing that.
> >>> > > AFAIK virtio-serial registers 64 virtqueues (on 31 ports + console) even if everything is unused.
> >>> >
> >>> > That could be the case if MSI is disabled.
> >>>
> >>> Do the windows virtio drivers enable MSIs, in their inf file?
> >>
> >>It depends on the version of the drivers, but it is a reasonable guess
> >>at what differs between Linux and Windows. Haoyu, can you give us the
> >>output of lspci from a Linux guest?
> >>
> >I made a test with fio on rhel-6.5 guest, the same degradation happened too, this degradation can be reproduced on rhel6.5 guest 100%.
> >virtio_console module installed:
> >64K-write-sequence: 285 MBPS, 4380 IOPS
> >virtio_console module uninstalled:
> >64K-write-sequence: 370 MBPS, 5670 IOPS
> >
> I use top -d 1 -H -p <qemu-pid> to monitor the cpu usage, and found that,
> virtio_console module installed:
> qemu main thread cpu usage: 98%
> virtio_console module uninstalled:
> qemu main thread cpu usage: 60%
>
> perf top -p <qemu-pid> result,
> virtio_console module installed:
>    PerfTop:    9868 irqs/sec  kernel:76.4%  exact:  0.0% [4000Hz cycles],  (target_pid: 88381)
> ------------------------------------------------------------------------------
>
>     11.80%  [kernel]                 [k] _raw_spin_lock_irqsave
>      8.42%  [kernel]                 [k] _raw_spin_unlock_irqrestore
>      7.33%  [kernel]                 [k] fget_light
>      6.28%  [kernel]                 [k] fput
>      3.61%  [kernel]                 [k] do_sys_poll
>      3.30%  qemu-system-x86_64       [.] qcow2_check_metadata_overlap
>      3.10%  [kernel]                 [k] __pollwait
>      2.15%  qemu-system-x86_64       [.] qemu_iohandler_poll
>      1.44%  libglib-2.0.so.0.3200.4  [.] g_array_append_vals
>      1.36%  libc-2.13.so             [.] 0x000000000011fc2a
>      1.31%  libpthread-2.13.so       [.] pthread_mutex_lock
>      1.24%  libglib-2.0.so.0.3200.4  [.] 0x000000000001f961
>      1.20%  libpthread-2.13.so       [.] __pthread_mutex_unlock_usercnt
>      0.99%  [kernel]                 [k] eventfd_poll
>      0.98%  [vdso]                   [.] 0x0000000000000771
>      0.97%  [kernel]                 [k] remove_wait_queue
>      0.96%  qemu-system-x86_64       [.] qemu_iohandler_fill
>      0.95%  [kernel]                 [k] add_wait_queue
>      0.69%  [kernel]                 [k] __srcu_read_lock
>      0.58%  [kernel]                 [k] poll_freewait
>      0.57%  [kernel]                 [k] _raw_spin_lock_irq
>      0.54%  [kernel]                 [k] __srcu_read_unlock
>      0.47%  [kernel]                 [k] copy_user_enhanced_fast_string
>      0.46%  [kvm_intel]              [k] vmx_vcpu_run
>      0.46%  [kvm]                    [k] vcpu_enter_guest
>      0.42%  [kernel]                 [k] tcp_poll
>      0.41%  [kernel]                 [k] system_call_after_swapgs
>      0.40%  libglib-2.0.so.0.3200.4  [.] g_slice_alloc
>      0.40%  [kernel]                 [k] system_call
>      0.38%  libpthread-2.13.so       [.] 0x000000000000e18d
>      0.38%  libglib-2.0.so.0.3200.4  [.] g_slice_free1
>      0.38%  qemu-system-x86_64       [.] address_space_translate_internal
>      0.38%  [kernel]                 [k] _raw_spin_lock
>      0.37%  qemu-system-x86_64       [.] phys_page_find
>      0.36%  [kernel]                 [k] get_page_from_freelist
>      0.35%  [kernel]                 [k] sock_poll
>      0.34%  [kernel]                 [k] fsnotify
>      0.31%  libglib-2.0.so.0.3200.4  [.] g_main_context_check
>      0.30%  [kernel]                 [k] do_direct_IO
>      0.29%  libpthread-2.13.so       [.] pthread_getspecific
>
> virtio_console module uninstalled:
>    PerfTop:    9138 irqs/sec  kernel:71.7%  exact:  0.0% [4000Hz cycles],  (target_pid: 88381)
> ------------------------------------------------------------------------------
>
>      5.72%  qemu-system-x86_64       [.] qcow2_check_metadata_overlap
>      4.51%  [kernel]                 [k] fget_light
>      3.98%  [kernel]                 [k] _raw_spin_lock_irqsave
>      2.55%  [kernel]                 [k] fput
>      2.48%  libpthread-2.13.so       [.] pthread_mutex_lock
>      2.46%  [kernel]                 [k] _raw_spin_unlock_irqrestore
>      2.21%  libpthread-2.13.so       [.] __pthread_mutex_unlock_usercnt
>      1.71%  [vdso]                   [.] 0x000000000000060c
>      1.68%  libc-2.13.so             [.] 0x00000000000e751f
>      1.64%  libglib-2.0.so.0.3200.4  [.] 0x000000000004fca0
>      1.20%  [kernel]                 [k] __srcu_read_lock
>      1.14%  [kernel]                 [k] do_sys_poll
>      0.96%  [kernel]                 [k] _raw_spin_lock_irq
>      0.95%  [kernel]                 [k] __pollwait
>      0.91%  [kernel]                 [k] __srcu_read_unlock
>      0.78%  [kernel]                 [k] tcp_poll
>      0.74%  [kvm]                    [k] vcpu_enter_guest
>      0.73%  [kvm_intel]              [k] vmx_vcpu_run
>      0.72%  [kernel]                 [k] _raw_spin_lock
>      0.72%  [kernel]                 [k] system_call_after_swapgs
>      0.70%  [kernel]                 [k] copy_user_enhanced_fast_string
>      0.67%  libglib-2.0.so.0.3200.4  [.] g_slice_free1
>      0.66%  libpthread-2.13.so       [.] 0x000000000000e12d
>      0.65%  [kernel]                 [k] system_call
>      0.61%  [kernel]                 [k] do_direct_IO
>      0.57%  qemu-system-x86_64       [.] qemu_iohandler_poll
>      0.57%  [kernel]                 [k] fsnotify
>      0.54%  libglib-2.0.so.0.3200.4  [.] g_slice_alloc
>      0.50%  [kernel]                 [k] vfs_write
>      0.49%  libpthread-2.13.so       [.] pthread_getspecific
>      0.48%  qemu-system-x86_64       [.] qemu_event_reset
>      0.47%  libglib-2.0.so.0.3200.4  [.] g_main_context_check
>      0.46%  qemu-system-x86_64       [.] address_space_translate_internal
>      0.46%  [kernel]                 [k] sock_poll
>      0.46%  libpthread-2.13.so       [.] __pthread_disable_asynccancel
>      0.44%  [kernel]                 [k] resched_task
>      0.43%  libpthread-2.13.so       [.] __pthread_enable_asynccancel
>      0.42%  qemu-system-x86_64       [.] phys_page_find
>      0.39%  qemu-system-x86_64       [.] object_dynamic_cast_assert

Max: Unrelated to this performance issue but I notice that the qcow2
metadata overlap check is high in the host CPU profile. Have you had
any thoughts about optimizing the check?

Stefan