Re: hdd kills vm


 



On Tue, Oct 24, 2023 at 04:28:58PM +0200, Martin Kletzander wrote:
On Mon, Oct 23, 2023 at 04:59:08PM +0200, daggs wrote:
Greetings Martin,

Sent: Sunday, October 22, 2023 at 12:37 PM
From: "Martin Kletzander" <mkletzan@xxxxxxxxxx>
To: "daggs" <daggs@xxxxxxx>
Cc: libvir-list@xxxxxxxxxx
Subject: Re: hdd kills vm

On Fri, Oct 20, 2023 at 02:42:38PM +0200, daggs wrote:
>Greetings,
>
>I have a Windows 11 VM running on my Gentoo host using libvirt (9.8.0) + QEMU (8.1.2). I'm passing almost all available resources to the VM
>(all 16 CPUs, 31 out of 32 GB of RAM, and the NVIDIA GPU is passed through), but performance is poor: the system lags and takes a long time to boot.

There are a couple of things that stand out to me in your setup. I'll
assume the host has one NUMA node with 8 cores, each with 2 threads,
just like you set it up in the guest XML.
That's correct, see:
$ lscpu | grep -i numa
NUMA node(s):                       1
NUMA node0 CPU(s):                  0-15

however:
$ dmesg | grep -i numa
[    0.003783] No NUMA configuration found

can that be the reason?


No, this is fine. A single NUMA node is technically not NUMA at all, so
that message is expected.


* When you give the guest all the CPUs the host has, there is nothing
   left to run the host tasks.  You might think that there "isn't
   anything running", but there is: at the very least your init system,
   the kernel, and the QEMU process that is emulating the guest.  This is
   definitely one of the bottlenecks.
I've tried with 12 out of 16, same behavior.


* The pinning of vCPUs to CPUs is half-suspicious.  If you are trying to
   make vCPU 0 and 1 be threads on the same core and on the host the
   threads are represented as CPUs 0 and 8, then that's fine.  If that is
   just copy-pasted from somewhere, then it might not reflect the current
   situation and can be a source of many scheduling issues (even once the
   above is dealt with).
I found a site that generates the pinning for you. If it is wrong, can you point me to a place where I can read about it?


Just check what the topology is on the host and try to match it with the
guest one.  If in doubt, then try it without the pinning.
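One way to check the host topology is to read the sibling-thread lists that Linux exposes in sysfs. The following is a minimal sketch, not a definitive tool: it assumes the standard `/sys/devices/system/cpu/cpuN/topology/thread_siblings_list` files are present, and the helper names (`parse_cpu_list`, `host_cores`, `pin_pairs`, `vcpupin_xml`) are made up for illustration. It numbers vCPUs so that consecutive vCPUs land on the threads of the same host core, which matches a guest topology with 2 threads per core.

```python
from pathlib import Path


def parse_cpu_list(text):
    """Expand a sysfs CPU list such as '0,8' or '0-3,8-11' into sorted ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def host_cores(sysfs_root="/sys/devices/system/cpu"):
    """Return one list of sibling host CPUs per physical core."""
    cores = set()
    pattern = "cpu[0-9]*/topology/thread_siblings_list"
    for path in Path(sysfs_root).glob(pattern):
        cores.add(tuple(parse_cpu_list(path.read_text())))
    return [list(core) for core in sorted(cores)]


def pin_pairs(cores):
    """Assign consecutive vCPU numbers to the threads of each host core."""
    flat = [cpu for core in cores for cpu in core]
    return list(enumerate(flat))


def vcpupin_xml(pairs):
    """Render (vcpu, host_cpu) pairs as libvirt <vcpupin> elements."""
    return [f"<vcpupin vcpu='{v}' cpuset='{c}'/>" for v, c in pairs]
```

Calling `print("\n".join(vcpupin_xml(pin_pairs(host_cores()))))` on the host prints candidate lines for the `<cputune>` section. To leave some CPUs for the host, drop a core or two from the list before pinning.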


* I also seem to recall that Windows had some issues with systems that
   have too many cores.  I'm not sure whether that was an issue with an
   edition difference or just with some older versions, or if it just did
   not show up in the task manager, but there was something that was
   fixed by using either more sockets or cores in the topology.  This is
   probably not the issue for you though.

>After trying a few ways to fix it, I've concluded that the issue might be related to the way the HDD is defined at the VM level.
>Here is the XML: https://bpa.st/MYTA
>I assume that the HDD sitting on the SATA controller is causing the issue, but I'm not sure what the proper way to fix it is. Any ideas?
>

It looks like your disk is on SATA, but I don't see why that would be an
issue. Passing the block device to QEMU as VirtIO shouldn't cause that
much of a difference.  Try measuring the speed of the disk on the host
and then in the VM, maybe.  Is that an SSD or NVMe?  I presume that's
not spinning rust, is it?
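A rough way to compare the two is to time a sequential read of the same amount of data on the host and inside the guest. The following is a minimal sketch under stated assumptions: the function name `timed_read` is hypothetical, the path is whatever block device or large file you choose, reading a raw device usually needs root, and the page cache can inflate the result if the data was read recently. It is not a substitute for a proper benchmark tool.

```python
import sys
import time


def timed_read(path, block_size=1024 * 1024, max_bytes=1024 ** 3):
    """Sequentially read up to max_bytes from path.

    Returns (bytes_read, seconds_elapsed).
    """
    total = 0
    start = time.perf_counter()
    # buffering=0 avoids Python-level buffering; the OS page cache still applies.
    with open(path, "rb", buffering=0) as f:
        while total < max_bytes:
            chunk = f.read(min(block_size, max_bytes - total))
            if not chunk:
                break  # end of file or device
            total += len(chunk)
    return total, time.perf_counter() - start


if __name__ == "__main__" and len(sys.argv) > 1:
    nbytes, seconds = timed_read(sys.argv[1])
    if seconds > 0:
        print(f"{nbytes / (1024 * 1024) / seconds:.1f} MiB/s")
```

Run it with the device or file path as the only argument on both sides. A large gap between host and guest numbers would point at the storage path, while similar numbers would suggest looking elsewhere.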
As seen, I have 3 drives: 2 CD-ROMs on SATA and one HDD passed through as virtio. I read somewhere that if the controller of the virtio
device is SATA, then it doesn't use virtio optimally.

Well it _might_ be slightly more beneficial to use virtio-scsi or even
<disk type='block' device='lun'>, but I can't imagine that would make
the system lag.  I'm not that familiar with the details.

It is a spindle; NVMe drives are too expensive where I live. Frankly, I don't need a lightning-fast boot. The other bare-metal machines running Windows on a spindle
run it quite fast, and they aren't half as fast as this server.


That might actually be related.  The guest might think it is a different
type of disk and use completely suboptimal scheduling.  This might
actually be solved by passing it as <disk device='lun'..., but at this
point I'm just guessing.


Also you probably want to use something like:

<target dev='sda' bus='scsi' rotation_rate='X'/>

and I have no idea whether matching the rotation_rate to the actual one
is beneficial, maybe skip that?
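Before picking a value, it can help to confirm how the host kernel classifies the disk. The following is a minimal sketch: it reads the `queue/rotational` flag from sysfs, which holds 1 for rotational media and 0 otherwise. The function name `is_rotational` and the device name `sda` in the usage note are assumptions for illustration. As far as the libvirt domain XML documentation goes, `rotation_rate` takes an RPM value for spinning disks and 1 for non-rotational storage, so check the documentation for your libvirt version before relying on that.

```python
from pathlib import Path


def is_rotational(device, sysfs_root="/sys/block"):
    """Return True if the kernel reports the block device as rotational."""
    flag_file = Path(sysfs_root, device, "queue", "rotational")
    return flag_file.read_text().strip() == "1"
```

For example, `is_rotational("sda")` on the host should return `True` for a spindle. The flag only says whether the disk spins, not how fast, so the actual RPM still has to come from the drive's data sheet.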


>Thanks,
>
>Dagg.
>





