On 8/20/21 12:07 PM, daggs wrote:
Greetings Laine,
Sent: Monday, August 16, 2021 at 12:57 AM
From: "Laine Stump" <laine@xxxxxxxxxx>
To: "daggs" <daggs@xxxxxxx>
Cc: "Martin Kletzander" <mkletzan@xxxxxxxxxx>, libvirt-users@xxxxxxxxxx
Subject: Re: issues with vm after upgrade
On 8/14/21 6:05 AM, daggs wrote:
Greetings Martin,
Sent: Thursday, August 12, 2021 at 2:07 PM
From: "daggs" <daggs@xxxxxxx>
To: "Martin Kletzander" <mkletzan@xxxxxxxxxx>
Cc: dan@xxxxxxxxxxxx, libvirt-users@xxxxxxxxxx
Subject: Re: issues with vm after upgrade
Sent: Thursday, August 12, 2021 at 11:49 AM
From: "Martin Kletzander" <mkletzan@xxxxxxxxxx>
To: "daggs" <daggs@xxxxxxx>
Cc: dan@xxxxxxxxxxxx, libvirt-users@xxxxxxxxxx
Subject: Re: issues with vm after upgrade
On Wed, Aug 11, 2021 at 08:53:10PM +0200, daggs wrote:
Greetings Martin,
Sent: Wednesday, August 11, 2021 at 6:08 PM
From: "daggs" <daggs@xxxxxxx>
To: "Martin Kletzander" <mkletzan@xxxxxxxxxx>
Cc: dan@xxxxxxxxxxxx, libvirt-users@xxxxxxxxxx
Subject: Re: issues with vm after upgrade
Greetings Martin,
Sent: Wednesday, August 11, 2021 at 4:13 PM
From: "Martin Kletzander" <mkletzan@xxxxxxxxxx>
To: "daggs" <daggs@xxxxxxx>
Cc: dan@xxxxxxxxxxxx, libvirt-users@xxxxxxxxxx
Subject: Re: issues with vm after upgrade
On Wed, Aug 11, 2021 at 03:09:34PM +0200, daggs wrote:
Greetings Martin,
Sent: Wednesday, August 11, 2021 at 10:14 AM
From: "Martin Kletzander" <mkletzan@xxxxxxxxxx>
To: "daggs" <daggs@xxxxxxx>
Cc: dan@xxxxxxxxxxxx, libvirt-users@xxxxxxxxxx
Subject: Re: issues with vm after upgrade
[...]
2) To your issue with starting the domain it would be good to know what
is the error you get from virsh (or however you are starting the
domain) and the debug logs of libvirtd, ideally just for the part of
the domain starting.
that is the issue: there wasn't any error. the vm just didn't boot.
Oh, so I misunderstood. What was the state of the VM in libvirt?
"paused" or "running"? Was the serial console working?
it was marked as running and there was no serial
That's a pity we could not examine what was actually happening.
I can diff the original xml with the new one to see the diffs and post them here if you wish
Would be nice to see if there are any differences. The newly created
one works then?
I'll send it later today
here: https://dpaste.com/5VBUU8Z9W
Unfortunately there are many differences there. The machine type
changes _something_ in qemu, there is different PCI(e) topology, and I
do not think I will be able to figure this out without the non-working
machine.
So if your current setup works for you right now I'd leave figuring out
the previous issue to others, if there is anyone wanting to figure out
if there is some libvirt issue.
Have a nice day
my current setup works besides the hdmi audio, which I still need to investigate.
thanks for your help.
Dagg
just to update, I've solved the sound issue, frankly, I don't understand how the guest showed a soundcard in the first place.
from what I gather, libvirt sets the -nodefaults flag to prepare the vm's properties from scratch.
in this situation, the sound card is a function in the host machine's pci tree.
when libvirt created the pci tree for the guest, it placed the card as a function of a device as well, in my case 02:00.2
however it didn't create a device at 02:00.0.
Are you basing this claim on the libvirt XML? Or on what you see with
lspci in the guest?
When libvirt is assigning PCI addresses to devices in a guest, it will
never auto-assign a non-0 function. This will only happen if the user
explicitly requests it (and even then, iirc, libvirt should generate an
error if function 0 of the same slot has no device - something to the
effect of "no device on function 0 of a multifunction device").
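To illustrate the kind of check described above, here is a minimal Python sketch. It is not libvirt's actual validation code; the function name `slots_missing_function_zero` and the sample addresses are hypothetical, chosen only to mirror the 02:00.2 case under discussion.

```python
from collections import defaultdict


def slots_missing_function_zero(addresses):
    """Return sorted (bus, slot) pairs that hold a device on a non-0
    function while nothing is present on function 0 of that slot."""
    functions_by_slot = defaultdict(set)
    for bus, slot, function in addresses:
        functions_by_slot[(bus, slot)].add(function)
    return sorted(
        key for key, functions in functions_by_slot.items() if 0 not in functions
    )


# Hypothetical guest-side addresses as (bus, slot, function) tuples.
guest_addresses = [
    (0, 1, 0),  # function 0 of slot 1 on bus 0
    (0, 1, 1),  # function 1 of the same slot: fine, function 0 exists
    (2, 0, 2),  # 02:00.2 with nothing on 02:00.0: the problem case
]

print(slots_missing_function_zero(guest_addresses))  # prints [(2, 0)]
```

Any slot reported by such a check is one where a guest OS may never probe the non-0 function.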
Anyway, when I looked back at the XML diff you posted earlier (see
below), I didn't see any hostdev device assigned to 02:00.2. What I
*did* see was that in both the old and the new version of the diff, the
hostdev devices were assigned to function 0 of different *slots* on a
dmi-to-pci-bridge controller, which should cause no problems (unless
there is a bug in QEMU's dmi-to-pci-bridge). The important thing,
though, is that there is no hostdev device on a non-0 function, and when
it is on a non-0 slot, that's because it's on a dmi-to-pci-bridge (which
has 32 slots).
I saw it in the guest.
But I didn't see it in the XML diffs that you had posted.
I'd assume that if libvirt defines a device on a specific bdf, the guest will not change it.
That's not exactly true - the bus "number" in libvirt isn't given to
qemu as an actual number, but as an alphanumeric device id (called
"alias name" in libvirt XML). QEMU doesn't have any concept of "bus
number", because (afaiu) there is no way to convey such info to the
guest firmware/OS; instead, QEMU creates a topology of interconnected
controllers, the firmware and/or OS traverses this topology and assigns
numbers to the encountered controllers as it sees fit.
So you may have PCI controllers with indexes 1, 2, and 3 in your libvirt
config, but those will be described on the QEMU commandline as
controllers "pci.1", "pci.2", and "pci.3":
-device pcie-root-port,port=0x8,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x1 \
-device pcie-root-port,port=0x9,chassis=2,id=pci.2,bus=pcie.0,addr=0x1.0x1 \
-device pcie-root-port,port=0xa,chassis=3,id=pci.3,bus=pcie.0,addr=0x1.0x2 \
and when a PCI device is attached to one of these controllers, the QEMU
commandline uses the id name of the controller, not a bus number:
-device vfio-pci,host=0000:05:00.0,bus=pci.1,multifunction=on,addr=0x0 \
-device vfio-pci,host=0000:05:00.1,id=hostdev1,bus=pci.1,addr=0x0.0x1 \
It is a nice coincidence that the OSes I've seen happen to traverse the
PCI topology in a manner that results in the guest OS numbering the
buses the same as they are numbered in libvirt XML that has had PCI
addresses auto-assigned by libvirt, but it is trivial to make this *not*
happen. For example, if you changed the config so that the bus with
index='2' (pci.2) was attached to pcie.0, addr=0x1.0x0 (i.e. change its
PCI address to <address type='pci' bus='0' slot='1' function='0'/>),
and the bus with index='1' (pci.1) was attached to pcie.0, addr=0x1.0x1,
then the guest would number "pci.2" as bus 1, and "pci.1" as bus 2.
And of course the guest OS is free to traverse the controller topology
in any manner it wants, so the bus numbering in the guest could be
different even if the libvirt-generated QEMU commandline was the same.
*HOWEVER*, slot (device) and function number are specified on the QEMU
commandline numerically, and they will appear in the guest exactly as
they are in the libvirt XML.
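The numbering behaviour described above can be sketched in a few lines of Python. This is only a model under stated assumptions: it assumes the guest walks the topology depth-first and visits the bridges on each bus in (slot, function) order, which is what the guests mentioned above appear to do but is not guaranteed. The function name `number_buses` and the two sample topologies are hypothetical; the controller ids are the ones from the command line shown earlier.

```python
def number_buses(bridges, root="pcie.0"):
    """Assign guest bus numbers by a depth-first walk of the topology.

    bridges maps a controller id to (parent bus id, slot, function).
    Returns a dict mapping each controller id to its guest bus number.
    """
    children = {}
    for bridge_id, (parent, slot, function) in bridges.items():
        children.setdefault(parent, []).append((slot, function, bridge_id))

    numbers = {root: 0}
    next_number = 1

    def visit(bus_id):
        nonlocal next_number
        # Assumption: bridges are visited in ascending (slot, function) order.
        for _slot, _function, bridge_id in sorted(children.get(bus_id, [])):
            numbers[bridge_id] = next_number
            next_number += 1
            visit(bridge_id)

    visit(root)
    return numbers


# Root ports as placed on the command line above.
as_generated = {
    "pci.1": ("pcie.0", 1, 0),
    "pci.2": ("pcie.0", 1, 1),
    "pci.3": ("pcie.0", 1, 2),
}

# Hypothetical edit: pci.1 and pci.2 trade functions on slot 1.
swapped = {
    "pci.1": ("pcie.0", 1, 1),
    "pci.2": ("pcie.0", 1, 0),
    "pci.3": ("pcie.0", 1, 2),
}

print(number_buses(as_generated))
print(number_buses(swapped))
```

With the first topology the guest bus numbers line up with the libvirt indexes; with the second, pci.2 becomes bus 1 and pci.1 becomes bus 2, even though the controller ids have not changed.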
in fact, over the last 10 years I've booted thousands of systems, both bare metal and virtualized, and never encountered such a scenario.
that said, it might be a bug in qemu.
what I did see is that in the old vm's guest, after the upgrade, the sound card was defined as a function of the scsi virtblk controller, and the new vm placed
it as a function of a nonexistent device.
I would be very interested in seeing the libvirt XML, QEMU commandline,
and guest-side output of "lspci" for this. I can't think of any way this
could happen without a serious bug *somewhere* (or manual intervention
in the PCI addresses in the guest).
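For comparing the libvirt side against lspci, a small script can list the host and guest address of every hostdev in a domain XML. The following is a minimal sketch using only the Python standard library. The sample XML is a hypothetical fragment (modelled on the 00:1F.3 to 08:01.0 assignment discussed elsewhere in this thread), not the actual config; in practice the input would be the output of `virsh dumpxml`. The function names are made up for illustration, and the parsing assumes the address attributes are written as 0x-prefixed hex, the way libvirt normally formats them.

```python
import xml.etree.ElementTree as ET

# Hypothetical domain XML fragment, for illustration only.
SAMPLE_XML = """
<domain type='kvm'>
  <devices>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x00' slot='0x1f' function='0x3'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x08' slot='0x01' function='0x0'/>
    </hostdev>
  </devices>
</domain>
"""


def format_bdf(address):
    """Format an <address> element as bus:slot.function (lspci style)."""
    bus, slot, function = (
        int(address.get(name), 0) for name in ("bus", "slot", "function")
    )
    return f"{bus:02x}:{slot:02x}.{function:x}"


def hostdev_addresses(xml_text):
    """Return a list of (host address, guest address) pairs, one per hostdev.

    The guest address is None when the config has no address assigned yet.
    """
    pairs = []
    for hostdev in ET.fromstring(xml_text).iter("hostdev"):
        source = hostdev.find("source/address")
        guest = hostdev.find("address")
        if source is None:
            continue  # not a PCI hostdev with a source address
        pairs.append(
            (format_bdf(source), format_bdf(guest) if guest is not None else None)
        )
    return pairs


print(hostdev_addresses(SAMPLE_XML))  # prints [('00:1f.3', '08:01.0')]
```

Keep in mind that only the slot and function shown for the guest side are guaranteed to match lspci in the guest; the bus number may differ.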
On the topic of having a dmi-to-pci-bridge show up in your XML: I don't
remember what versions the changes were in (it was at least a year or
two ago), but only a fairly old version of libvirt would do that - 1)
recent libvirt will assume that any hostdev PCI device is a PCIe device,
so it will add a pcie-root-port and assign the hostdev device to slot 0
of that root-port, and even before that 2) we switched from using
dmi-to-pci-bridge to using pcie-to-pci-bridge quite some time ago as well.
as stated in the original mail, the issue started after a major version upgrade of both libvirt and qemu,
I'm currently using latest stable afaik.
Right. If your guest was defined the first time using a much older
libvirt, then devices would have been assigned to an auto-created
dmi-to-pci-bridge at that time, and if you don't change (or remove) the
PCI addresses of the devices or the bridge, then that will all be
maintained whenever you restart the guest, regardless of libvirt
upgrades. But this again points out that the guest-side PCI addresses
(which are determined by the PCI addresses in the libvirt config) should
not change when upgrading libvirt. (NOTE: 1) libvirt will only
auto-assign a new PCI address to a device if it doesn't already have a
PCI address assigned to it, and 2) libvirt *never* auto-assigns a non-0
function except when adding a pcie-root-port (and in that case it will
always first assign something to function 0).)
So if you're generating new XML based on config that doesn't have pci
controllers already in it, and you're seeing hostdevs (or any other PCI
devices) assigned to an automatically-added dmi-to-pci-bridge, then your
libvirt version is severely out of date.
here are the versions I'm using:
# emerge --search app-emulation/libvirt app-emulation/qemu
[ Results for search key : app-emulation/libvirt ]
Searching...
* app-emulation/libvirt
Latest version available: 7.5.0
Latest version installed: 7.5.0
Size of files: 9749 KiB
Homepage: https://www.libvirt.org/ https://gitlab.com/libvirt/libvirt/
Description: C toolkit to manipulate virtual machines
License: LGPL-2.1
[ Applications found : 1 ]
[ Results for search key : app-emulation/qemu ]
Searching...
* app-emulation/qemu
Latest version available: 6.0.0-r52
Latest version installed: 6.0.0-r52
Size of files: 22724 KiB
Homepage: http://www.qemu.org http://www.linux-kvm.org
Description: QEMU + Kernel-based Virtual Machine userland tools
License: GPL-2 LGPL-2 BSD-2
[ Applications found : 1 ]
On 8/11/21 2:53 PM, daggs wrote:
>> From: "daggs" <daggs@xxxxxxx>
>>> From: "Martin Kletzander" <mkletzan@xxxxxxxxxx>
>>> On Wed, Aug 11, 2021 at 03:09:34PM +0200, daggs wrote:
>>>> I can diff the original xml with the new one to see the diffs and
post them here if you wish
>>>>
>>>
>>> Would be nice to see if there are any differences. The newly created
>>> one works then?
>>
>> I'll sent it later today
>>
>
> here: https://dpaste.com/5VBUU8Z9W
my fix was to move the device to 00:1f.4 in the guest.
That's an interesting choice :-). You could have just put it on function
0 of some other unused slot (or a non-0 function of the slot the GPU is
assigned to). 00:1f is used for integrated devices on the Q35 chipset -
it's nice that QEMU's emulation code was written to allow adding more
devices on that slot, but I wouldn't have been surprised if it had
caused problems...
10 years of working in a virtualization company have taught me that sometimes keeping the pci structure as close as possible
to the original is the best way to go.
that is why I chose it as a func: it is a func on the host machine.
It wasn't a function of a slot that also contains integrated chipset
devices though.... Oh, wait. According to the XML you reference down
below, it looks like the audio device you're assigning to the guest *is
itself* integrated on the chipset of the host, is that right?
(It's interesting that this function of slot 1F is apparently in a
different IOMMU group than the other functions of slot 1F. I would have
guessed they would all be in the same IOMMU group, resulting in an
inability to assign this one function to the guest without at least
disabling the other devices on slot 1F (by binding them to the vfio-pci
driver).)
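The grouping can be checked on the host through sysfs. Below is a minimal Python sketch, assuming the usual Linux layout in which each PCI device directory under /sys/bus/pci/devices has an `iommu_group` symlink pointing into /sys/kernel/iommu_groups. The function name `iommu_group_members` and its `sysfs_root` parameter are hypothetical, and the address 0000:00:1f.3 is only an example taken from this thread.

```python
import os


def iommu_group_members(bdf, sysfs_root="/sys"):
    """Return (group number, sorted member addresses) for a PCI device.

    bdf is a full address such as "0000:00:1f.3". Returns (None, []) when
    the device does not exist or has no IOMMU group (e.g. IOMMU disabled).
    """
    link = os.path.join(sysfs_root, "bus/pci/devices", bdf, "iommu_group")
    if not os.path.exists(link):
        return None, []
    # The symlink resolves to .../iommu_groups/<number>.
    group_dir = os.path.realpath(link)
    members = sorted(os.listdir(os.path.join(group_dir, "devices")))
    return os.path.basename(group_dir), members


group, members = iommu_group_members("0000:00:1f.3")
print(group, members)
```

If the list contains other functions of slot 1F besides the audio device, they would all have to be detached from their host drivers before the assignment could work.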
I wouldn't be surprised if this was why the vm didn't boot after the upgrade with the old xml.
Well, if your XML had a device assigned to a non-0 function of a slot
and no device in function 0 of that slot, it would have failed to work
previously as well (my recollection is that in this case it's more a
problem of the guest OS not probing non-0 functions when there is
nothing on function 0, and not with anything done by QEMU).
here is the xml of the machine after I recreated it; it worked but with no sound: https://dpaste.com/BB9EDY6BK
I used virt-manager. note that the sound card pt is placed as a func in bus 0x8 which doesn't exist.
This doesn't show any devices assigned to non-0 functions in the guest
(which is the part of what you said in previous messages that sounded
wrong to me). (except for the SATA controller, which is listed in the
libvirt config only for informational purposes, as it is hardcoded into
the basic q35 virtual machine and can't be removed).
What it does show is that there is a device at 00:1F.3 *on the host* that
is being assigned to 08:01.0 (slot 1, function 0 of the
pcie-pci-bridge) in the guest. I'm guessing this is the audio device?
Also in this version of the XML, there is no longer a dmi-to-pci-bridge,
but there is instead a pcie-to-pci-bridge, implying that you've
redefined the guest config, resulting in PCI address auto-assignment
being re-run (at least relative to the config you referenced last week
that had a dmi-to-pci-bridge).
It's possible that the audio device's driver just doesn't like the
device being on a standard PCI (i.e. non-PCIe) slot in the guest
somehow, since it's a chipset-integrated PCIe device on the host. I
haven't heard of that being the case in the past, but it's possible.
Anyway, at this point I've lost track of all the changes that have
happened (your update entailed much more than just updating the libvirt
package - your guest config was also changed/redefined) so I don't know
how much more effort should be expended with post-mortem, especially
since you now have it working. One thing that I would note is that we
should probably be auto-assigning integrated chipset devices to
pcie-root-ports rather than to a pci-bridge (I thought we already did
that, but I can see how we might not).