Re: [PATCH 4/4] qemu: don't be as insistent about adding dmi-to-pci-bridge or pci-bridge

Marcel Apfelbaum <marcel@xxxxxxxxxx> · Mon, 25 Apr 2016 22:48:42 +0300

On 04/25/2016 07:09 PM, Laine Stump wrote:
On 04/25/2016 10:53 AM, Marcel Apfelbaum wrote:
On 04/25/2016 05:28 PM, Laine Stump wrote:
On 04/23/2016 12:46 PM, Cole Robinson wrote:
On 04/21/2016 02:48 PM, Laine Stump wrote:
Previously there was no way to have a Q35 domain that didn't have
these two controllers. This patch skips their creation as long as
there are some other kinds of pci controllers at index 1 and 2
(e.g. some pcie-root-port controllers).

I'm hoping that soon we won't add them at all, plugging all devices
into auto-added pcie-*-port ports instead, but in the meantime this
makes it easier to experiment with alternative bus hierarchies.
---
  src/qemu/qemu_domain.c | 18 +++++++++++-------
  1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index 86b7d13..0b342e2 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -1581,14 +1581,18 @@ qemuDomainDefAddDefaultDevices(virDomainDefPtr def,
                VIR_DOMAIN_CONTROLLER_MODEL_PCIE_ROOT)) {
              goto cleanup;
          }
-        if (virDomainDefMaybeAddController(
-               def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 1,
-               VIR_DOMAIN_CONTROLLER_MODEL_DMI_TO_PCI_BRIDGE) < 0 ||
-            virDomainDefMaybeAddController(
-               def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 2,
-               VIR_DOMAIN_CONTROLLER_MODEL_PCI_BRIDGE) < 0) {
+        /* add a dmi-to-pci-bridge and a pci-bridge if there are no pci controllers
+         * other than the pcie-root. This is so that there will be hot-pluggable
+         * PCI slots available
+         */
+        if (virDomainControllerFind(def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 1) < 0 &&
+            !virDomainDefAddController(def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 1,
+                                       VIR_DOMAIN_CONTROLLER_MODEL_DMI_TO_PCI_BRIDGE))
+            goto cleanup;
+        if (virDomainControllerFind(def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 2) < 0 &&
+            !virDomainDefAddController(def, VIR_DOMAIN_CONTROLLER_TYPE_PCI, 2,
+                                       VIR_DOMAIN_CONTROLLER_MODEL_PCI_BRIDGE))
              goto cleanup;
-        }
      }
      if (addDefaultMemballoon && !def->memballoon) {

Sounds like another qemuxml2xml test case candidate

... and it turns out this patch doesn't go far enough to have any useful effect. The problem is that the code that automatically assigns PCI addresses does its best to always be sure that there is at
least one empty hot-pluggable "standard PCI" slot on the system, so even if we don't explicitly add any pci-bridge to a Q35 domain, the address-assignment code will automatically add one anyway; and
since the pci-bridge requires a standard PCI slot itself, we still need a dmi-to-pci-bridge anyway (which provides 32 non-hotpluggable standard PCI slots).

I had been aiming to re-do how libvirt sets up the PCI controllers for Q35, eventually eliminating the current dmi-to-pci-bridge + pci-bridge in favor of a pure PCIe setup, using pcie-root-ports and
pcie-switch-(up|down)stream-ports, and this was going to be the first step towards that. But after discussing that idea more with Alex Williamson on Friday, I think we *shouldn't* do it, but just
leave the PCI controller setup essentially as it is (one possible change - allow connecting pci-bridge directly into a pcie-root port, thus eliminating the dmi-to-pci-bridge)


Connecting the pci-bridge directly into a pcie-root port is a little odd, but I suppose it would work.
Since I am late to the party, can you please explain why you want to eliminate the dmi-to-pci controller? Is it because it doesn't support hotplug?

I guess just because it's an extra device that we use only for the purpose of connecting the pci-bridge. But definitely the entire reason for putting in a pci-bridge is to have non-Express slots that
support hotplug.

Understood.


I have some bad news in that direction. Even the pci-bridge does not support hotplug (of the devices behind it) on Q35; fixing that is on my todo list.

I now remember somebody telling me that within the last several weeks (possibly you?)


Possibly me. I remember talking about it.

So if hotplug doesn't work for pci-bridge slots on Q35, then we currently wouldn't be losing anything if we just used the dmi-to-pci-bridge slots directly (in either case, hotplug won't work).


Correct.

Alex had suggested maybe the dmi-to-pci-bridge could be enhanced to support hotplug, or possibly a similar but generic controller could be added that supported hotplug. What is the difficulty of that
vs. fixing hotplug support on pci-bridge? (getting both would be best, of course). Would it be better to make the slots of the current dmi-to-pci-bridge (i82801b11-bridge) hotpluggable? Or to create a
new device?


I'll speak with Michael about it (I cc-ed him on this mail thread), but in my opinion making the dmi-to-pci-bridge support hotplug
would be the better way, provided that the real i82801b11-bridge supports hotplug.
Otherwise a generic dmi-to-pci bridge would be preferable. Regarding the amount of work, I can't say before digging a little deeper.


What prompted the idea to change to pure PCIe controllers was that qemu allows any PCI device to be plugged into a PCIe slot and it apparently functions just fine, and there had been occasional
complaints that plugging everything into a pci-bridge was somehow problematic (at one point someone suggested that virtio-net didn't work correctly on ARM if plugged into a non-Express slot, but I
think that was later disproved).

The problem with PCIe-only controllers, as Alex pointed out, is that when a non-Express device is plugged into a PCIe slot, the guest OS will see an apparently PCIe device that has no PCIe
capabilities, and while so far this hasn't caused any problem, there is no guarantee that it won't - PCIe devices are supposed to have PCIe capabilities. Since the only emulated qemu device that does
this is the NEC xHCI USB3 controller, it seems that we'll need to keep setting up the pci-bridge and plugging all the rest of the devices in there.

Actually all virtio devices are now PCI Express devices if virtio-1 is enabled. The QEMU command-line is -device virtio-<dev>-pci,disable-modern=false.

So that defaults to true?

Yes.

 How does this relate to the experimental x-disable-pcie setting?

PCIe is a prerequisite to virtio-1 (modern)


 Are they the same thing?

No. A device can be PCIe without supporting virtio-1.

 What happens if you set this and then plug it into a non-Express slot?

It will remain a PCI device (no express capability)

Does setting
this flag change anything else for the device that would, e.g., require a different guest driver or other changes to the qemu commandline?


Not really

It sounds like I can use this - just check for disable-modern on each virtio device when getting qemu capabilities, then prefer a PCIe slot for any unaddressed device that has it.


This is definitely a good idea. Just be sure to connect it to a root port or to a switch downstream port and not to the pcie.0 bus (that is, not as an integrated endpoint).


The only requirement is that the virtio device not be connected directly to the pcie.0 root bus, but to a pcie root port or switch.

So they have to be connected to a *-port even if you don't care about hotplug?

Well, let's say that it is preferable to connect them to a root/downstream port if they have disable-modern=false. (You don't have to.)
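
The placement preference discussed above could be sketched roughly as follows. This is only an illustration of the idea, not libvirt code: the `slot_kind` enum, the `device` struct, and `preferred_slot()` are hypothetical names, and the `modern_enabled` field stands for "the device will be started with disable-modern=false".

```c
#include <stdbool.h>

/* Hypothetical slot kinds for illustration; these are not libvirt enums. */
typedef enum {
    SLOT_STANDARD_PCI, /* conventional PCI slot, e.g. behind a pci-bridge */
    SLOT_PCIE_PORT     /* pcie-root-port or switch downstream port */
} slot_kind;

/* Hypothetical description of a device that has no address yet. */
typedef struct {
    const char *name;
    bool is_virtio;
    bool modern_enabled; /* assumption: started with disable-modern=false */
} device;

/*
 * Prefer a PCIe port for virtio devices that run in modern (virtio-1) mode,
 * since those present themselves as PCI Express devices. Everything else
 * keeps going to a standard PCI slot. Note that the PCIe choice is a port,
 * not the pcie.0 root bus itself.
 */
static slot_kind preferred_slot(const device *dev)
{
    if (dev->is_virtio && dev->modern_enabled)
        return SLOT_PCIE_PORT;
    return SLOT_STANDARD_PCI;
}
```

With this sketch, a virtio device left at the default (disable-modern=true) would still be placed in a standard PCI slot, which matches the statement that such a device remains a plain PCI device.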

=======================================
By the way, even if it is not directly related, another property you should look for is "disable-legacy".
If it is "on" for all devices behind a switch or for the device behind a root port, no IO space would be reserved for them.
This is important because otherwise you can have at most ~15 switches/root ports in the system. (The same as for pci-bridges in a PC machine.)
The problem here is that the flag does affect the guest: if the guest drivers support only legacy devices, you have a problem.
By the way, how can you know in advance what kind of drivers the guest has?
And again, this is another discussion for another time, but I wanted you to at least know about the existence of this flag.
=====================================
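
For reference, the ~15 figure can be reproduced with simple arithmetic. The sketch below assumes the usual x86 numbers: a 64 KiB I/O port space and a 4 KiB minimum I/O window per PCI-to-PCI bridge (root ports and downstream ports look like bridges here). The assumption that one 4 KiB window is unavailable because it holds legacy/platform ports is mine, added only to show where "15" rather than "16" could come from; the constant and function names are hypothetical.

```c
/* All sizes in bytes. */
enum {
    IO_SPACE_BYTES   = 64 * 1024, /* x86 I/O port space */
    BRIDGE_IO_WINDOW = 4 * 1024,  /* bridge I/O base/limit granularity */
    LEGACY_RESERVED  = 4 * 1024   /* assumption: lowest window used by legacy ports */
};

/* Number of bridges/ports that can each still get an I/O window. */
static int max_bridges_with_io(void)
{
    return (IO_SPACE_BYTES - LEGACY_RESERVED) / BRIDGE_IO_WINDOW;
}
```

Under these assumptions the function returns 15. A port whose devices all have disable-legacy=on needs no I/O window, so it would not count against this limit.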

The entire deal with the ports on the root complex is confusing to me - Alex had said the other day that it *does*
happen that non-Express devices end up connected directly to the root complex in real hardware (although probably this violates the spec), so it may be reasonable to expect guest OSes to deal with
that (that, and dgilbert's success doing it, are what gave me the courage to suggest plugging pci-bridge directly into a root complex port).

OK, so the PCIe spec does treat Root Complex Integrated Endpoints (devices on bus pcie.0) differently by allowing the devices that are "inside/part of"
the Root Complex to be legacy devices (PCI). That means that attaching the PCI bridge directly to the root complex should be a valid
(but odd) configuration, as opposed to attaching the PCI bridge to a root port/downstream-port, which is clearly a spec violation.





Still, it would be nice to allow *those who really want to* to have a "pure PCIe" controller setup. We could do that if we did two things:

Sure, pure PCIe setup would be nice.


1) modify the PCI address auto-assignment code to not insist on always having at least one hot-pluggable standard PCI slot available.

and one of the following:

2a) don't always add a dmi-to-pci-bridge, but instead only do so *if necessary* in order to plug in a pci-bridge (which would only be added if necessary).

  or

2b) permit auto-assigning a pci-bridge to plug into a port of pcie-root, thus eliminating the need for dmi-to-pci-bridge completely.

Does anyone have opinions about (1) or (2b)?

I personally don't like 2b because it would not work in the 'real' world, so 1 and 2a seem fine to me.
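
To make the combination of (1) and (2a) concrete, here is a rough sketch of the decision it implies. It is not the libvirt address-assignment code; `bridge_plan` and `plan_bridges()` are hypothetical names, and the model is deliberately simplified to "how many devices actually need a standard PCI slot".

```c
#include <stdbool.h>

/* Hypothetical result type: which controllers to auto-add. */
typedef struct {
    bool add_pci_bridge;
    bool add_dmi_to_pci_bridge;
} bridge_plan;

/*
 * (1)  Do not insist on a spare standard PCI slot: if no device needs one,
 *      add nothing.
 * (2a) Add a pci-bridge only if needed and missing, and add a
 *      dmi-to-pci-bridge only when a newly added pci-bridge needs a place
 *      to plug in and no dmi-to-pci-bridge exists yet.
 */
static bridge_plan plan_bridges(int devices_needing_std_pci,
                                bool have_dmi_to_pci_bridge,
                                bool have_pci_bridge)
{
    bridge_plan plan = { false, false };

    if (devices_needing_std_pci <= 0)
        return plan; /* pure PCIe setup: no bridges added */

    plan.add_pci_bridge = !have_pci_bridge;
    plan.add_dmi_to_pci_bridge = plan.add_pci_bridge && !have_dmi_to_pci_bridge;
    return plan;
}
```

In this sketch a domain whose devices all sit on pcie-*-port controllers ends up with neither bridge, which is the "pure PCIe" case.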

Bottom line, maybe we can tackle the dmi-to-pci bridge to cover your requirements.


I guess you mean "make the dmi-to-pci-bridge slots hotpluggable" by this, right?

Yes, it seems that this is the direction. I am really hoping that the real hardware spec allows this. Otherwise we need a generic dmi-to-pci bridge,
or we could go the Frankenstein way again. :)


Thanks,
Marcel

