On Wed, 2016-02-17 at 15:03 -0500, Laine Stump wrote:
> On 01/28/2016 04:14 PM, Cole Robinson wrote:
> > If a user manually specifies this XML snippet for aarch64 machvirt:
> > 
> >   <controller type='pci' index='0' model='pci-root'/>
> 
> As you've noted below, this isn't correct. aarch64 machvirt has no
> implicit pci-root controller (aka "pci.0"). It instead has a pcie-root
> controller ("pcie.0"). Since a pci[e]-root controller cannot be
> explicitly added, by definition this couldn't work.
> 
> > Libvirt will interpret this to mean that the OS supports virtio-pci,
> > and will allocate PCI addresses (instead of virtio-mmio) for virtio
> > devices.
> > 
> > This is a giant hack. Trying to improve it led me into the maze of
> > PCI address code and I gave up for now. Here are the issues:
> > 
> > * I'd prefer that to be model='pcie-root' which matches what
> >   qemu-system-aarch64 -M virt actually provides by default... however
> >   libvirt isn't happy with a single pcie-root specified by the user,
> >   it will error with:
> > 
> >   error: unsupported configuration: failed to create PCI bridge on
> >   bus 1: too many devices with fixed addresses
> 
> That's not the right error, but it's caused by the fact that libvirt
> wants the pci-bridge device to be plugged into a standard PCI slot, but
> all the slots of pcie-root are PCIe slots. Since we now know that qemu
> doesn't mind if any standard PCI device is plugged into a PCIe slot,

Should we rely on this behavior? Isn't this something that might change
in the future? Or at least be quite puzzling for users?

Just thinking out loud :)

> the decision of how we want to solve this problem depends on whether
> or not we want the devices in question to be hot-pluggable - the ports
> of pcie-root do not support hot-plugging devices (at least on Q35),
> while the ports on pci-bridge do.
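(Side note, to make the two addressing schemes concrete: the practical
difference is which kind of <address> element libvirt ends up generating
for each virtio device - something along these lines, where the slot
value is just illustrative:)

```xml
<!-- current aarch64/virt default: virtio-mmio transport -->
<address type='virtio-mmio'/>

<!-- what the pci-root hack switches to: a PCI address -->
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
```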
> So if we require that all devices be hot-pluggable, then we have a few
> choices:
> 
> 1) create the same PCI controller Frankenstein we currently have for
>    Q35 - a dmi-to-pci-bridge plugged into pcie-root, and a pci-bridge
>    plugged into dmi-to-pci-bridge. This is easiest because it already
>    works, but it does create an extra unnecessary controller.

This is the current situation, right?

qemu-kvm in current aarch64 RHEL doesn't have the i82801b11-bridge
device compiled in, by the way. However, since qemu-system-aarch64 in
Fedora 23 *does* have it, I assume enabling it would simply be a matter
of flipping a build configuration bit.

> 2) auto-add a pci-bridge in cases when there is a pcie-root but no
>    standard PCI slots. This would take only a slight amount more work.
> 
> 3) auto-add a pcie-root-port to each port of the pcie-root controller.
>    This would still leave us with PCIe ports, so we would need to teach
>    libvirt that it's okay to plug PCI devices into PCIe ports.

As mentioned above, I'm not sure this is a good idea. Maybe I'm just
afraid of my own shadow though :)

> If we don't require hot-pluggability, then we can just teach the
> address-assignment code that PCI devices can plug into non-hotpluggable
> PCIe ports and we're done.
> 
> Or we can do a hybrid that's kind of a continuation of "use PCI if
> it's available, otherwise mmio" - we could do this:
> 
> A) If there are any standard PCI slots, then auto-assign to PCI slots
>    (creating new pci-bridge controllers as necessary).
> 
> B) Else if there are any PCIe slots, then auto-assign to hot-pluggable
>    PCIe if available, or straight PCIe if not.
> 
> C) Else use virtio-mmio.
> 
> -------------------------------------------
> 
> Mixed in with all of this discussion is my thinking that we should
> have some way to specify, in XML, constraints for the address of each
> device *without specifying the address itself*. Things we need to be
> able to specify:
> 
> 1) Is it PCI-only vs. PCIe-only vs.
>    either one (maybe this could be used in the future to constrain to
>    virtio-mmio as well)?
> 
> 2) Must the device be hot-pluggable? (default would be yes)
> 
> 3) Guest-side NUMA node? (I'm not sure if this needs to be user
>    specifiable - in the case of a vfio-assigned device, I think all we
>    need is to inform the guest which NUMA node the device is on in the
>    host, via putting it on a PXB controller that is configured with
>    that same NUMA node number. For emulated devices - is there any use
>    to putting an *emulated* device on the same controller as a
>    particular vfio-assigned device that is on a specific node? If not,
>    then maybe it will never matter.)
> 
> It would be better if these "address constraints" were in a different
> part of the XML than the <address> element itself - this would
> maintain the simplicity of being able to just remove all <address>
> elements in order to force libvirt to re-assign all device addresses.
> 
> This isn't something that needs doing immediately, but worth keeping
> in mind while putting together something that works for aarch64.
> 
> > Instead this patch uses hacks to make pci-root use the pcie.0 bus
> > for aarch64, since that code path already works.
> 
> I think that's a dead-end that we would have to back-track on, so
> probably not a good solution even temporarily.
> 
> Here's an attempt at a plan:
> 
> 1) Change the PCI address assignment code so that for aarch64/virt it
>    prefers PCIe addresses, but still requires hot-pluggable (currently
>    it almost always prefers PCI, and requires hot-pluggable).
>    (Alternate: if aarch64 doesn't support pcie-root-port or
>    pcie-switch-*-port, then don't require hot-pluggable either.)
> 
> 2) Put something on the front of that that checks for the existence of
>    pcie-root, and if it's not found, uses virtio-mmio instead (is
>    there something already that auto-adds the virtio-mmio address? I
>    haven't looked and am too lazy to do so now).
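(For reference, the Q35-style "Frankenstein" topology mentioned earlier
- a dmi-to-pci-bridge plugged into pcie-root, with a pci-bridge behind
it - would look roughly like this in the domain XML; the explicit
addresses are only illustrative, shown the way libvirt typically
assigns them:)

```xml
<controller type='pci' index='0' model='pcie-root'/>
<controller type='pci' index='1' model='dmi-to-pci-bridge'>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'/>
</controller>
<controller type='pci' index='2' model='pci-bridge'>
  <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
</controller>
```

Devices needing a hot-pluggable standard PCI slot would then end up
with addresses on bus 0x02, the one provided by the pci-bridge.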
> 
> At this point, as long as you manually add a bunch of pcie-root-port
> controllers along with the manual pcie-root, everything should just
> work. Then we would go to step 3:
> 
> 3) Enhance the auto-assign code so that, in addition to auto-adding a
>    pci-bridge when needed, it would auto-add either a single
>    pcie-root-port or a pcie-switch-upstream-port and 32
>    pcie-switch-downstream-ports anytime a hotpluggable PCIe port was
>    needed and couldn't be found. (The latter assumes that aarch64
>    supports those controllers.)
> 
> Does that make any sense? I could try to code some of this up if you
> could test it (or help me get setup to test it myself).

I'm not sure I fully understand all of the above, but I'll pitch in
with my own proposal regardless :)

First, we make sure that

  <controller type='pci' index='0' model='pcie-root'/>

is always added automatically to the domain XML when using the
mach-virt machine type. Then, if

  <controller type='pci' index='1' model='dmi-to-pci-bridge'/>
  <controller type='pci' index='2' model='pci-bridge'/>

are present as well, we default to virtio-pci; otherwise we use the
current default of virtio-mmio.

This should allow management applications, based on knowledge about
the guest OS, to easily pick between the two address schemes.

Does this sound like a good idea?

Cheers.

-- 
Andrea Bolognani
Software Engineer - Virtualization Team

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list