Re: Sorting out terminology

Bjorn Helgaas <bhelgaas@xxxxxxxxxx> · Thu, 13 Nov 2014 11:30:13 -0700

On Thu, Nov 13, 2014 at 11:04 AM, Jake Oshins <jakeo@xxxxxxxxxxxxx> wrote:
>>> I can see how to create a root PCI bus, and I can see how to do that in
>>> response to events coming in through paravirtual channels.  What's not clear
>>> to me is whether there is a way to tear down a root PCI bus after it has been
>>> created.  The most natural implementation for me would be to create a root
>>> PCI bus whenever a PCI device is offered to the guest VM, tearing down that
>>> root bus when the device is removed from the VM.  Would I be better off
>>> creating one "fake" root PCI bus that lives forever and placing functions
>>> beneath that?  (This seems more complicated to me, unless there's no way
>>> to tear down a root.)
>>
>> I'd start by looking at acpi_pci_root_remove().  Theoretically that
>> removes a PCI host bridge.  I say "theoretically" because this is new
>> code and I haven't seen any bug reports related to it.  Maybe that
>> means it works perfectly, or maybe it means nobody is really using it
>> :)  Yinghai Lu has done most of the work in this area.
>
> Thanks for this advice.  That function was very informative.  I didn't really explain myself properly, though, and so I need to follow up here.  The root PCI bus that I would like to expose is not driven by ACPI.  The only role that ACPI takes in this VM (at least relevant to this discussion) is to create a Module Device (_CID of "ACPI0004") which has a _CRS that exposes all the address space available for all types of I/O devices, both paravirtual and PCI pass-through.  The paravirtual frame buffer, for instance, allocates from this range, and that isn't exposed in any way that looks like a PCI bus.

Makes sense.  I didn't think acpi_pci_root_remove() would be a
fully-formed solution for you; I just pointed you there because it
does things similar to what you want to do.

> When we do SR-IOV for Windows guests, we just create a root PCI bus and place a function underneath the root bus.  No root complex or anything other than the function itself is exposed in the guest.  The paravirtualization is purely a protocol, in contrast to the strategies used with libvirt, Xenbus and others.  There is no ACPI node for the root PCI buses.  There is no emulator for the host bridge, etc.
>
> I'm curious whether this strategy can work for Linux guests.  It seems like my paravirtual PCI front-end could just call the same functions that are invoked in acpi_pci_root_remove().  The only impediment seems to be that pci_stop_root_bus() and pci_remove_root_bus() are not marked with EXPORT_SYMBOL_GPL().
>
> Would it be acceptable to mark these with EXPORT_SYMBOL_GPL()?  Is this not done simply because nobody has needed it before?  Or is there some other reason?

They're not exported simply because we haven't had any need to export
them.  I think they could be exported if necessary.
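
To illustrate, here is a rough sketch of what the teardown path in a
paravirtual front-end might look like if those two functions were
exported.  It follows the order acpi_pci_root_remove() uses (stop, then
remove, under the rescan/remove lock) and leaves out the ACPI-specific
steps.  The name pv_pci_root_teardown() is made up for this example, and
the #else branch is only a userspace stand-in so the ordering can be
checked outside a kernel tree; it is not how the real functions are
implemented.

```c
#ifdef __KERNEL__
#include <linux/pci.h>
#else
/* Userspace stand-ins (assumptions for illustration, not kernel code). */
#include <assert.h>
struct pci_bus { int stopped; int removed; };
static int rescan_lock_depth;
static void pci_lock_rescan_remove(void)   { rescan_lock_depth++; }
static void pci_unlock_rescan_remove(void) { rescan_lock_depth--; }
static void pci_stop_root_bus(struct pci_bus *bus)
{
	bus->stopped = 1;
}
static void pci_remove_root_bus(struct pci_bus *bus)
{
	assert(bus->stopped);	/* the bus must be stopped before removal */
	bus->removed = 1;
}
#endif

/*
 * Hypothetical front-end hook, called when the host takes the
 * passed-through function away from the guest.  In a module this
 * needs pci_stop_root_bus() and pci_remove_root_bus() to be exported.
 */
static void pv_pci_root_teardown(struct pci_bus *root_bus)
{
	pci_lock_rescan_remove();
	pci_stop_root_bus(root_bus);	/* stop the devices below the root bus */
	pci_remove_root_bus(root_bus);	/* remove the devices and the root bus */
	pci_unlock_rescan_remove();
}
```

Whether this is sufficient for a root bus that has no ACPI node behind
it is something you would have to verify.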

>> You said "root port's bridge windows," and I think we can change PCIe
>> Root Port bridge windows in the same way we can change any other
>> PCI-to-PCI bridge's windows.  This is another area that works in
>> theory but is not well-exercised, and here we *do* have bug reports,
>> so I know it doesn't work perfectly.  Yinghai Lu is also the expert in
>> this sort of resource allocation.
>>
>> But it sounds like you might be talking about changing the host
>> bridge's windows, i.e., using _SRS on the host bridge ACPI device, and
>> Linux doesn't have any support for that.
>>
>> Since you mentioned terminology, here's how I use it:
>>
>>   host bridge = a PNP0A03 or PNP0A08 device
>>   root bus = the PCI/PCIe bus immediately below a host bridge (bus
>> number is the _MIN of the bus number range in the host bridge _CRS)
>>   root port = a PCIe Root Port, which we handle as a regular
>> PCI-to-PCI bridge (plus the PCIe services)
>>
>> Bjorn
>
> Similarly, I'm not talking about invoking _SRS on a host bridge.  (I know where the pitfalls are there, and why doing it is a bad idea.  I happen to have implemented that code in Windows.)  I'm talking about changing the resources assigned to a purely paravirtual root PCI bus.  I assume that the protocol for changing the size of PCI-to-PCI bridge windows could be used for this, too.  Do you have a pointer on where to read the code to understand this process?

The existing resource reassignment code is mostly in
drivers/pci/setup-bus.c.  It's possible that if you change the
resources assigned to the paravirtual root bus, that code will do the
right thing.  It's not exactly a case we have today, but it *is*
similar to the case of a device below a P2P bridge whose window size
we changed.

Bjorn