On Tue, Nov 08, 2022 at 03:11:30PM -0600, Bjorn Helgaas wrote: > On Thu, Nov 03, 2022 at 12:32:53PM +0200, Mika Westerberg wrote: > > It is possible to have PCIe switch upstream port a multifunction device. > > I can't quite parse this. I guess the point is that a Switch Upstream > Port may be one of the functions of a multifunction device? Yes. > > The resource distribution code does not take this into account properly > > and therefore it expands the upstream port resource windows too much, > > not leaving space for the other functions (in the multifunction device) > > and this leads to an issue that Jonathan reported. He runs QEMU with > > the following topoology (QEMU parameters): > > > > -device pcie-root-port,port=0,id=root_port13,chassis=0,slot=2 \ > > -device x3130-upstream,id=sw1,bus=root_port13,multifunction=on \ > > -device e1000,bus=root_port13,addr=0.1 \ > > -device xio3130-downstream,id=fun1,bus=sw1,chassis=0,slot=3 \ > > -device e1000,bus=fun1 > > > > The first e1000 NIC here is another function in the switch upstream > > port. This leads to following errors: > > > > pci 0000:00:04.0: bridge window [mem 0x10200000-0x103fffff] to [bus 02-04] > > pci 0000:02:00.0: bridge window [mem 0x10200000-0x103fffff] to [bus 03-04] > > pci 0000:02:00.1: BAR 0: failed to assign [mem size 0x00020000] > > e1000 0000:02:00.1: can't ioremap BAR 0: [??? 0x00000000 flags 0x0] > > > > Fix this by taking into account the possible multifunction devices when > > uptream port resources are distributed. > > Can you include the link to Jonathan's report? Sure I will. > > Reported-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx> > > Signed-off-by: Mika Westerberg <mika.westerberg@xxxxxxxxxxxxxxx> > > --- > > Hi, > > > > This is the formal patch that resulted from the discussion here: > > > > https://lore.kernel.org/linux-pci/20220905080232.36087-5-mika.westerberg@xxxxxxxxxxxxxxx/T/#m724289d0ee0c1ae07628744c283116e60efaeaf1 > > > > Only change from that version is that we loop through all resources of > > the multifunction device. > > > > drivers/pci/setup-bus.c | 63 ++++++++++++++++++++++++++++++++++++++--- > > 1 file changed, 59 insertions(+), 4 deletions(-) > > > > diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c > > index b4096598dbcb..c8787b187ee4 100644 > > --- a/drivers/pci/setup-bus.c > > +++ b/drivers/pci/setup-bus.c > > @@ -1830,10 +1830,65 @@ static void pci_bus_distribute_available_resources(struct pci_bus *bus, > > * bridges below. > > */ > > if (hotplug_bridges + normal_bridges == 1) { > > - dev = list_first_entry(&bus->devices, struct pci_dev, bus_list); > > - if (dev->subordinate) > > - pci_bus_distribute_available_resources(dev->subordinate, > > - add_list, io, mmio, mmio_pref); > > + /* Upstream port must be the first */ > > Do you have any citation or reasoning for this handy? We had this > assumption before, and it seems true that an Upstream Port must be > Function 0 because a variety of Link-related things have to be in > Function 0, e.g., ARI ASPM Control, ARI Clock PM, Autonomous Width > Disable, Flit Mode Disable, LTR Enable, OBFF Enable, etc. But those > are all pretty oblique. > > I guess it's better to have the comment than not, but is the sort of > assertion that makes one wonder why it is true. Unfortunately I was not able to find such reference from the PCIe spec :( > > + bridge = list_first_entry(&bus->devices, struct pci_dev, bus_list); > > + if (!bridge->subordinate) > > + return; > > + > > + /* > > + * It is possible to have switch upstream port as a part > > + * of a multifunction device. For this reason reduce the > > + * resources occupied by the other functions before > > + * distributing the rest. > > The space consumed by the peer functions of the Switch Upstream Port > is determined by their BAR sizes, so I don't think we actually reduce > that. > > I *think* the point here is to reduce the space available for > distribution by the amount required by the peers of the Switch > Upstream Port, right? I.e., "mmio" is the amount of space we have to > distribute, and before splitting it across devices on the secondary > bus, we need to save out the space required for peers on the primary > bus. Yes, I will update the comment accordingly. > > + */ > > + list_for_each_entry(dev, &bus->devices, bus_list) { > > + int i; > > + > > + if (dev == bridge) > > + continue; > > + > > + /* > > + * It should be multifunction but if not stop > > + * the distribution and bail out. > > + */ > > + if (!dev->multifunction) > > + return; > > Why do we bother with this? If there are multiple devices on the bus, > don't we want to consider them all, regardless of whether > dev->multifunction is set? It seems like a gratuitous check. Agreed, I will remove it. > > > + for (i = 0; i < PCI_NUM_RESOURCES; i++) { > > + const struct resource *dev_res = &dev->resource[i]; > > + resource_size_t dev_sz; > > + struct resource *b_res; > > + > > + if (dev_res->flags & IORESOURCE_IO) { > > + b_res = &io; > > + } else if (dev_res->flags & IORESOURCE_MEM) { > > + if (dev_res->flags & IORESOURCE_PREFETCH) > > + b_res = &mmio_pref; > > + else > > + b_res = &mmio; > > + } else { > > + continue; > > + } > > + > > + /* Size aligned to bridge window */ > > + align = pci_resource_alignment(bridge, b_res); > > + dev_sz = ALIGN(resource_size(dev_res), align); > > + > > + pci_dbg(dev, "%pR aligned to %llx\n", dev_res, > > %#llx to avoid confusion and match other output. OK.