Re: [PATCH v3 1/2] PCI: Take other bus devices into account when distributing resources

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Mika,

On Wed, Nov 30, 2022 at 01:22:20PM +0200, Mika Westerberg wrote:
> A PCI bridge may reside on a bus with other devices as well. The
> resource distribution code does not take this into account properly and
> therefore it expands the bridge resource windows too much, not leaving
> space for the other devices (or functions a multifunction device) and

functions *of* a 

> this leads to an issue that Jonathan reported. He runs QEMU with the
> following topoology (QEMU parameters):

topology

>  -device pcie-root-port,port=0,id=root_port13,chassis=0,slot=2	\
>  -device x3130-upstream,id=sw1,bus=root_port13,multifunction=on	\
>  -device e1000,bus=root_port13,addr=0.1 			\
>  -device xio3130-downstream,id=fun1,bus=sw1,chassis=0,slot=3	\
>  -device e1000,bus=fun1

If you use spaces instead of tabs above, the "\" will stay lined up
when git log indents.

> The first e1000 NIC here is another function in the switch upstream
> port. This leads to following errors:
> 
>   pci 0000:00:04.0: bridge window [mem 0x10200000-0x103fffff] to [bus 02-04]
>   pci 0000:02:00.0: bridge window [mem 0x10200000-0x103fffff] to [bus 03-04]
>   pci 0000:02:00.1: BAR 0: failed to assign [mem size 0x00020000]
>   e1000 0000:02:00.1: can't ioremap BAR 0: [??? 0x00000000 flags 0x0]
> 
> Fix this by taking into account the possible multifunction devices when
> uptream port resources are distributed.

"upstream", although I think I would word this so it's less
PCIe-centric.  IIUC, we just want to account for all the BARs on the
bus, whether they're in bridges, peers in a multi-function device, or
other devices.

> Link: https://lore.kernel.org/linux-pci/20221014124553.0000696f@xxxxxxxxxx/
> Reported-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
> Signed-off-by: Mika Westerberg <mika.westerberg@xxxxxxxxxxxxxxx>
> ---
>  drivers/pci/setup-bus.c | 66 ++++++++++++++++++++++++++++++++++++++---
>  1 file changed, 62 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
> index b4096598dbcb..d456175ddc4f 100644
> --- a/drivers/pci/setup-bus.c
> +++ b/drivers/pci/setup-bus.c
> @@ -1830,10 +1830,68 @@ static void pci_bus_distribute_available_resources(struct pci_bus *bus,
>  	 * bridges below.
>  	 */
>  	if (hotplug_bridges + normal_bridges == 1) {
> -		dev = list_first_entry(&bus->devices, struct pci_dev, bus_list);
> -		if (dev->subordinate)
> -			pci_bus_distribute_available_resources(dev->subordinate,
> -				add_list, io, mmio, mmio_pref);
> +		bridge = NULL;
> +
> +		/* Find the single bridge on this bus first */
> +		for_each_pci_bridge(dev, bus) {
> +			bridge = dev;
> +			break;
> +		}

If we just remember "bridge" in the loop before this hunk, could we
get rid of the loop here?  E.g.,

  bridge = NULL;
  for_each_pci_bridge(dev, bus) {
    bridge = dev;
    if (dev->is_hotplug_bridge)
      hotplug_bridges++;
    else
      normal_bridges++;
  }

> +
> +		if (WARN_ON_ONCE(!bridge))
> +			return;

Then I think this would be superfluous.

> +		if (!bridge->subordinate)
> +			return;
> +
> +		/*
> +		 * Reduce the space available for distribution by the
> +		 * amount required by the other devices on the same bus
> +		 * as this bridge.
> +		 */
> +		list_for_each_entry(dev, &bus->devices, bus_list) {
> +			int i;
> +
> +			if (dev == bridge)
> +				continue;

Why do we skip "bridge"?  Bridges are allowed to have two BARs
themselves, and it seems like they should be included here.

> +			for (i = 0; i < PCI_NUM_RESOURCES; i++) {
> +				const struct resource *dev_res = &dev->resource[i];
> +				resource_size_t dev_sz;
> +				struct resource *b_res;
> +
> +				if (dev_res->flags & IORESOURCE_IO) {
> +					b_res = &io;
> +				} else if (dev_res->flags & IORESOURCE_MEM) {
> +					if (dev_res->flags & IORESOURCE_PREFETCH)
> +						b_res = &mmio_pref;
> +					else
> +						b_res = &mmio;
> +				} else {
> +					continue;
> +				}
> +
> +				/* Size aligned to bridge window */
> +				align = pci_resource_alignment(bridge, b_res);
> +				dev_sz = ALIGN(resource_size(dev_res), align);
> +				if (!dev_sz)
> +					continue;
> +
> +				pci_dbg(dev, "resource %pR aligned to %#llx\n",
> +					dev_res, (unsigned long long)dev_sz);
> +
> +				if (dev_sz > resource_size(b_res))
> +					memset(b_res, 0, sizeof(*b_res));
> +				else
> +					b_res->end -= dev_sz;
> +
> +				pci_dbg(bridge, "updated available resources to %pR\n",
> +					b_res);
> +			}
> +		}

This only happens for buses with a single bridge.  Shouldn't it happen
regardless of how many bridges there are?

This block feels like something that could be split out to a separate
function.  It looks like it only needs "bus", "io", "mmio",
"mmio_pref", and maybe "bridge".

I don't understand the "bridge" part; it looks like that's basically
to use 4K alignment for I/O windows and 1M for memory windows?
Using "bridge" seems like a clunky way to figure that out.

Bjorn



[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux