Re: [PATCH v2 1/2] PCI: Take multifunction devices into account when distributing resources

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Nov 21, 2022 at 01:47:16PM +0200, Mika Westerberg wrote:
> On Fri, Nov 18, 2022 at 06:29:51AM -0600, Bjorn Helgaas wrote:
> > On Fri, Nov 18, 2022 at 10:57:12AM +0200, Mika Westerberg wrote:
> > > On Thu, Nov 17, 2022 at 05:10:34PM -0600, Bjorn Helgaas wrote:
> > > > On Mon, Nov 14, 2022 at 01:59:52PM +0200, Mika Westerberg wrote:
> > > > > PCIe switch upstream port may be one of the functions of a multifunction
> > > > > device.
> > > > 
> > > > I don't think this is specific to PCIe, is it?  Can't we have a
> > > > multi-function device where one function is a conventional PCI bridge?
> > > > Actually, I don't think "multi-function" is relevant at all -- you
> > > > iterate over all the devices on the bus below.  For PCIe, that likely
> > > > means multiple functions of the same device, but it could be separate
> > > > devices in conventional PCI.
> > > 
> > > Yes it can be but I was trying to explain the problem we encountered and
> > > that's related to PCIe.
> > > 
> > > I can leave this out if you think it is better that way.
> > 
> > Not necessarily, I'm just hoping this change is generic enough for all
> > PCI and PCIe topologies.
> 
> Okay maybe I'll just drop the "multi-function" part of it?
> 
> > > > > The resource distribution code does not take this into account
> > > > > properly and therefore it expands the upstream port resource windows too
> > > > > much, not leaving space for the other functions (in the multifunction
> > > > > device)
> > > > 
> > > > I guess the window expansion here is done by adjust_bridge_window()?
> > > 
> > > Yes but the resources are distributed in
> > > pci_bus_distribute_available_resources().
> > 
> > Yep, sounds good, I was just confirming my understanding of the code.
> > The main point of this patch is to *reduce* the size of the windows to
> > leave room for peers of the bridge, so I had to look a bit to figure
> > out where they got expanded.
> 
> Understood :)
> 
> > > > >  	if (hotplug_bridges + normal_bridges == 1) {
> > > > > -		dev = list_first_entry(&bus->devices, struct pci_dev, bus_list);
> > > > > -		if (dev->subordinate)
> > > > > -			pci_bus_distribute_available_resources(dev->subordinate,
> > > > > -				add_list, io, mmio, mmio_pref);
> > > > > +		/* Upstream port must be the first */
> > > > > +		bridge = list_first_entry(&bus->devices, struct pci_dev, bus_list);
> > > > > +		if (!bridge->subordinate)
> > > > > +			return;
> > > > > +
> > > > > +		/*
> > > > > +		 * It is possible to have switch upstream port as a part of a
> > > > > +		 * multifunction device. For this reason reduce the space
> > > > > +		 * available for distribution by the amount required by the
> > > > > +		 * peers of the upstream port.
> > > > > +		 */
> > > > > +		list_for_each_entry(dev, &bus->devices, bus_list) {
> > > > 
> > > > It seems like maybe we ought to do this regardless of how many bridges
> > > > there are on the bus.  Don't we always want to assign space to devices
> > > > on this bus before distributing the leftovers to downstream buses?
> > > 
> > > Yes we do.
> > > 
> > > > E.g., maybe this should be done before the adjust_bridge_window()
> > > > calls?
> > > 
> > > With the current code it is clear that we deal with the upstream port.
> > > At least in PCIe it is not allowed to have anything else than downstream
> > > ports on that internal bus so the only case we would need to do this is
> > > the switch upstream port.
> > > 
> > > Let me know if you still want me to move this before adjust_bridge_window()
> > > I can do that in v3. Probably needs a comment too.
> > 
> > Hmm, I don't know exactly how to do this, but I don't think this code
> > should be PCIe-specific, which I think means it shouldn't depend on
> > how many bridges are on the bus.
> > 
> > I guess the existing assumption that a bridge must be the first device
> > on the bus is a hidden assumption that this is PCIe.  That might be a
> > mistake from the past.
> > 
> > I haven't tried it, but I wonder if we could reproduce the same
> > problem in a conventional PCI topology with qemu.
> 
> I'm not an expert in QEMU but I think I was able to create such topology
> with the following command line (parts copied from Jonathan's):
> 
> qemu-system-aarch64                                                             \
>         -M virt,nvdimm=on,gic-version=3 -m 4g,maxmem=8G,slots=8 -cpu max -smp 4 \
>         -bios /usr/share/qemu-efi-aarch64/QEMU_EFI.fd                           \
>         -nographic -no-reboot                                                   \
>         -kernel Image                          					\
>         -initrd rootfs.cpio.bz2              					\
>         -device pcie-pci-bridge,id=br1                                          \
>         -device pci-bridge,chassis_nr=1,bus=br1,id=br2,shpc=on,addr=2           \
>         -device e1000,bus=br1,addr=3                                            \
>         -device e1000,bus=br1,addr=4                                            \
>         -device e1000,bus=br2,addr=5
> 
> However, the problem does not reproduce with or without the patch. The
> below is 'lspci -v' without this patch and 5632e2beaf9d5dda694c0572684dea783d8a9492 reverted:

IIUC, the summary is this:

  00:02.0 bridge window [mem 0x10000000-0x102fffff] to [bus 01-02]
  01:02.0 bridge window [mem 0x10000000-0x100fffff] to [bus 02]
  01:03.0 NIC BAR [mem 0x10200000-0x1021ffff]
  01:04.0 NIC BAR [mem 0x10220000-0x1023ffff]
  02:05.0 NIC BAR [mem 0x10080000-0x1009ffff]

and it's the same with and without the current patch.

Are all these assignments done by BIOS, or did Linux update them?
Did we exercise the same "distribute available resources" path as in
the PCIe case?  I expect we *should*, because there really shouldn't
be any PCI vs PCIe differences in how resources are handled.  This is
why I'm not comfortable with assumptions here that depend on PCIe.

I can't tell from Jonathan's PCIe case whether we got a working config
from BIOS or not because our logging of bridge windows is kind of
poor.

Bjorn



[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux