Re: [PATCH v2 1/2] PCI: Take multifunction devices into account when distributing resources

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

On Fri, Nov 18, 2022 at 06:29:51AM -0600, Bjorn Helgaas wrote:
> Hi Mika,
> 
> On Fri, Nov 18, 2022 at 10:57:12AM +0200, Mika Westerberg wrote:
> > On Thu, Nov 17, 2022 at 05:10:34PM -0600, Bjorn Helgaas wrote:
> > > On Mon, Nov 14, 2022 at 01:59:52PM +0200, Mika Westerberg wrote:
> > > > PCIe switch upstream port may be one of the functions of a multifunction
> > > > device.
> > > 
> > > I don't think this is specific to PCIe, is it?  Can't we have a
> > > multi-function device where one function is a conventional PCI bridge?
> > > Actually, I don't think "multi-function" is relevant at all -- you
> > > iterate over all the devices on the bus below.  For PCIe, that likely
> > > means multiple functions of the same device, but it could be separate
> > > devices in conventional PCI.
> > 
> > Yes it can be but I was trying to explain the problem we encountered and
> > that's related to PCIe.
> > 
> > I can leave this out if you think it is better that way.
> 
> Not necessarily, I'm just hoping this change is generic enough for all
> PCI and PCIe topologies.

Okay maybe I'll just drop the "multi-function" part of it?

> > > > The resource distribution code does not take this into account
> > > > properly and therefore it expands the upstream port resource windows too
> > > > much, not leaving space for the other functions (in the multifunction
> > > > device)
> > > 
> > > I guess the window expansion here is done by adjust_bridge_window()?
> > 
> > Yes but the resources are distributed in
> > pci_bus_distribute_available_resources().
> 
> Yep, sounds good, I was just confirming my understanding of the code.
> The main point of this patch is to *reduce* the size of the windows to
> leave room for peers of the bridge, so I had to look a bit to figure
> out where they got expanded.

Understood :)

> > > >  	if (hotplug_bridges + normal_bridges == 1) {
> > > > -		dev = list_first_entry(&bus->devices, struct pci_dev, bus_list);
> > > > -		if (dev->subordinate)
> > > > -			pci_bus_distribute_available_resources(dev->subordinate,
> > > > -				add_list, io, mmio, mmio_pref);
> > > > +		/* Upstream port must be the first */
> > > > +		bridge = list_first_entry(&bus->devices, struct pci_dev, bus_list);
> > > > +		if (!bridge->subordinate)
> > > > +			return;
> > > > +
> > > > +		/*
> > > > +		 * It is possible to have switch upstream port as a part of a
> > > > +		 * multifunction device. For this reason reduce the space
> > > > +		 * available for distribution by the amount required by the
> > > > +		 * peers of the upstream port.
> > > > +		 */
> > > > +		list_for_each_entry(dev, &bus->devices, bus_list) {
> > > 
> > > It seems like maybe we ought to do this regardless of how many bridges
> > > there are on the bus.  Don't we always want to assign space to devices
> > > on this bus before distributing the leftovers to downstream buses?
> > 
> > Yes we do.
> > 
> > > E.g., maybe this should be done before the adjust_bridge_window()
> > > calls?
> > 
> > With the current code it is clear that we deal with the upstream port.
> > At least in PCIe it is not allowed to have anything else than downstream
> > ports on that internal bus so the only case we would need to do this is
> > the switch upstream port.
> > 
> > Let me know if you still want me to move this before adjust_bridge_window()
> > I can do that in v3. Probably needs a comment too.
> 
> Hmm, I don't know exactly how to do this, but I don't think this code
> should be PCIe-specific, which I think means it shouldn't depend on
> how many bridges are on the bus.
> 
> I guess the existing assumption that a bridge must be the first device
> on the bus is a hidden assumption that this is PCIe.  That might be a
> mistake from the past.
> 
> I haven't tried it, but I wonder if we could reproduce the same
> problem in a conventional PCI topology with qemu.

I'm not an expert in QEMU but I think I was able to create such topology
with the following command line (parts copied from Jonathan's):

qemu-system-aarch64                                                             \
        -M virt,nvdimm=on,gic-version=3 -m 4g,maxmem=8G,slots=8 -cpu max -smp 4 \
        -bios /usr/share/qemu-efi-aarch64/QEMU_EFI.fd                           \
        -nographic -no-reboot                                                   \
        -kernel Image                          					\
        -initrd rootfs.cpio.bz2              					\
        -device pcie-pci-bridge,id=br1                                          \
        -device pci-bridge,chassis_nr=1,bus=br1,id=br2,shpc=on,addr=2           \
        -device e1000,bus=br1,addr=3                                            \
        -device e1000,bus=br1,addr=4                                            \
        -device e1000,bus=br2,addr=5

However, the problem does not reproduce with or without the patch. The
below is 'lspci -v' without this patch and 5632e2beaf9d5dda694c0572684dea783d8a9492 reverted:

00:02.0 PCI bridge: Red Hat, Inc. Device 000e (prog-if 00 [Normal decode])
        Flags: bus master, 66MHz, fast devsel, latency 0, IRQ 47
        Memory at 8000004000 (64-bit, non-prefetchable) [size=256]
        Bus: primary=00, secondary=01, subordinate=02, sec-latency=0
        I/O behind bridge: 1000-2fff [size=8K] [16-bit]
        Memory behind bridge: 10000000-102fffff [size=3M] [32-bit]
        Prefetchable memory behind bridge: [disabled] [64-bit]
        Capabilities: [8c] MSI: Enable- Count=1/1 Maskable+ 64bit+
        Capabilities: [84] Power Management version 3
        Capabilities: [48] Express PCI-Express to PCI/PCI-X Bridge, MSI 00
        Capabilities: [40] Hot-plug capable
        Capabilities: [100] Advanced Error Reporting

01:02.0 PCI bridge: Red Hat, Inc. QEMU PCI-PCI bridge (prog-if 00 [Normal decode])
        Flags: bus master, 66MHz, fast devsel, latency 0, IRQ 45
        Memory at 10240000 (64-bit, non-prefetchable) [size=256]
        Bus: primary=01, secondary=02, subordinate=02, sec-latency=0
        I/O behind bridge: 1000-1fff [size=4K] [16-bit]
        Memory behind bridge: 10000000-100fffff [size=1M] [32-bit]
        Prefetchable memory behind bridge: [disabled] [64-bit]
        Capabilities: [4c] MSI: Enable- Count=1/1 Maskable+ 64bit+
        Capabilities: [48] Slot ID: 0 slots, First+, chassis 01
        Capabilities: [40] Hot-plug capable

01:03.0 Ethernet controller: Intel Corporation 82540EM Gigabit Ethernet Controller (rev 03)
        Subsystem: Red Hat, Inc. QEMU Virtual Machine
        Flags: bus master, fast devsel, latency 0, IRQ 46
        Memory at 10200000 (32-bit, non-prefetchable) [size=128K]
        I/O ports at 2000 [size=64]
        Expansion ROM at 10100000 [disabled] [size=512K]
        Kernel driver in use: e1000

01:04.0 Ethernet controller: Intel Corporation 82540EM Gigabit Ethernet Controller (rev 03)
        Subsystem: Red Hat, Inc. QEMU Virtual Machine
        Flags: bus master, fast devsel, latency 0, IRQ 47
        Memory at 10220000 (32-bit, non-prefetchable) [size=128K]
        I/O ports at 2040 [size=64]
        Expansion ROM at 10180000 [disabled] [size=512K]
        Kernel driver in use: e1000

02:05.0 Ethernet controller: Intel Corporation 82540EM Gigabit Ethernet Controller (rev 03)
        Subsystem: Red Hat, Inc. QEMU Virtual Machine
        Flags: bus master, fast devsel, latency 0, IRQ 46
        Memory at 10080000 (32-bit, non-prefetchable) [size=128K]
        I/O ports at 1000 [size=64]
        Expansion ROM at 10000000 [disabled] [size=512K]
        Kernel driver in use: e1000

And with this patch (5632e2beaf9d5dda694c0572684dea783d8a9492 still reverted):

00:02.0 PCI bridge: Red Hat, Inc. Device 000e (prog-if 00 [Normal decode])
        Flags: bus master, 66MHz, fast devsel, latency 0, IRQ 47
        Memory at 8000004000 (64-bit, non-prefetchable) [size=256]
        Bus: primary=00, secondary=01, subordinate=02, sec-latency=0
        I/O behind bridge: 1000-2fff [size=8K] [16-bit]
        Memory behind bridge: 10000000-102fffff [size=3M] [32-bit]
        Prefetchable memory behind bridge: [disabled] [64-bit]
        Capabilities: [8c] MSI: Enable- Count=1/1 Maskable+ 64bit+
        Capabilities: [84] Power Management version 3
        Capabilities: [48] Express PCI-Express to PCI/PCI-X Bridge, MSI 00
        Capabilities: [40] Hot-plug capable
        Capabilities: [100] Advanced Error Reporting

01:02.0 PCI bridge: Red Hat, Inc. QEMU PCI-PCI bridge (prog-if 00 [Normal decode])
        Flags: bus master, 66MHz, fast devsel, latency 0, IRQ 45
        Memory at 10240000 (64-bit, non-prefetchable) [size=256]
        Bus: primary=01, secondary=02, subordinate=02, sec-latency=0
        I/O behind bridge: 1000-1fff [size=4K] [16-bit]
        Memory behind bridge: 10000000-100fffff [size=1M] [32-bit]
        Prefetchable memory behind bridge: [disabled] [64-bit]
        Capabilities: [4c] MSI: Enable- Count=1/1 Maskable+ 64bit+
        Capabilities: [48] Slot ID: 0 slots, First+, chassis 01
        Capabilities: [40] Hot-plug capable

01:03.0 Ethernet controller: Intel Corporation 82540EM Gigabit Ethernet Controller (rev 03)
        Subsystem: Red Hat, Inc. QEMU Virtual Machine
        Flags: bus master, fast devsel, latency 0, IRQ 46
        Memory at 10200000 (32-bit, non-prefetchable) [size=128K]
        I/O ports at 2000 [size=64]
        Expansion ROM at 10100000 [disabled] [size=512K]
        Kernel driver in use: e1000

01:04.0 Ethernet controller: Intel Corporation 82540EM Gigabit Ethernet Controller (rev 03)
        Subsystem: Red Hat, Inc. QEMU Virtual Machine
        Flags: bus master, fast devsel, latency 0, IRQ 47
        Memory at 10220000 (32-bit, non-prefetchable) [size=128K]
        I/O ports at 2040 [size=64]
        Expansion ROM at 10180000 [disabled] [size=512K]
        Kernel driver in use: e1000

02:05.0 Ethernet controller: Intel Corporation 82540EM Gigabit Ethernet Controller (rev 03)
        Subsystem: Red Hat, Inc. QEMU Virtual Machine
        Flags: bus master, fast devsel, latency 0, IRQ 46
        Memory at 10080000 (32-bit, non-prefetchable) [size=128K]
        I/O ports at 1000 [size=64]
        Expansion ROM at 10000000 [disabled] [size=512K]
        Kernel driver in use: e1000



[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux