Re: Peer bridge fixup issue under multiple pci domain

[+cc EDAC folks, LKML]

On Sat, Aug 25, 2018 at 10:58:57PM +0800, Zihan Yang wrote:
> Hi all,
> 
> I'm trying to use multiple pci domain in qemu q35, but I find there
> might be some issues in peer bridge fixup.
> 
> In short, pcibios_fixup_peer_bridges function assumes only one pci
> domain (0) by default. This is OK, as qemu by default uses only
> one pci domain too. However, if I add another host bridge which is
> put into pci domain 1 by using _SEG, and a pcie_pci_bridge is attached
> to the bus 1 under this new pci domain 1 rather than domain 0, the
> kernel will recognize the bus 01 differently.
> 
> More specifically, pcibios_fixup_peer_bridges only reads all the buses
> under domain 0 but it can read the pci bus 01 in pci domain 1 and treat
> it as a peer bus of 0000:00. The consequence is this 01 bus is recognized
> as 0000:01, but it should have been recognized as 0001:01.
> 
> The host bus 0001:00 can be recognized so I guess pcibios_fixup_peer_bridges
> needs updating to take care of multiple domains? Or is it just a BIOS issue?
> I'm not quite sure and I'm open to any suggestions.

Is there something that actually does not work, or is this just a
concern that the code looks wrong?

pcibios_fixup_peer_bridges() is ancient history from before x86 used
the ACPI namespace to discover host bridges.  It blindly probes for
devices on buses 0-255, but as you say, only in domain 0.
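
To make the "blind probe" concrete, here is a rough userspace sketch.
It is not the kernel's code: struct toy_dev, toy_bus_has_device() and
toy_probe_peer_buses() are made-up names, and the device table stands
in for real config space reads.  The only point is that the domain
passed to the probe is hard-wired to 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one PCI function present in the system. */
struct toy_dev {
	uint16_t domain;	/* PCI segment the device lives in */
	uint8_t bus;
	uint8_t devfn;		/* (device << 3) | function */
};

/* Return true if the toy table has any function on (domain, bus). */
static bool toy_bus_has_device(const struct toy_dev *devs, size_t n,
			       uint16_t domain, uint8_t bus)
{
	for (size_t i = 0; i < n; i++)
		if (devs[i].domain == domain && devs[i].bus == bus)
			return true;
	return false;
}

/*
 * Walk buses 0..255 and record every bus that responds and is not
 * already known.  The domain argument is always 0, so devices in any
 * other domain are never looked at.  Returns the number of buses
 * written to found[].
 */
size_t toy_probe_peer_buses(const struct toy_dev *devs, size_t n,
			    const bool known[256], uint8_t found[256])
{
	size_t count = 0;

	assert(known != NULL && found != NULL);
	for (unsigned int bus = 0; bus < 256; bus++) {
		if (known[bus])
			continue;
		if (toy_bus_has_device(devs, n, 0, (uint8_t)bus))
			found[count++] = (uint8_t)bus;
	}
	return count;
}
```

With a table holding devices on 0000:00, 0000:ff and 0001:01, and
only bus 00 marked as known, this reports bus ff and nothing from
domain 1.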

Using multiple PCI domains really requires ACPI support so we know
what the other domains are (_SEG) and how to access their config space
(MCFG).  When we do have ACPI support in the platform and the kernel,
drivers/acpi/pci_root.c discovers all the host bridges in all domains
via PNP0A03 or PNP0A08 devices in the ACPI namespace, and in most
cases pcibios_fixup_peer_bridges() will do nothing.
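
As an illustration of why MCFG matters here, the sketch below shows
how a config address can be derived once the segment's ECAM base is
known.  The struct and function names are hypothetical and the entry
layout is simplified; it assumes the base address corresponds to bus 0
of the segment and uses the standard ECAM layout of 1 MiB per bus,
32 KiB per device and 4 KiB per function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one MCFG allocation entry. */
struct toy_mcfg_entry {
	uint64_t base;		/* ECAM base for bus 0 of this segment */
	uint16_t segment;	/* segment group, matches _SEG */
	uint8_t start_bus;
	uint8_t end_bus;
};

/* Find the entry covering (segment, bus), or NULL if there is none. */
const struct toy_mcfg_entry *
toy_mcfg_lookup(const struct toy_mcfg_entry *tbl, size_t n,
		uint16_t segment, uint8_t bus)
{
	for (size_t i = 0; i < n; i++)
		if (tbl[i].segment == segment &&
		    bus >= tbl[i].start_bus && bus <= tbl[i].end_bus)
			return &tbl[i];
	return NULL;
}

/* Compute the ECAM address of a config register. */
uint64_t toy_ecam_addr(const struct toy_mcfg_entry *e, uint8_t bus,
		       uint8_t dev, uint8_t fn, uint16_t reg)
{
	assert(e != NULL);
	assert(dev < 32 && fn < 8 && reg < 4096);
	assert(bus >= e->start_bus && bus <= e->end_bus);

	return e->base + ((uint64_t)bus << 20) + ((uint64_t)dev << 15) +
	       ((uint64_t)fn << 12) + reg;
}
```

Without an entry for a given segment, toy_mcfg_lookup() returns NULL
and there is simply no way to form an address for devices in that
domain, which is the situation a domain-0-only probe is stuck in.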

However, there *are* systems where the firmware does not expose all
host bridges and in those cases, pcibios_fixup_peer_bridges() can be a
problem.  For example, Intel processors often have management devices
on bus 7f or ff.  If the ACPI namespace doesn't have a host bridge to
those buses, pci_root.c won't find them, but
pcibios_fixup_peer_bridges() *will*.

This leads to several problems.  Here's a dmesg sample from [1]
(found by googling for 'dmesg log "PCI: discovered peer bus ff"'):

  ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-ff])
  PCI: Discovered peer bus fe
  pci_bus 0000:fe: root bus resource [io  0x0000-0xffff]
  pci_bus 0000:fe: root bus resource [mem 0x00000000-0xffffffffff]
  pci 0000:fe:03.0: [8086:2d98] type 00 class 0x060000
  PCI: Discovered peer bus ff
  pci_bus 0000:ff: root bus resource [io  0x0000-0xffff]
  pci_bus 0000:ff: root bus resource [mem 0x00000000-0xffffffffff]
  pci 0000:ff:03.0: [8086:2d98] type 00 class 0x060000
  EDAC MC1: Giving out device to module i7core_edac.c controller i7 core #1: DEV 0000:fe:03.0 (INTERRUPT)
  EDAC PCI0: Giving out device to module i7core_edac controller EDAC PCI controller: DEV 0000:fe:03.0 (POLLED)
  EDAC MC0: Giving out device to module i7core_edac.c controller i7 core #0: DEV 0000:ff:03.0 (INTERRUPT)
  EDAC PCI1: Giving out device to module i7core_edac controller EDAC PCI controller: DEV 0000:ff:03.0 (POLLED)

Some of the problems are:

  - Firmware may have omitted the host bridges to [bus fe] and
    [bus ff] from the ACPI namespace because *it* is using those
    management devices, so EDAC blindly using them is a potential
    conflict.

  - pcibios_fixup_peer_bridges() only scans domain 0, so if this
    system had multiple domains, EDAC would only work on things in
    domain 0, ignoring other domains.

  - The PCI core can't do bus number assignment correctly for devices
    behind bridge PCI0.  The firmware told us [bus 00-ff] was
    available, so the core may assign bus number fe to some deep
    switch hierarchy.  But bus fe conflicts with the devices on the
    "peer bus fe".  This part is a firmware bug: it should have told
    us that PCI0 leads to [bus 00-fd], not [bus 00-ff].

  - The PCI core can't do resource assignment correctly for devices on
    [bus fe] and [bus ff].  It has no information about which MMIO and
    I/O port ranges are routed to those buses, so it assumes *all*
    memory and I/O ports are routed there, which is clearly incorrect.
    This part is a Linux bug; we really shouldn't be poking around for
    buses that ACPI didn't tell us about.
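
The bus number conflict in the third item can be checked mechanically.
The following is only a sketch with made-up names (struct
toy_bus_range, toy_bus_in_range(), toy_count_conflicts()); it takes
the bus range the firmware reported for a host bridge and counts how
many separately discovered peer buses fall inside it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical inclusive bus range reported for a host bridge. */
struct toy_bus_range {
	uint8_t start;
	uint8_t end;
};

/* True if bus lies inside the range claimed by the host bridge. */
bool toy_bus_in_range(struct toy_bus_range r, uint8_t bus)
{
	return bus >= r.start && bus <= r.end;
}

/*
 * Count peer buses that collide with the bridge's claimed range.
 * Any non-zero result means the core could hand out a bus number
 * that is already in use by a peer bus.
 */
size_t toy_count_conflicts(struct toy_bus_range bridge,
			   const uint8_t *peer, size_t n)
{
	size_t conflicts = 0;

	assert(peer != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (toy_bus_in_range(bridge, peer[i]))
			conflicts++;
	return conflicts;
}
```

For the dmesg sample above, a bridge range of [bus 00-ff] with peer
buses fe and ff gives two conflicts, while [bus 00-fd] gives none.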

Bjorn

[1] https://bugs.freedesktop.org/attachment.cgi?id=136529


