> -----Original Messages----- > From: "Bjorn Helgaas" <helgaas@xxxxxxxxxx> > Sent Time: 2021-05-18 02:28:10 (Tuesday) > To: "Huacai Chen" <chenhuacai@xxxxxxxxx> > Cc: "Huacai Chen" <chenhuacai@xxxxxxxxxxx>, "Bjorn Helgaas" <bhelgaas@xxxxxxxxxx>, linux-pci <linux-pci@xxxxxxxxxxxxxxx>, "Jiaxun Yang" <jiaxun.yang@xxxxxxxxxxx>, "Jingfeng Sui" <suijingfeng@xxxxxxxxxxx> > Subject: Re: [PATCH 5/5] PCI: Support ASpeed VGA cards behind a misbehaving bridge > > On Mon, May 17, 2021 at 08:53:43PM +0800, Huacai Chen wrote: > > On Sat, May 15, 2021 at 5:09 PM Huacai Chen <chenhuacai@xxxxxxxxx> wrote: > > > On Fri, May 14, 2021 at 11:10 PM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote: > > > > On Fri, May 14, 2021 at 04:00:25PM +0800, Huacai Chen wrote: > > > > > According to PCI-to-PCI bridge spec, bit 3 of Bridge Control Register is > > > > > VGA Enable bit which modifies the response to VGA compatible addresses. > > > > > > > > The bridge spec is pretty old, and most of the content has been > > > > incorporated into the PCIe spec. I think you can cite "PCIe r5.0, sec > > > > 7.5.1.3.13" here instead. > > > > > > > > > If the VGA Enable bit is set, the bridge will decode and forward the > > > > > following accesses on the primary interface to the secondary interface. > > > > > > > > *Which* following accesses? The structure of English requires that if > > > > you say "the following accesses," you must continue by *listing* the > > > > accesses. > > > > > > > > > The ASpeed AST2500 hardward does not set the VGA Enable bit on its > > > > > bridge control register, which causes vgaarb subsystem don't think the > > > > > VGA card behind the bridge as a valid boot vga device. > > > > > > > > s/hardward/bridge/ > > > > s/vga/VGA/ (also in code comments and dmesg strings below) > > > > > > > > From the code, it looks like AST2500 ([1a03:2000]) is a VGA device, > > > > since it apparently has a VGA class code. But here you say the > > > > AST2500 has a Bridge Control register, which suggests that it's a > > > > bridge. If AST2500 is some sort of combination that includes both a > > > > bridge and a VGA device, please outline that topology. > > > > > > > > But the hardware defect is that some bridges forward VGA accesses even > > > > though their VGA Enable bit is not set? The quirk should be attached > > > > to broken *bridges*, not to VGA devices. > > > > > > > > If a bridge forwards VGA accesses regardless of how its VGA Enable bit > > > > is set, that means VGA arbitration (in vgaarb.c) cannot work > > > > correctly, so merely setting the default VGA device once in a quirk is > > > > not sufficient. You would have to somehow disable any future attempts > > > > to use other VGA devices. Only the VGA device below this defective > > > > bridge is usable. Any other VGA devices in the system would be > > > > useless. > > > > > > > > > So we provide a quirk to fix Xorg auto-detection. > > > > > > > > > > See similar bug: > > > > > > > > > > https://patchwork.kernel.org/project/linux-pci/patch/20170619023528.11532-1-dja@xxxxxxxxxx/ > > > > > > > > This patch was never merged. If we merged a revised version, please > > > > cite the SHA1 instead. > > > > > > This patch has never merged, and I found that it is unnecessary after > > > commit a37c0f48950b56f6ef2ee637 ("vgaarb: Select a default VGA device > > > even if there's no legacy VGA"). Maybe this ASpeed patch is also > > > unnecessary. If it is still needed, I'll investigate the root cause. > > > > I found that vga_arb_device_init() and pcibios_init() are both wrapped > > by subsys_initcall(), which means their sequence is unpredictable. And > > unfortunately, in our platform vga_arb_device_init() is called before > > pcibios_init(), which makes vga_arb_device_init() fail to set a > > default vga device. This is the root cause why we thought that we > > still need a quirk for AST2500. > > Does this mean there is no hardware defect here? The VGA Enable bit > works correctly? The AST2500 BMC card we are using consist of a PCI-to-PCI Bridge (1a03:1150) and a PCI VGA device (1a03:2000). The value of the Bridge Control register in its PCI-to-PCI Bridge Configuration Registers is 0x0000. Thus, the VGA Enable bit in the Bridge Control register do not set, and the VGA Enable bit is not writable. The topology of this BMC card is illustrated as following: /sys/devices/pci0000:00 |-- 0000:00:0c.0 | |-- class (0x060400) | |-- vendor (0x0014) | |-- device (0x7a09) | |-- ... | |-- 0000:04:00.0 | | | -- class (0x060400) | | | -- device (0x1150) | | | -- vendor (0x1a03) | | | -- revision (0x04) | | | -- ... | | | -- 0000:05:00.0 | | | | -- class (0x030000) | | | | -- device (0x2000) | | | | -- vendor (0x1a03) | | | | -- irq (51) | | | | -- i2c-6 | | | | -- drm | | | | -- graphics | | | | -- ... | `-- uevent `-- ... The following information is getted from lspci -vvxx: 04:00.0 PCI bridge: ASPEED Technology, Inc. AST1150 PCI-to-PCI Bridge (rev 04) (prog-if 00 [Normal decode]) Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <tabort- <mabort-="">SERR- <perr- intx-="" latency:="" 0="" interrupt:="" pin="" a="" routed="" to="" irq="" 51="" numa="" node:="" bus:="" primary="04," secondary="05," subordinate="05," sec-latency="0" i="" o="" behind="" bridge:="" 00004000-00004fff="" memory="" 41000000-427fffff="" status:="" 66mhz+="" fastb2b-="" parerr-="" devsel="medium">TAbort- <tabort- <mabort-="" <serr-="" <perr-="" bridgectl:="" parity-="" serr-="" noisa-="" vga-="" mabort-="">Reset- FastB2B- PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn- Capabilities: <access denied=""> 00: 03 1a 50 11 07 00 10 00 04 00 04 06 00 00 01 00 10: 00 00 00 00 00 00 00 00 04 05 05 00 41 41 20 02 20: 00 41 70 42 f1 ff 01 00 00 00 00 00 00 00 00 00 30: 00 00 00 00 50 00 00 00 00 00 00 00 63 01 00 00 05:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics Family (rev 41) (prog-if 00 [VGA controller]) Subsystem: ASPEED Technology, Inc. ASPEED Graphics Family Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop+ ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <tabort- <mabort-="">SERR- <perr- intx-="" latency:="" 0="" interrupt:="" pin="" a="" routed="" to="" irq="" 51="" numa="" node:="" region="" 0:="" memory="" at="" 41000000="" (32-bit,="" non-prefetchable)="" [size="16M]" 1:="" 42000000="" 2:="" i="" o="" ports="" 4000="" capabilities:="" <access="" denied=""> Kernel driver in use: ast 00: 03 1a 00 20 27 00 10 02 41 00 00 03 00 00 00 00 10: 00 00 00 41 00 00 00 42 01 40 00 00 00 00 00 00 20: 00 00 00 00 00 00 00 00 00 00 00 00 03 1a 00 20 30: 00 00 00 00 40 00 00 00 00 00 00 00 63 01 00 00 > > > I think the best solution is make vga_arb_device_init() be wrapped by > > subsys_initcall_sync(), do you think so? > > Hmm. Unfortunately the semantics of subsys_initcall_sync() are not > documented, so I'm not sure exactly *why* such a change would work and > whether we could rely on it to continue working. > > pcibios_init() isn't very consistent across arches. On some, > including alpha, microblaze, some MIPS platforms, powerpc, and sh, it > enumerates PCI devices. On others (ia64, parisc, sparc, x86), it does > basically nothing. That makes life a little difficult. > > vga_arb_device_init() is a subsys_initcall() and wants to look through > all the PCI devices. That's a little problematic for arches where > pcibios_init() is also a subsys_initcall() and enumerates PCI devices. > > Sorry, that's no answer for you. Just more questions :) > > > > > > Signed-off-by: Huacai Chen <chenhuacai@xxxxxxxxxxx> > > > > > Signed-off-by: Jingfeng Sui <suijingfeng@xxxxxxxxxxx> > > > > > --- > > > > > drivers/pci/quirks.c | 47 ++++++++++++++++++++++++++++++++++++++++++++ > > > > > 1 file changed, 47 insertions(+) > > > > > > > > > > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c > > > > > index 6ab4b3bba36b..adf5490706ad 100644 > > > > > --- a/drivers/pci/quirks.c > > > > > +++ b/drivers/pci/quirks.c > > > > > @@ -28,6 +28,7 @@ > > > > > #include <linux platform_data="" x86="" apple.h=""> > > > > > #include <linux pm_runtime.h=""> > > > > > #include <linux switchtec.h=""> > > > > > +#include <linux vgaarb.h=""> > > > > > #include <asm dma.h=""> /* isa_dma_bridge_buggy */ > > > > > #include "pci.h" > > > > > > > > > > @@ -297,6 +298,52 @@ static void loongson_mrrs_quirk(struct pci_dev *dev) > > > > > } > > > > > DECLARE_PCI_FIXUP_ENABLE(PCI_ANY_ID, PCI_ANY_ID, loongson_mrrs_quirk); > > > > > > > > > > + > > > > > +static void aspeed_fixup_vgaarb(struct pci_dev *pdev) > > > > > +{ > > > > > + struct pci_dev *bridge; > > > > > + struct pci_bus *bus; > > > > > + struct pci_dev *vdevp = NULL; > > > > > + u16 config; > > > > > + > > > > > + bus = pdev->bus; > > > > > + bridge = bus->self; > > > > > + > > > > > + /* Is VGA routed to us? */ > > > > > + if (bridge && (pci_is_bridge(bridge))) { > > > > > + pci_read_config_word(bridge, PCI_BRIDGE_CONTROL, &config); > > > > > + > > > > > + /* Yes, this bridge is PCI bridge-to-bridge spec compliant, > > > > > + * just return! > > > > > + */ > > > > > + if (config & PCI_BRIDGE_CTL_VGA) > > > > > + return; > > > > > + > > > > > + dev_warn(&pdev->dev, "VGA bridge control is not enabled\n"); > > > > > + } > > > > > > > > You cannot assume that a bridge is defective just because > > > > PCI_BRIDGE_CTL_VGA is not set. > > > > > > > > > + /* Just return if the system already have a default device */ > > > > > + if (vga_default_device()) > > > > > + return; > > > > > + > > > > > + /* No default vga device */ > > > > > + while ((vdevp = pci_get_class(PCI_CLASS_DISPLAY_VGA << 8, vdevp))) { > > > > > + if (vdevp->vendor != 0x1a03) { > > > > > + /* Have other vga devcie in the system, do nothing */ > > > > > + dev_info(&pdev->dev, "Another boot vga device: 0x%x:0x%x\n", > > > > > + vdevp->vendor, vdevp->device); > > > > > + return; > > > > > + } > > > > > + } > > > > > + > > > > > + vga_set_default_device(pdev); > > > > > + > > > > > + dev_info(&pdev->dev, "Boot vga device set as 0x%x:0x%x\n", > > > > > + pdev->vendor, pdev->device); > > > > > +} > > > > > +DECLARE_PCI_FIXUP_CLASS_FINAL(0x1a03, 0x2000, PCI_CLASS_DISPLAY_VGA, 8, aspeed_fixup_vgaarb); > > > > > + > > > > > + > > > > > /* > > > > > * The Mellanox Tavor device gives false positive parity errors. Disable > > > > > * parity error reporting. > > > > > -- > > > > > 2.27.0 > > > > > </asm></linux></linux></linux></linux></suijingfeng@xxxxxxxxxxx></chenhuacai@xxxxxxxxxxx></perr-></tabort-></access></tabort-></perr-></tabort-></helgaas@xxxxxxxxxx></chenhuacai@xxxxxxxxx></suijingfeng@xxxxxxxxxxx></jiaxun.yang@xxxxxxxxxxx></linux-pci@xxxxxxxxxxxxxxx></bhelgaas@xxxxxxxxxx></chenhuacai@xxxxxxxxxxx></chenhuacai@xxxxxxxxx></helgaas@xxxxxxxxxx>