On Fri, Jun 04, 2021 at 12:50:03PM +0800, Huacai Chen wrote: > On Thu, Jun 3, 2021 at 2:31 AM Bjorn Helgaas <bhelgaas@xxxxxxxxxx> wrote: > > > > [+cc linux-pci] > > > > On Wed, Jun 2, 2021 at 11:22 AM Daniel Vetter <daniel@xxxxxxxx> wrote: > > > On Wed, Jun 02, 2021 at 06:36:03PM +0800, Huacai Chen wrote: > > > > On Wed, Jun 2, 2021 at 2:03 AM Daniel Vetter <daniel@xxxxxxxx> wrote: > > > > > On Tue, Jun 01, 2021 at 07:12:27PM +0200, Greg KH wrote: > > > > > > On Tue, Jun 01, 2021 at 05:56:40PM +0200, Daniel Vetter wrote: > > > > > > > On Fri, May 28, 2021 at 04:26:07PM +0800, Huacai Chen wrote: > > > > > > > > We should call vga_arb_device_init() after PCI enumeration, otherwise it > > > > > > > > may fail to select the default VGA device. Since vga_arb_device_init() > > > > > > > > and PCI enumeration function (i.e., pcibios_init() or acpi_init()) are > > > > > > > > both wrapped by subsys_initcall(), their sequence is not assured. So, we > > > > > > > > use subsys_initcall_sync() instead of subsys_initcall() to wrap vga_arb_ > > > > > > > > device_init(). > > > > > > > > > > > > Trying to juggle levels like this always fails if you build the code as > > > > > > a module. > > > > > > > > > > > > Why not fix it properly and handle the out-of-order loading by returning > > > > > > a "deferred" error if you do not have your resources yet? > > > > > > > > > > It's not a driver, it's kinda a bolted-on-the-side subsytem of pci. So not > > > > > something you can -EPROBE_DEFER I think, without potentially upsetting the > > > > > drivers that need this. > > > > > > > > > > Which might mean we should move this into pci subsystem proper perhaps? > > > > > Then adding the init call at the right time becomes trivial since we just > > > > > plug it in at the end of pci init. > > > > > > > > > > Also maybe that's how distros avoid this pain, pci is built-in, vgaarb is > > > > > generally a module, problem solved. > > > > > > > > > > Bjorn, would you take this entire vgaarb.c thing? From a quick look I > > > > > don't think it has a drm-ism in it (unlike vga_switcheroo, but that works > > > > > a bit differently and doesn't have this init order issue). > > > > Emmm, this patch cannot handle the hotplug case and module case, it > > > > just handles the case that vgaarb, drm driver and pci all built-in. > > > > But I think this is enough, because the original problem only happens > > > > on very few BMC-based VGA cards (BMC doesn't set the VGA Enable bit on > > > > the bridge, which breaks vgaarb). > > > > > > I'm not talking aout hotplug, just ordering the various pieces correctly. > > > That vgaarb isn't really a driver and also can't really handle hotplug is > > > my point. I guess that got lost a bit? > > > > > > Anyway my proposal is essentially to do a > > > > > > $ git move drivers/gpu/vga/vgaarb.c drivers/pci > > > > > > But I just realized that vgaarb is a bool option, so module isn't possible > > > anyway, and we could fix this by calling vgaarb from pcibios init (with an > > > empty static inline in the header if vgaarb is disabled). That makes the > > > dependency very explicit and guarantees it works correctly. > > > > pcibios_init() is also an initcall and is implemented by every arch. > > I agree that calling vga_arb_device_init() directly from > > pcibios_init() would probably fix this problem, and it would be really > > nice to have it not be an initcall. But it's also kind of a pain to > > have to update all those copies of pcibios_init(), and I would be > > looking for a way to unify it since it's not really an arch-specific > > thing. > > > > I think the simplest solution, which I suggested earlier [1], would be > > to explicitly call vga_arbiter_add_pci_device() directly from the PCI > > core when it enumerates a VGA device. Then there's no initcall and no > > need for the BUS_NOTIFY_ADD/DEL_DEVICE stuff. > > vga_arbiter_add_pci_device() could set the default VGA device when it > > is enumerated, and change the default device if we enumerate a > > "better" one. And hotplug VGA devices would work automatically. > Emm, It seems that your solution has some difficulties to remove the > whole initcall(vga_arb_device_init): we call > vga_arbiter_add_pci_device() in pci_bus_add_device(), the > list_for_each_entry() loop can be moved to > vga_arbiter_check_bridge_sharing(), vga_arb_select_default_device() > can be renamed to vga_arb_update_default_device() and be called in > vga_arbiter_add_pci_device(), but how to handle > misc_register(&vga_arb_device)? Might need to keep vga_arb_device_init() as an initcall, but remove everything from it except the misc_register().