Hi Daniel,

On Tue, Apr 5, 2022 at 1:48 PM Daniel Vetter <daniel@xxxxxxxx> wrote:
> On Tue, 5 Apr 2022 at 11:52, Javier Martinez Canillas
> <javierm@xxxxxxxxxx> wrote:
> > On 4/5/22 11:24, Daniel Vetter wrote:
> > > On Tue, 5 Apr 2022 at 11:19, Javier Martinez Canillas
> > >> This is how I think that works, please let me know if you see something
> > >> wrong in my logic:
> > >>
> > >> 1) A PCI device or OF device is registered for the GPU, this attempts to
> > >>    match a registered driver but no driver was registered that matches yet.
> > >>
> > >> 2) The efifb driver is built-in, will be initialized according to the link
> > >>    order of the objects under drivers/video and the fbdev driver is registered.
> > >>
> > >>    There is no platform device or PCI/OF device registered that matches.
> > >>
> > >> 3) The DRM driver is built-in, will be initialized according to the link
> > >>    order of the objects under drivers/gpu and the DRM driver is registered.
> > >>
> > >>    This matches the device registered in (1) and the DRM driver probes.
> > >>
> > >> 4) The DRM driver .probe kicks out any conflicting DRM drivers and pdev
> > >>    before registering the DRM device.
> > >>
> > >>    There are no conflicting drivers or platform device at this point.
> > >>
> > >> 5) Later at some point the drivers/firmware/sysfb.c init function is
> > >>    executed, and this registers a platform device for the generic fb.
> > >>
> > >>    This device matches the efifb driver registered in (2) and the fbdev
> > >>    driver probes.
> > >>
> > >>    Since that happens *after* the DRM driver already matched, probed
> > >>    and registered the DRM device, that is a bug and what the reverted
> > >>    patch worked around.
> > >>
> > >> So we need to prevent (5) if (1) and (3) already happened. Having a flag
> > >> set in the fbdev core somewhere when remove_conflicting_framebuffers()
> > >> is called could be a solution indeed.
> > >>
> > >> That is, the fbdev core needs to know that a DRM driver already probed
> > >> and make register_framebuffer() fail if info->flag & FBINFO_MISC_FIRMWARE
> > >>
> > >> I can attempt to write a patch for that.
> > >
> > > Ah yeah that could be an issue. I think the right fix is to replace
> > > the platform dev unregister with a sysfb_unregister() function in
> > > sysfb.c, which is synced with a common lock with the sysfb_init
> > > function and a small boolean. I think I can type that up quickly for
> > > v3.
> >
> > It's more complicated than that since sysfb is just *one* of the several
> > places where platform devices can be registered for video devices.
> >
> > For instance, the vga16fb driver registers its own platform device in
> > its module_init() function so that can also happen after the conflicting
> > framebuffers (and associated devices) were removed by a DRM driver probe.
> >
> > I tried to minimize the issue for that particular driver with commit:
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0499f419b76f
> >
> > But the point stands, it all boils down to the fact that you have two
> > different subsystems registering video drivers and they don't know all
> > about each other to take a proper decision.
> >
> > Right now the drm_aperture_remove_conflicting_framebuffers() call signals
> > in one direction from DRM to fbdev but there isn't a communication in the
> > other direction, from fbdev to DRM.
> >
> > I believe the correct fix would be for the fbdev core to keep a list of
> > the apertures struct that are passed to remove_conflicting_framebuffers(),
> > that way it will know what apertures are not available anymore and prevent
> > to register any fbdev framebuffer that conflicts with one already present.
>
> Hm that still feels like reinventing a driver model, badly.
>
> I think there's two cleaner solutions:
> - move all the firmware driver platform_dev into sysfb.c, and then
>   just bind the special cases against that (e.g. offb, vga16fb and all
>   these). Then we'd have one sysfb_try_unregister(struct device *dev)
>   interface that fbmem.c uses.
> - let fbmem.c call into each of these firmware device providers, which
>   means some loops most likely (like we can't call into vga16fb), so
>   probably need to move that into fbmem.c and it all gets a bit messy.
>
> > Let me know if you think that makes sense and I can attempt to write a fix.
>
> I still think unregistering the platform_dev properly makes the most

That doesn't sound very driver-model-aware to me.
The device is what the driver binds to; it does not cease to exist.

> sense, and feels like the most proper linux device model solution
> instead of hacks on top - if the firmware fb is unusable because a
> native driver has taken over, we should nuke that. And also the
> firmware fb driver would then just bind to that platform_dev if it
> exists, and only if it exists. Also I think it should be the
> responsibility of whichever piece of code that registers these
> platform devices to ensure that platform_dev actually still exists.
> That's why I think pushing all that code into sysfb.c is probably the
> cleanest solution.

Can't you unbind the generic driver first, and bind the specific driver
afterwards? Like writing to sysfs unbind/driver_override/bind, but from
code?

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds