On Tue, May 10, 2022 at 09:50:38AM +0200, Javier Martinez Canillas wrote: > On 5/10/22 09:19, Andrzej Hajda wrote: > > > > > > On 10.05.2022 00:42, Javier Martinez Canillas wrote: > >> On 5/10/22 00:22, Andrzej Hajda wrote: > >> > >> [snip] > >> > >>>> static void drm_fbdev_fb_destroy(struct fb_info *info) > >>>> { > >>>> + if (info->cmap.len) > >>>> + fb_dealloc_cmap(&info->cmap); > >>>> + > >>>> drm_fbdev_release(info->par); > >>>> + framebuffer_release(info); > >>> I would put drm_fbdev_release at the beginning - it cancels workers > >>> which could expect cmap to be still valid. > >>> > >> Indeed, you are correct again. [0] is the final version of the patch I've > >> but don't have an i915 test machine to give it a try. I'll test tomorrow > >> on my test systems to verify that it doesn't cause any regressions since > >> with other DRM drivers. > >> > >> I think that besides this patch, drivers shouldn't need to call to the > >> drm_fb_helper_fini() function directly. Since that would be called during > >> drm_fbdev_fb_destroy() anyways. > >> > >> We should probably remove that call in all drivers and make this helper > >> function static and just private to drm_fb_helper functions. > >> > >> Or am I missing something here ? > > > > This is question for experts :) > > Fair. I'm definitely not one of them :) > > > I do not know what are user API/ABI expectations regarding removal of > > fbdev driver, I wonder if they are documented somewhere :) > > I don't know. At least I haven't found them. > > > Apparently we have some process of 'zombification' here - we need to > > remove the driver without waiting for userspace closing framebuffer(???) > > (to unbind ops-es and remove references to driver related things), but > > we need to leave some structures to fool userspace, 'info' seems to be > > one of them. > > That's correct, yes. I think that any driver that provides a .mmap file > operation would have the same issue. But drivers keep an internal state > and just return -ENODEV or whatever on read/write/close after a removal. Just commenting on the mmap part here. I think there's two options: - shadow buffer for fbdev defio, and keep the shadow buffer around until fb_destroy - redirect fbdev mmap fully to gem mmap, and make sure the gem mmap is hotunplug safe. The approach amd folks are pushing for that we discussed is to replace them all with a dummy r/w page, because removing the ptes means you can get a SIGBUS almost anywhere in application code, and that violates like all the assumptions behind gl/vk and would just crash your desktop. Reading/writing garbage otoh is generally much better. So yeah hotunplug safe fbdev mmap is still quite a bit of work ... Cheers, Daniel > > The fbdev subsystem is different though since as you said it, the fbdev > core unconditionally calls to the driver .fb_release() callback with a > struct fb_info reference as argument. > > I tried to prevent that with commit aafa025c76dc ("fbdev: Make fb_release() > return -ENODEV if fbdev was unregistered") but Daniel pointed out that > is was wrong since could leak memory allocated and was expected to be > freed on release. > > That's why I instead fixed the issue in the fbdev drivers and just added > a warn on fb_release(), that is $SUBJECT. > > > So I guess there should be something called on driver's _remove path, > > and sth on destroy path. > > > > That was my question actually, do we need something to be called in the > destroy path ? Since that could just be internal to the DRM fb helpers. > > In other words, drivers should only care about setting a generic fbdev > by calling drm_fbdev_generic_setup(), and then do any HW cleanup in the > removal path, but let the fb helpers to handle the SW cleanup in destroy. > > -- > Best regards, > > Javier Martinez Canillas > Linux Engineering > Red Hat > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch