On Sun, 24 Jul 2022 at 00:09, Javier Martinez Canillas <javierm@xxxxxxxxxx> wrote: > > Drivers' .remove and .shutdown callbacks are executed on different code > paths. The former is called when a device is removed from the bus, while > the latter is called at system shutdown time to quiesce the device. > > This means that some overlap exists between the two, because both have to > take care of properly shutting down the hardware. But currently the logic > used in these two callbacks isn't consistent in msm drivers, which could > lead to kernel oops. > > For example, on .remove the component is deleted and its .unbind callback > leads to the hardware being shutdown but only if the DRM device has been > marked as registered. > > That check doesn't exist in the .shutdown logic and this can lead to the > driver calling drm_atomic_helper_shutdown() for a DRM device that hasn't > been properly initialized. > > A situation when this can happen is when a driver for an expected device > fails to probe, since the .bind callback will never be executed. > > This bug was attempted to be fixed in commit 623f279c7781 ("drm/msm: fix > shutdown hook in case GPU components failed to bind"), but unfortunately > it still happens in some cases. > > Rather than trying to keep fixing in both places, let's unify in a single > helper function that could be used for the two callbacks. > > Signed-off-by: Javier Martinez Canillas <javierm@xxxxxxxxxx> > --- > > drivers/gpu/drm/msm/msm_drv.c | 31 +++++++++++++++++-------------- > 1 file changed, 17 insertions(+), 14 deletions(-) > > diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c > index 1ed4cd09dbf8..669891bd6f09 100644 > --- a/drivers/gpu/drm/msm/msm_drv.c > +++ b/drivers/gpu/drm/msm/msm_drv.c > @@ -190,14 +190,8 @@ static int vblank_ctrl_queue_work(struct msm_drm_private *priv, > return 0; > } > > -static int msm_drm_uninit(struct device *dev) > +static void msm_shutdown_hw(struct drm_device *dev) > { > - struct platform_device *pdev = to_platform_device(dev); > - struct msm_drm_private *priv = platform_get_drvdata(pdev); > - struct drm_device *ddev = priv->dev; > - struct msm_kms *kms = priv->kms; > - int i; > - > /* > * Shutdown the hw if we're far enough along where things might be on. > * If we run this too early, we'll end up panicking in any variety of > @@ -205,10 +199,21 @@ static int msm_drm_uninit(struct device *dev) > * msm_drm_init, drm_dev->registered is used as an indicator that the > * shutdown will be successful. > */ > - if (ddev->registered) { > + if (dev->registered) > + drm_atomic_helper_shutdown(dev); > +} > + > +static int msm_drm_uninit(struct device *dev) > +{ > + struct platform_device *pdev = to_platform_device(dev); > + struct msm_drm_private *priv = platform_get_drvdata(pdev); > + struct drm_device *ddev = priv->dev; > + struct msm_kms *kms = priv->kms; > + int i; > + > + if (ddev->registered) > drm_dev_unregister(ddev); No. The drm_dev_unregister() should come before drm_atomic_helper_shutdown(). Also drm_dev_unregister() should not be a part of .shutdown callback. See the documentation in the drm_drv.c > - drm_atomic_helper_shutdown(ddev); > - } > + msm_shutdown_hw(ddev); > > /* We must cancel and cleanup any pending vblank enable/disable > * work before msm_irq_uninstall() to avoid work re-enabling an > @@ -1242,10 +1247,8 @@ void msm_drv_shutdown(struct platform_device *pdev) > struct msm_drm_private *priv = platform_get_drvdata(pdev); > struct drm_device *drm = priv ? priv->dev : NULL; > > - if (!priv || !priv->kms) > - return; > - > - drm_atomic_helper_shutdown(drm); > + if (drm) > + msm_shutdown_hw(drm); > } > > static struct platform_driver msm_platform_driver = { > -- > 2.37.1 > -- With best wishes Dmitry