Quoting Rob Clark (2021-03-24 20:09:37) > On Wed, Mar 24, 2021 at 6:49 PM Stephen Boyd <swboyd@xxxxxxxxxxxx> wrote: > > > > Quoting Dmitry Baryshkov (2021-03-18 13:05:44) > > > if GPU components have failed to bind, shutdown callback would fail with > > > the following backtrace. Add safeguard check to stop that oops from > > > happening and allow the board to reboot. > > [...] > > > diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c > > > index 94525ac76d4e..fd2ac54caf9f 100644 > > > --- a/drivers/gpu/drm/msm/msm_drv.c > > > +++ b/drivers/gpu/drm/msm/msm_drv.c > > > @@ -1311,6 +1311,10 @@ static int msm_pdev_remove(struct platform_device *pdev) > > > static void msm_pdev_shutdown(struct platform_device *pdev) > > > { > > > struct drm_device *drm = platform_get_drvdata(pdev); > > > + struct msm_drm_private *priv = drm ? drm->dev_private : NULL; > > > + > > > + if (!priv || !priv->kms) > > > + return; > > > > > > > I see a problem where if I don't get a backlight probing then my > > graphics card doesn't appear but this driver is still bound. I was > > hoping this patch would fix it but it doesn't. I have slab poisoning > > enabled so sometimes the 'priv' pointer is 0x6b6b6b6b6b6b6b6b meaning it > > got all freed. > > > > I found that the 'drm' pointer here is pointing at junk. The > > msm_drm_init() function calls drm_dev_put() on the error path and that > > will destroy the drm pointer but it doesn't update this platform drivers > > drvdata. Do we need another patch that sets the drvdata to NULL on > > msm_drm_init() failing? One last note, I'm seeing this on 5.4 so maybe I > > missed something and the drvdata has been set to NULL somewhere else > > upstream. I sort of doubt it though. > > the hw that I guess you are running on should work pretty well w/ > upstream kernel.. but I don't think there is any important delta > between upstream and the 5.4 based kernel that you are running that > would fix this.. > > so *probably* you are right.. linux-next is failing like this today for me on Lazor right after the screen turns on. I'll have to figure out what's wrong before checking upstream. [ 10.734752] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000080 [ 10.744482] Mem abort info: [ 10.747462] ESR = 0x96000006 [ 10.750644] EC = 0x25: DABT (current EL), IL = 32 bits [ 10.756125] SET = 0, FnV = 0 [ 10.759290] EA = 0, S1PTW = 0 [ 10.762543] Data abort info: [ 10.765519] ISV = 0, ISS = 0x00000006 [ 10.769485] CM = 0, WnR = 0 [ 10.772553] user pgtable: 4k pages, 39-bit VAs, pgdp=0000000123474000 [ 10.779212] [0000000000000080] pgd=0800000123475003, p4d=0800000123475003, pud=0800000123475003, pmd=0000000000000000 [ 10.790128] Internal error: Oops: 96000006 [#1] PREEMPT SMP [ 10.795856] Modules linked in: ath10k_snoc qmi_helpers ath10k_core ath mac80211 cfg80211 r8152 mii joydev [ 10.805705] CPU: 5 PID: 1576 Comm: DrmThread Not tainted 5.12.0-rc4-next-20210324+ #13 [ 10.813832] Hardware name: Google Lazor (rev3+) with KB Backlight (DT) [ 10.820535] pstate: 80400009 (Nzcv daif +PAN -UAO -TCO BTYPE=--) [ 10.826703] pc : dpu_plane_atomic_update+0x80/0xcb8 [ 10.831730] lr : dpu_plane_restore+0x5c/0x88 [ 10.836117] sp : ffffffc012963920 [ 10.839521] x29: ffffffc0129639c0 x28: ffffffed5c9ad000 [ 10.844979] x27: ffffffed5c736000 x26: ffffffed5ca3f000 [ 10.850443] x25: ffffffed5c736000 x24: 0000000000000000 [ 10.855903] x23: 0000000000000000 x22: ffffff80ad007400 [ 10.861361] x21: ffffff8085193808 x20: 0000000000000000 [ 10.866818] x19: ffffff8085193800 x18: 0000000000000008 [ 10.872274] x17: 0000000000800000 x16: 0000000020000000 [ 10.877738] x15: 0000000000000001 x14: 0000000000000000 [ 10.883201] x13: ffffff80852324a8 x12: 0000000000000008 [ 10.888657] x11: ffffffed5c3b7890 x10: 0000000000000000 [ 10.894112] x9 : 0000000000000000 x8 : 0000000000000000 [ 10.899570] x7 : 0000000000004000 x6 : 0000000000010000 [ 10.905026] x5 : 0000000000040000 x4 : 0000000000000800 [ 10.910482] x3 : 0000000000000000 x2 : 0000000000020041 [ 10.915946] x1 : ffffff80ad2e2600 x0 : ffffff8085193800 [ 10.921402] Call trace: [ 10.923923] dpu_plane_atomic_update+0x80/0xcb8 [ 10.928585] dpu_plane_restore+0x5c/0x88 [ 10.932620] dpu_crtc_atomic_flush+0xd4/0x1a0 [ 10.937105] drm_atomic_helper_commit_planes+0x1b4/0x1e0 [ 10.942565] msm_atomic_commit_tail+0x2d4/0x670 [ 10.947223] commit_tail+0xac/0x148 [ 10.950814] drm_atomic_helper_commit+0x104/0x10c [ 10.955653] drm_atomic_commit+0x58/0x68 [ 10.959686] drm_mode_atomic_ioctl+0x438/0x51c [ 10.964261] drm_ioctl_kernel+0xa8/0x124 [ 10.968295] drm_ioctl+0x24c/0x3ec [ 10.971800] drm_compat_ioctl+0xe0/0xf4 [ 10.975745] __arm64_compat_sys_ioctl+0xcc/0x104 [ 10.980499] el0_svc_common+0xa4/0x128 [ 10.984358] do_el0_svc_compat+0x2c/0x38 [ 10.988395] el0_svc_compat+0x20/0x30 [ 10.992164] el0_sync_compat_handler+0xc0/0xf0 [ 10.996734] el0_sync_compat+0x174/0x180 [ 11.000774] Code: d0003d61 91204821 52800020 97fe8c65 (39420288)