Hi, On Mon, 29 Nov 2021 at 17:15, AngeloGioacchino Del Regno <angelogioacchino.delregno@xxxxxxxxxxxxx> wrote: > > Il 29/11/21 03:20, Dmitry Baryshkov ha scritto: > > Hi, > > > > On 25/11/2021 18:09, AngeloGioacchino Del Regno wrote: > >> Since commit 8f59ee9a570c ("drm/msm/dsi: Adjust probe order"), the > >> DSI host gets initialized earlier, but this caused unability to probe > >> the entire stack of components because they all depend on interrupts > >> coming from the main `mdss` node (mdp5, or dpu1). > >> > >> To fix this issue, also anticipate probing mdp5 or dpu1 by initializing > >> them at msm_pdev_probe() time: this will make sure that we add the > >> required interrupt controller mapping before dsi and/or other components > >> try to initialize, finally satisfying the dependency. > >> > >> While at it, also change the allocation of msm_drm_private to use the > >> devm variant of kzalloc(). > >> > >> Fixes: 8f59ee9a570c ("drm/msm/dsi: Adjust probe order") > >> Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@xxxxxxxxxxxxx> > > > > I have been thinking about this. I do not feel that this is the correct approach. > > Currently DRM device exists only when all components are bound. If any of the > > subdevices is removed, corresponding component is delteted (and thus all components > > are unbound), the DRM device is taken down. This results in the state cleanup, > > userspace notifications, etc. > > > > With your changes, DRM device will continue to exist even after one of subdevices > > is removed. This is not an expected behaviour, since subdrivers do not perform full > > cleanup, delegating that to DRM device takedown. > > > > I suppose that proper solution would be to split msm_drv.c into into: > > - generic components & drm code to be called from mdp4/mdp5/dpu driver (making > > mdp4, mdp5 or dpu1 the components master) > > > > - bare mdss driver, taking care only about IRQs, OF devices population - calling > > proper mdss_init/mdss_destroy functions. Most probably we can drop this part > > altogether and just make md5_mdss.c/dpu_mdss.c proper platform drivers. > > > > > Hmm... getting a better look on how things are structured... yes, I mostly agree > with you, though I'm not sure about making MDP{4,5}/DPU1 the component master; that > would result in a major change in drm-msm, which may be "a bit too much". > > Don't misunderstand me here, please, major changes are fine - but I feel urgency > to get this bug solved ASAP (since drm-msm is currently broken at least for drm > bridges) and, if we do anything major, that would require a very careful slow > review process that will leave this driver broken for a lot of time. Yep. I wanted to hear your and Rob's opinion, that's why I did not rush into implementing it. I still think this is the way to go in the long term. What I really liked in your patchset was untying the knot between component binds returning EPROBE_DEFER and mdss subdevices being in place. This solved the "dsi host registration" infinite loop for me. > > I actually tried something else that "simplifies" the former approach, so here's > my proposal: > * we introduce {mdp5,dpu}_mdss_early_init(struct device, struct msm_drm_private) > * allocate only msm_drm_private in msm_pdev_probe, leaving the drm_dev_alloc call > into msm_drm_init(), so that the drm_dev_put() stays in msm_drm_uninit() > * pass msm_drm_private as drvdata instead of drm_device > * change all the drvdata users to get drm_device from priv->dev, instead of getting > msm_drm_private from drm_device->dev_private (like many other drm drivers are > currently doing) This sounds in a way that it should work. I'm looking forward to seeing (and testing) your patches. > > This way, we keep the current flow of creating the DRM device at msm_drm_init time > and tearing it down at msm_drm_unbind time, solving the issue that you are > describing. > > If you're okay with this kind of approach, I have two patches here that are 95% > ready, can finish them off and send briefly. > > Though, something else must be noted here... in the last mail where I'm pasting > a crash that happens when running 'rmmod panel_edp ti_sn65dsi86', I have implied > that this is happening due to the patch that I've sent: after some more research, > I'm not convinced anymore that this is a consequence of that. That crash may not > be related to my fix at all, but to something else (perhaps also related to commit > 8f59ee9a570c, the one that we're fixing here). > > Of course, that crash still happens even with the approach that I've just proposed. > > > Looking forward for your opinion! > > Cheers, > - Angelo -- With best wishes Dmitry