Hi Stephen
On 8/10/2022 5:09 PM, Stephen Boyd wrote:
Quoting Kuogee Hsieh (2022-08-10 16:57:51)
On 8/10/2022 3:22 PM, Stephen Boyd wrote:
Quoting Kuogee Hsieh (2022-08-10 12:25:51)
diff --git a/drivers/gpu/drm/msm/dp/dp_display.c b/drivers/gpu/drm/msm/dp/dp_display.c
index b36f8b6..678289a 100644
--- a/drivers/gpu/drm/msm/dp/dp_display.c
+++ b/drivers/gpu/drm/msm/dp/dp_display.c
@@ -1729,10 +1729,20 @@ void dp_bridge_disable(struct drm_bridge *drm_bridge)
struct msm_dp_bridge *dp_bridge = to_dp_bridge(drm_bridge);
struct msm_dp *dp = dp_bridge->dp_display;
struct dp_display_private *dp_display;
+ u32 state;
dp_display = container_of(dp, struct dp_display_private, dp_display);
+ mutex_lock(&dp_display->event_mutex);
+
+ state = dp_display->hpd_state;
+ if (state != ST_DISCONNECT_PENDING && state != ST_CONNECTED) {
It's concerning that we have to check this at all. Are we still
interjecting into the disable path when the cable is disconnected?
yes,
The problem is not from cable disconnected.
There is a corner case that this function is called at drm shutdown
(drm_release).
At that time, mainlink is not enabled, hence dp_ctrl_push_idle() will
cause system crash.
The mainlink is only disabled when the cable is disconnected though?
Let me put it this way, if we have to check that the state is
"connected" or "disconnected pending" in the disable path then there's
an issue where this driver is being called in unexpected ways. This
driver is fighting the drm core each time there's a state check. We
really need to get rid of the state tracking entirely, and make sure
that the drm core is calling into the driver at the right time, i.e.
bridge disable is only called when the mainlink is enabled, etc.
So if link training failed, we do not send a uevent to usermode and will
bail out here:
rc = dp_ctrl_on_link(dp->ctrl);
if (rc) {
DRM_ERROR("failed to complete DP link training\n");
goto end;
}
So this commit is not coming from usermode but from the drm_release() path.
Even then, you do have a valid point. DRM framework should not have
caused the disable path to happen without an enable.
I went through the stack mentioned in the issue.
Lets see this part of the stack:
dp_ctrl_push_idle+0x40/0x88
dp_bridge_disable+0x24/0x30
drm_atomic_bridge_chain_disable+0x90/0xbc
drm_atomic_helper_commit_modeset_disables+0x198/0x444
msm_atomic_commit_tail+0x1d0/0x374
In drm_atomic_helper_commit_modeset_disables(), we call disable_outputs().
AFAICT, this is the only place which has a protection to not call the
disable() flow if it was not enabled here:
https://gitlab.freedesktop.org/drm/msm/-/blob/msm-next/drivers/gpu/drm/drm_atomic_helper.c#L1063
But that function is only checking crtc_state->active. Thats set by the
usermode:
https://gitlab.freedesktop.org/drm/msm/-/blob/msm-next/drivers/gpu/drm/drm_atomic_uapi.c#L407
Now, if usermode sets that to true and then crashed it can bypass this
check and we will crash in the location kuogee is trying to fix.
From the issue mentioned in
https://gitlab.freedesktop.org/drm/msm/-/issues/17, the reporter did
mention the usermode crashed.
So this is my tentative analysis of whats happening here.
Ideally yes, we should have been protected by the location mentioned
above in disable_outputs() but looks to me due to the above hypothesis
its getting bypassed.
Thanks
Abhinav