On Wed, 2023-04-19 at 16:54 -0400, Lyude Paul wrote:
> Reviewed-by: Lyude Paul <lyude@xxxxxxxxxx>
>
> Thanks!
>
> On Wed, 2023-04-19 at 07:24 -0400, Jeff Layton wrote:
> > I've been experiencing some intermittent crashes down in the display
> > driver code. The symptoms are usually a line like this in dmesg:
> >
> > amdgpu 0000:30:00.0: [drm] Failed to create MST payload for port 000000006d3a3885: -5
> >
> > ...followed by an Oops due to a NULL pointer dereference.
> >
> > Switch to using mgr->dev instead of state->dev since "state" can be
> > NULL in some cases.
> >
> > Link: https://bugzilla.redhat.com/show_bug.cgi?id=2184855
> > Suggested-by: Jani Nikula <jani.nikula@xxxxxxxxxxxxxxx>
> > Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx>
> > ---
> >  drivers/gpu/drm/display/drm_dp_mst_topology.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > I've been running this patch for a couple of days, but the problem
> > hasn't occurred again as of yet. It seems sane though as long as we can
> > assume that mgr->dev will be valid even when "state" is a NULL pointer.
> >
> > diff --git a/drivers/gpu/drm/display/drm_dp_mst_topology.c b/drivers/gpu/drm/display/drm_dp_mst_topology.c
> > index 38dab76ae69e..e2e21ce79510 100644
> > --- a/drivers/gpu/drm/display/drm_dp_mst_topology.c
> > +++ b/drivers/gpu/drm/display/drm_dp_mst_topology.c
> > @@ -3404,7 +3404,7 @@ int drm_dp_add_payload_part2(struct drm_dp_mst_topology_mgr *mgr,
> >
> >  	/* Skip failed payloads */
> >  	if (payload->vc_start_slot == -1) {
> > -		drm_dbg_kms(state->dev, "Part 1 of payload creation for %s failed, skipping part 2\n",
> > +		drm_dbg_kms(mgr->dev, "Part 1 of payload creation for %s failed, skipping part 2\n",
> >  			    payload->port->connector->name);
> >  		return -EIO;
> >  	}
>

Thanks!
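
For anyone following along, here is a minimal userspace sketch of why the one-line change in the quoted patch avoids the crash. The struct and function names (fake_device, fake_mgr, fake_state, log_device_name) are hypothetical stand-ins, not the kernel's definitions. The sketch assumes, as the patch does, that the manager pointer is always valid even when the state pointer is NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
struct fake_device { const char *name; };
struct fake_mgr    { struct fake_device *dev; };
struct fake_state  { struct fake_device *dev; };

/*
 * Return the name of the device a debug message would be logged against.
 * Only mgr is dereferenced; state is accepted but never touched, because
 * it may be NULL on the failure path.
 */
static const char *log_device_name(const struct fake_mgr *mgr,
				   const struct fake_state *state)
{
	(void)state;          /* may be NULL: reading state->dev would crash */
	assert(mgr != NULL);  /* assumption: the manager is always valid */
	return mgr->dev->name;
}
```

Calling log_device_name(&mgr, NULL) returns the manager's device name without ever reading through the NULL state pointer, which is the same property the patched drm_dbg_kms() call relies on.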
BTW, I've had a couple more of these events in the last few days:

[20199.446159] amdgpu 0000:30:00.0: [drm] Failed to create MST payload for port 00000000556eb455: -5
[20199.508379] [drm] DM_MST: stopping TM on aconnector: 000000001c0c0284 [id: 86]
[20200.064417] [drm] DM_MST: starting TM on aconnector: 000000001c0c0284 [id: 86]

The patch prevents an Oops, but GNOME seems to decide that a different
monitor is primary and moves all of the windows on the desktop around (I
have 2 monitors). Mostly this seems to happen when I walk away from the
machine for a bit, so I suspect it's associated with the display going
to sleep.

At one point, Wayne said he might know the root cause of this. If there
are patches that you need help testing, I can do that. I'm having to
build my own kernels anyway until this patch makes it into the distros.

-- 
Jeff Layton <jlayton@xxxxxxxxxx>