On Wed, 14 Nov 2018 at 08:47, Lyude Paul <lyude@xxxxxxxxxx> wrote: > > Jerry Zuo pointed out a rather obscure hotplugging issue that it seems I > accidentally introduced into DRM two years ago. > > Pretend we have a topology like this: > > |- DP-1: mst_primary > |- DP-4: active display > |- DP-5: disconnected > |- DP-6: active hub > |- DP-7: active display > |- DP-8: disconnected > |- DP-9: disconnected > > If we unplug DP-6, the topology starting at DP-7 will be destroyed but > it's payloads will live on in DP-1's VCPI allocations and thus require > removal. However, this removal currently fails because > drm_dp_update_payload_part1() will (rightly so) try to validate the port > before accessing it, fail then abort. If we keep going, eventually we > run the MST hub out of bandwidth and all new allocations will start to > fail (or in my case; all new displays just start flickering a ton). > > We could just teach drm_dp_update_payload_part1() not to drop the port > ref in this case, but then we also need to teach > drm_dp_destroy_payload_step1() to do the same thing, then hope no one > ever adds anything to the that requires a validated port reference in > drm_dp_destroy_connector_work(). Kind of sketchy. > > So let's go with a more clever solution: any port that > drm_dp_destroy_connector_work() interacts with is guaranteed to still > exist in memory until we say so. While said port might not be valid we > don't really care: that's the whole reason we're destroying it in the > first place! So, teach drm_dp_get_validated_port_ref() to use the all > mighty current_work() function to avoid attempting to validate ports > from the context of mgr->destroy_connector_work. I can't see any > situation where this wouldn't be safe, and this avoids having to play > whack-a-mole in the future of trying to work around port validation. > > Signed-off-by: Lyude Paul <lyude@xxxxxxxxxx> > Fixes: 263efde31f97 ("drm/dp/mst: Get validated port ref in drm_dp_update_payload_part1()") > Reported-by: Jerry Zuo <Jerry.Zuo@xxxxxxx> > Cc: Jerry Zuo <Jerry.Zuo@xxxxxxx> > Cc: Harry Wentland <Harry.Wentland@xxxxxxx> > Cc: <stable@xxxxxxxxxxxxxxx> # v4.6+ Much as I constantly feel we are missing some better way to do all of this, this seems like a reasonable fix. Reviewed-by: Dave Airlie <airlied@xxxxxxxxxx> > --- > drivers/gpu/drm/drm_dp_mst_topology.c | 15 +++++++++++++-- > 1 file changed, 13 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c b/drivers/gpu/drm/drm_dp_mst_topology.c > index 529414556962..08978ad72f33 100644 > --- a/drivers/gpu/drm/drm_dp_mst_topology.c > +++ b/drivers/gpu/drm/drm_dp_mst_topology.c > @@ -1023,9 +1023,20 @@ static struct drm_dp_mst_port *drm_dp_mst_get_port_ref_locked(struct drm_dp_mst_ > static struct drm_dp_mst_port *drm_dp_get_validated_port_ref(struct drm_dp_mst_topology_mgr *mgr, struct drm_dp_mst_port *port) > { > struct drm_dp_mst_port *rport = NULL; > + > mutex_lock(&mgr->lock); > - if (mgr->mst_primary) > - rport = drm_dp_mst_get_port_ref_locked(mgr->mst_primary, port); > + /* > + * Port may or may not be 'valid' but we don't care about that when > + * destroying the port and we are guaranteed that the port pointer > + * will be valid until we've finished > + */ > + if (current_work() == &mgr->destroy_connector_work) { > + kref_get(&port->kref); > + rport = port; > + } else if (mgr->mst_primary) { > + rport = drm_dp_mst_get_port_ref_locked(mgr->mst_primary, > + port); > + } > mutex_unlock(&mgr->lock); > return rport; > } > -- > 2.19.1 > > _______________________________________________ > dri-devel mailing list > dri-devel@xxxxxxxxxxxxxxxxxxxxx > https://lists.freedesktop.org/mailman/listinfo/dri-devel