On Mon, Apr 6, 2020 at 3:43 PM Lyude Paul <lyude@xxxxxxxxxx> wrote: > > On Mon, 2020-04-06 at 15:41 -0400, Sean Paul wrote: > > On Fri, Apr 3, 2020 at 4:08 PM Lyude Paul <lyude@xxxxxxxxxx> wrote: > > > Currently we only poll for an ACT up to 30 times, with a busy-wait delay > > > of 100µs between each attempt - giving us a timeout of 2900µs. While > > > this might seem sensible, it would appear that in certain scenarios it > > > can take dramatically longer then that for us to receive an ACT. On one > > > of the EVGA MST hubs that I have available, I observed said hub > > > sometimes taking longer then a second before signalling the ACT. These > > > delays mostly seem to occur when previous sideband messages we've sent > > > are NAKd by the hub, however it wouldn't be particularly surprising if > > > it's possible to reproduce times like this simply by introducing branch > > > devices with large LCTs since payload allocations have to take effect on > > > every downstream device up to the payload's target. > > > > > > So, instead of just retrying 30 times we poll for the ACT for up to 3ms, > > > and additionally use usleep_range() to avoid a very long and rude > > > busy-wait. Note that the previous retry count of 30 appears to have been > > > arbitrarily chosen, as I can't find any mention of a recommended timeout > > > or retry count for ACTs in the DisplayPort 2.0 specification. This also > > > goes for the range we were previously using for udelay(), although I > > > suspect that was just copied from the recommended delay for link > > > training on SST devices. > > > > > > Signed-off-by: Lyude Paul <lyude@xxxxxxxxxx> > > > Fixes: ad7f8a1f9ced ("drm/helper: add Displayport multi-stream helper > > > (v0.6)") > > > Cc: Sean Paul <sean@xxxxxxxxxx> > > > Cc: <stable@xxxxxxxxxxxxxxx> # v3.17+ > > > --- > > > drivers/gpu/drm/drm_dp_mst_topology.c | 26 +++++++++++++++++++------- > > > 1 file changed, 19 insertions(+), 7 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c > > > b/drivers/gpu/drm/drm_dp_mst_topology.c > > > index 7aaf184a2e5f..f313407374ed 100644 > > > --- a/drivers/gpu/drm/drm_dp_mst_topology.c > > > +++ b/drivers/gpu/drm/drm_dp_mst_topology.c > > > @@ -4466,17 +4466,30 @@ static int drm_dp_dpcd_write_payload(struct > > > drm_dp_mst_topology_mgr *mgr, > > > * @mgr: manager to use > > > * > > > * Tries waiting for the MST hub to finish updating it's payload table by > > > - * polling for the ACT handled bit. > > > + * polling for the ACT handled bit for up to 3 seconds (yes-some hubs > > > really > > > + * take that long). > > > * > > > * Returns: > > > * 0 if the ACT was handled in time, negative error code on failure. > > > */ > > > int drm_dp_check_act_status(struct drm_dp_mst_topology_mgr *mgr) > > > { > > > - int count = 0, ret; > > > + /* > > > + * There doesn't seem to be any recommended retry count or timeout > > > in > > > + * the MST specification. Since some hubs have been observed to > > > take > > > + * over 1 second to update their payload allocations under certain > > > + * conditions, we use a rather large timeout value. > > > + */ > > > + const int timeout_ms = 3000; > > > + unsigned long timeout = jiffies + msecs_to_jiffies(timeout_ms); > > > + int ret; > > > + bool retrying = false; > > > u8 status; > > > > > > do { > > > + if (retrying) > > > + usleep_range(100, 1000); > > > + > > > ret = drm_dp_dpcd_readb(mgr->aux, > > > DP_PAYLOAD_TABLE_UPDATE_STATUS, > > > &status); > > > @@ -4488,13 +4501,12 @@ int drm_dp_check_act_status(struct > > > drm_dp_mst_topology_mgr *mgr) > > > > > > if (status & DP_PAYLOAD_ACT_HANDLED) > > > break; > > > - count++; > > > - udelay(100); > > > - } while (count < 30); > > > + retrying = true; > > > + } while (jiffies < timeout); > > > > Somewhat academic, but I think there's an overflow possibility here if > > timeout is near ulong_max and jiffies overflows during the usleep. In > > that case we'll be retrying for a very loong time. > > > > I wish we had i915's wait_for() macro available to all drm... > > Maybe we could add it to the kernel library somewhere? I don't see why we'd > need to stop at DRM So You Want To Build A Bikeshed... Seriously though, I'd be very happy with that. Alternatively you could shoehorn this into readx_poll_timeout as well. Sean > > > > > Sean > > > > > if (!(status & DP_PAYLOAD_ACT_HANDLED)) { > > > - DRM_DEBUG_KMS("failed to get ACT bit %d after %d > > > retries\n", > > > - status, count); > > > + DRM_DEBUG_KMS("failed to get ACT bit %d after %dms\n", > > > + status, timeout_ms); > > > return -EINVAL; > > > } > > > return 0; > > > -- > > > 2.25.1 > > > > -- > Cheers, > Lyude Paul (she/her) > Associate Software Engineer at Red Hat >