On Fri, Apr 3, 2020 at 4:08 PM Lyude Paul <lyude@xxxxxxxxxx> wrote: > > Currently we only poll for an ACT up to 30 times, with a busy-wait delay > of 100µs between each attempt - giving us a timeout of 2900µs. While > this might seem sensible, it would appear that in certain scenarios it > can take dramatically longer then that for us to receive an ACT. On one > of the EVGA MST hubs that I have available, I observed said hub > sometimes taking longer then a second before signalling the ACT. These > delays mostly seem to occur when previous sideband messages we've sent > are NAKd by the hub, however it wouldn't be particularly surprising if > it's possible to reproduce times like this simply by introducing branch > devices with large LCTs since payload allocations have to take effect on > every downstream device up to the payload's target. > > So, instead of just retrying 30 times we poll for the ACT for up to 3ms, > and additionally use usleep_range() to avoid a very long and rude > busy-wait. Note that the previous retry count of 30 appears to have been > arbitrarily chosen, as I can't find any mention of a recommended timeout > or retry count for ACTs in the DisplayPort 2.0 specification. This also > goes for the range we were previously using for udelay(), although I > suspect that was just copied from the recommended delay for link > training on SST devices. > > Signed-off-by: Lyude Paul <lyude@xxxxxxxxxx> > Fixes: ad7f8a1f9ced ("drm/helper: add Displayport multi-stream helper (v0.6)") > Cc: Sean Paul <sean@xxxxxxxxxx> > Cc: <stable@xxxxxxxxxxxxxxx> # v3.17+ > --- > drivers/gpu/drm/drm_dp_mst_topology.c | 26 +++++++++++++++++++------- > 1 file changed, 19 insertions(+), 7 deletions(-) > > diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c b/drivers/gpu/drm/drm_dp_mst_topology.c > index 7aaf184a2e5f..f313407374ed 100644 > --- a/drivers/gpu/drm/drm_dp_mst_topology.c > +++ b/drivers/gpu/drm/drm_dp_mst_topology.c > @@ -4466,17 +4466,30 @@ static int drm_dp_dpcd_write_payload(struct drm_dp_mst_topology_mgr *mgr, > * @mgr: manager to use > * > * Tries waiting for the MST hub to finish updating it's payload table by > - * polling for the ACT handled bit. > + * polling for the ACT handled bit for up to 3 seconds (yes-some hubs really > + * take that long). > * > * Returns: > * 0 if the ACT was handled in time, negative error code on failure. > */ > int drm_dp_check_act_status(struct drm_dp_mst_topology_mgr *mgr) > { > - int count = 0, ret; > + /* > + * There doesn't seem to be any recommended retry count or timeout in > + * the MST specification. Since some hubs have been observed to take > + * over 1 second to update their payload allocations under certain > + * conditions, we use a rather large timeout value. > + */ > + const int timeout_ms = 3000; > + unsigned long timeout = jiffies + msecs_to_jiffies(timeout_ms); > + int ret; > + bool retrying = false; > u8 status; > > do { > + if (retrying) > + usleep_range(100, 1000); > + > ret = drm_dp_dpcd_readb(mgr->aux, > DP_PAYLOAD_TABLE_UPDATE_STATUS, > &status); > @@ -4488,13 +4501,12 @@ int drm_dp_check_act_status(struct drm_dp_mst_topology_mgr *mgr) > > if (status & DP_PAYLOAD_ACT_HANDLED) > break; > - count++; > - udelay(100); > - } while (count < 30); > + retrying = true; > + } while (jiffies < timeout); Somewhat academic, but I think there's an overflow possibility here if timeout is near ulong_max and jiffies overflows during the usleep. In that case we'll be retrying for a very loong time. I wish we had i915's wait_for() macro available to all drm... Sean > > if (!(status & DP_PAYLOAD_ACT_HANDLED)) { > - DRM_DEBUG_KMS("failed to get ACT bit %d after %d retries\n", > - status, count); > + DRM_DEBUG_KMS("failed to get ACT bit %d after %dms\n", > + status, timeout_ms); > return -EINVAL; > } > return 0; > -- > 2.25.1 >