On Mon, Apr 6, 2020 at 6:13 PM Lyude Paul <lyude@xxxxxxxxxx> wrote: > > Currently we only poll for an ACT up to 30 times, with a busy-wait delay > of 100µs between each attempt - giving us a timeout of 2900µs. While > this might seem sensible, it would appear that in certain scenarios it > can take dramatically longer then that for us to receive an ACT. On one > of the EVGA MST hubs that I have available, I observed said hub > sometimes taking longer then a second before signalling the ACT. These > delays mostly seem to occur when previous sideband messages we've sent > are NAKd by the hub, however it wouldn't be particularly surprising if > it's possible to reproduce times like this simply by introducing branch > devices with large LCTs since payload allocations have to take effect on > every downstream device up to the payload's target. > > So, instead of just retrying 30 times we poll for the ACT for up to 3ms, > and additionally use usleep_range() to avoid a very long and rude > busy-wait. Note that the previous retry count of 30 appears to have been > arbitrarily chosen, as I can't find any mention of a recommended timeout > or retry count for ACTs in the DisplayPort 2.0 specification. This also > goes for the range we were previously using for udelay(), although I > suspect that was just copied from the recommended delay for link > training on SST devices. > > Changes since v1: > * Use readx_poll_timeout() instead of open-coding timeout loop - Sean > Paul > > Signed-off-by: Lyude Paul <lyude@xxxxxxxxxx> > Fixes: ad7f8a1f9ced ("drm/helper: add Displayport multi-stream helper (v0.6)") > Cc: Sean Paul <sean@xxxxxxxxxx> > Cc: <stable@xxxxxxxxxxxxxxx> # v3.17+ > --- > drivers/gpu/drm/drm_dp_mst_topology.c | 57 ++++++++++++++++----------- > 1 file changed, 34 insertions(+), 23 deletions(-) > > diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c b/drivers/gpu/drm/drm_dp_mst_topology.c > index c83adbdfc1cd..ce61964baa7c 100644 > --- a/drivers/gpu/drm/drm_dp_mst_topology.c > +++ b/drivers/gpu/drm/drm_dp_mst_topology.c > @@ -27,6 +27,7 @@ > #include <linux/kernel.h> > #include <linux/sched.h> > #include <linux/seq_file.h> > +#include <linux/iopoll.h> > > #if IS_ENABLED(CONFIG_DRM_DEBUG_DP_MST_TOPOLOGY_REFS) > #include <linux/stacktrace.h> > @@ -4460,43 +4461,53 @@ static int drm_dp_dpcd_write_payload(struct drm_dp_mst_topology_mgr *mgr, > return ret; > } > > +static int do_get_act_status(struct drm_dp_aux *aux) > +{ > + int ret; > + u8 status; > + > + ret = drm_dp_dpcd_readb(aux, DP_PAYLOAD_TABLE_UPDATE_STATUS, &status); > + if (ret < 0) > + return ret; > + > + return status; > +} > > /** > * drm_dp_check_act_status() - Polls for ACT handled status. > * @mgr: manager to use > * > * Tries waiting for the MST hub to finish updating it's payload table by > - * polling for the ACT handled bit. > + * polling for the ACT handled bit for up to 3 seconds (yes-some hubs really > + * take that long). > * > * Returns: > * 0 if the ACT was handled in time, negative error code on failure. > */ > int drm_dp_check_act_status(struct drm_dp_mst_topology_mgr *mgr) > { > - int count = 0, ret; > - u8 status; > - > - do { > - ret = drm_dp_dpcd_readb(mgr->aux, > - DP_PAYLOAD_TABLE_UPDATE_STATUS, > - &status); > - if (ret < 0) { > - DRM_DEBUG_KMS("failed to read payload table status %d\n", > - ret); > - return ret; > - } > - > - if (status & DP_PAYLOAD_ACT_HANDLED) > - break; > - count++; > - udelay(100); > - } while (count < 30); > - > - if (!(status & DP_PAYLOAD_ACT_HANDLED)) { > - DRM_DEBUG_KMS("failed to get ACT bit %d after %d retries\n", > - status, count); > + /* > + * There doesn't seem to be any recommended retry count or timeout in > + * the MST specification. Since some hubs have been observed to take > + * over 1 second to update their payload allocations under certain > + * conditions, we use a rather large timeout value. > + */ > + const int timeout_ms = 3000; > + int ret, status; > + > + ret = readx_poll_timeout(do_get_act_status, mgr->aux, status, > + status & DP_PAYLOAD_ACT_HANDLED || status < 0, > + 100, timeout_ms * USEC_PER_MSEC); In v1 the usleep range was 100 -> 1000, in v2 it's going to be 51 -> 100. Perhaps bump this up to 200? > + if (ret < 0 && status >= 0) { > + DRM_DEBUG_KMS("Failed to get ACT bit %d after %dms\n", > + status, timeout_ms); I still think status should be base 16 when printed With those nits addressed, Reviewed-by: Sean Paul <sean@xxxxxxxxxx> > return -EINVAL; > + } else if (status < 0) { > + DRM_DEBUG_KMS("Failed to read payload table status: %d\n", > + status); > + return status; > } > + > return 0; > } > EXPORT_SYMBOL(drm_dp_check_act_status); > -- > 2.25.1 >