On Fri, Mar 19, 2021 at 04:44:26PM -0400, Lyude Paul wrote: > > > > [...] > > > > I think it would work if we can make the retries configurable and set it > > > > to > > > > retries = total_timeout / platform_specific_timeout_per_retry > > > > > > > > where total_timeout would be something reasonable like 1 sec. > > > > > > I actually think I'm more open to the idea of configurable retries after > > > learning that apparently this is a thing that the i2c subsystem does - so > > > there's more precedence for it in the rest of the kernel than I originally > > > thought. > > > > > > I'm still curious if we need these extra retries in here though - there seems > > > to > > > be one set of retries that is actually platform specific, and then just a > > > random > > > set of 5 retries that don't seem to have anything to do with platform specific > > > behavior - so I think it'd still be worth giving a shot at getting rid of that > > > > The platform specific part of the timeout is the one desctibed in the > > maximum timeout values comments. > > You mean the > > /* Must try at least 3 times according to DP spec */ > for (try = 0; try < 5; try++) { > > bit? I thought that wasn't related to platform specific retries at all, since > the code in that loop seems to only reference parts of the DP spec, and that the > > while ((aux_clock_divider = intel_dp->get_aux_clock_divider(intel_dp, clock++))) { > > Loop was the portion that was platform specific, since it prompts the driver to > retry the transaction with different aux clock divider rates depending on the > platform in use. Feel free to correct me if I'm wrong though. Nope. I meant every HW transaction will have a platform specific timeout. For instance it's 1.6ms on SKL, but 4ms on ICL. So now since the overall retry count is 32 * 5 = 160, on SKL we'll retry for ~2.6 seconds, on ICL we'll retry for ~6.4 seconds (disregarding now the extra 400usec delay inserted by drm_dp_dpcd_access(), which adds a fixed ~1.3ms delay). This is what I think should be normalized, so that we have the same amount of overall maximum timeout period on all platforms. > Also - with the timeouts we're seeing, does the LTTPR return NAKs at all? That's > still another thing I had suggested alternate workarounds for so that we could > terminate transactions immediately on NAKs, so I wonder if that could save time > here as well. There's not much LTTPR specific in that wrt. what sinks would do normally (no NAKs for read, only for writes) except LTTPRs may rewrite NAKs to ACKs to account for buggy monitors returning NAKs when reading the 0xf0000 -> range. But I'd suggest not dealing with this aspect now, just sanitize the above retry thing, as you suggested, remove the i915 retry loop and make the drm retry loop configurable. (In any case I also had the idea to stop transactions early when HPD gets deasserted, but not sure if that's completely robust.) > > > > > Thanks > > > > > Khaled > > > > > > > > > > > > > > > > > > > Anyways, this seems about the only thing we can do given the > > > > > > > > limited > > > > > > > > hw capabilities. > > > > > > > > Reviewed-by: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx> > > > > > > > > > > > > > > > > > Accordingly disable LTTPR detection until GLK, where the > > > > > > > > > maximum timeout > > > > > > > > > we can set is only 1.6ms. > > > > > > > > > > > > > > > > > > Link training in the non-transparent mode is known to fail at > > > > > > > > > least on > > > > > > > > > some SKL systems with a WD19 dock on the link, which exposes an > > > > > > > > > LTTPR > > > > > > > > > (see the References below). While this could have different > > > > > > > > > reasons > > > > > > > > > besides the too short AUX timeout used, not detecting LTTPRs > > > > > > > > > (and so not > > > > > > > > > using the non-transparent LT mode) fixes link training on these > > > > > > > > > systems. > > > > > > > > > > > > > > > > > > While at it add a code comment about the platform specific > > > > > > > > > maximum > > > > > > > > > timeout values. > > > > > > > > > > > > > > > > > > v2: Add a comment about the g4x maximum timeout as well. > > > > > > > > > (Ville) > > > > > > > > > > > > > > > > > > Reported-by: Takashi Iwai <tiwai@xxxxxxx> > > > > > > > > > Reported-and-tested-by: Santiago Zarate < > > > > > > > > > santiago.zarate@xxxxxxxx> > > > > > > > > > Reported-and-tested-by: Bodo Graumann <mail@xxxxxxxxxxxxxxx> > > > > > > > > > References: > > > > > > > > > https://gitlab.freedesktop.org/drm/intel/-/issues/3166 > > > > > > > > > Fixes: b30edfd8d0b4 ("drm/i915: Switch to LTTPR non-transparent > > > > > > > > > mode link training") > > > > > > > > > Cc: <stable@xxxxxxxxxxxxxxx> # v5.11 > > > > > > > > > Cc: Takashi Iwai <tiwai@xxxxxxx> > > > > > > > > > Cc: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx> > > > > > > > > > Signed-off-by: Imre Deak <imre.deak@xxxxxxxxx> > > > > > > > > > --- > > > > > > > > > drivers/gpu/drm/i915/display/intel_dp_aux.c | 7 +++++++ > > > > > > > > > .../gpu/drm/i915/display/intel_dp_link_training.c | 15 > > > > > > > > > ++++++++++++--- > > > > > > > > > 2 files changed, 19 insertions(+), 3 deletions(-) > > > > > > > > > > > > > > > > > > diff --git a/drivers/gpu/drm/i915/display/intel_dp_aux.c > > > > > > > > > b/drivers/gpu/drm/i915/display/intel_dp_aux.c > > > > > > > > > index eaebf123310a..10fe17b7280d 100644 > > > > > > > > > --- a/drivers/gpu/drm/i915/display/intel_dp_aux.c > > > > > > > > > +++ b/drivers/gpu/drm/i915/display/intel_dp_aux.c > > > > > > > > > @@ -133,6 +133,7 @@ static u32 g4x_get_aux_send_ctl(struct > > > > > > > > > intel_dp *intel_dp, > > > > > > > > > else > > > > > > > > > precharge = 5; > > > > > > > > > > > > > > > > > > +/* Max timeout value on G4x-BDW: 1.6ms */ > > > > > > > > > if (IS_BROADWELL(dev_priv)) > > > > > > > > > timeout = DP_AUX_CH_CTL_TIME_OUT_600us; > > > > > > > > > else > > > > > > > > > @@ -159,6 +160,12 @@ static u32 skl_get_aux_send_ctl(struct > > > > > > > > > intel_dp *intel_dp, > > > > > > > > > enum phy phy = intel_port_to_phy(i915, dig_port- > > > > > > > > > > base.port); > > > > > > > > > u32 ret; > > > > > > > > > > > > > > > > > > +/* > > > > > > > > > + * Max timeout values: > > > > > > > > > + * SKL-GLK: 1.6ms > > > > > > > > > + * CNL: 3.2ms > > > > > > > > > + * ICL+: 4ms > > > > > > > > > + */ > > > > > > > > > ret = DP_AUX_CH_CTL_SEND_BUSY | > > > > > > > > > DP_AUX_CH_CTL_DONE | > > > > > > > > > DP_AUX_CH_CTL_INTERRUPT | > > > > > > > > > diff --git > > > > > > > > > a/drivers/gpu/drm/i915/display/intel_dp_link_training.c > > > > > > > > > b/drivers/gpu/drm/i915/display/intel_dp_link_training.c > > > > > > > > > index 19ba7c7cbaab..c0e25c75c105 100644 > > > > > > > > > --- a/drivers/gpu/drm/i915/display/intel_dp_link_training.c > > > > > > > > > +++ b/drivers/gpu/drm/i915/display/intel_dp_link_training.c > > > > > > > > > @@ -82,6 +82,18 @@ static void > > > > > > > > > intel_dp_read_lttpr_phy_caps(struct intel_dp *intel_dp, > > > > > > > > > > > > > > > > > > static bool intel_dp_read_lttpr_common_caps(struct intel_dp > > > > > > > > > *intel_dp) > > > > > > > > > { > > > > > > > > > +struct drm_i915_private *i915 = dp_to_i915(intel_dp); > > > > > > > > > + > > > > > > > > > +if (intel_dp_is_edp(intel_dp)) > > > > > > > > > +return false; > > > > > > > > > + > > > > > > > > > +/* > > > > > > > > > + * Detecting LTTPRs must be avoided on platforms with > > > > > > > > > an AUX timeout > > > > > > > > > + * period < 3.2ms. (see DP Standard v2.0, 2.11.2, > > > > > > > > > 3.6.6.1). > > > > > > > > > + */ > > > > > > > > > +if (INTEL_GEN(i915) < 10) > > > > > > > > > +return false; > > > > > > > > > + > > > > > > > > > if (drm_dp_read_lttpr_common_caps(&intel_dp->aux, > > > > > > > > > intel_dp- > > > > > > > > > > lttpr_common_caps) < 0) { > > > > > > > > > memset(intel_dp->lttpr_common_caps, 0, > > > > > > > > > @@ -127,9 +139,6 @@ int intel_dp_lttpr_init(struct intel_dp > > > > > > > > > *intel_dp) > > > > > > > > > bool ret; > > > > > > > > > int i; > > > > > > > > > > > > > > > > > > -if (intel_dp_is_edp(intel_dp)) > > > > > > > > > -return 0; > > > > > > > > > - > > > > > > > > > ret = intel_dp_read_lttpr_common_caps(intel_dp); > > > > > > > > > if (!ret) > > > > > > > > > return 0; > > > > > > > > > -- > > > > > > > > > 2.25.1 > > > > > > > > > > > > > > > > -- > > > > > > > > Ville Syrjälä > > > > > > > > Intel > > > > > > > > > > -- > > > Sincerely, > > > Lyude Paul (she/her) > > > Software Engineer at Red Hat > > > > > > Note: I deal with a lot of emails and have a lot of bugs on my plate. If > > > you've > > > asked me a question, are waiting for a review/merge on a patch, etc. and I > > > haven't responded in a while, please feel free to send me another email to > > > check > > > on my status. I don't bite! > > > > > > > -- > Sincerely, > Lyude Paul (she/her) > Software Engineer at Red Hat > > Note: I deal with a lot of emails and have a lot of bugs on my plate. If you've > asked me a question, are waiting for a review/merge on a patch, etc. and I > haven't responded in a while, please feel free to send me another email to check > on my status. I don't bite! >