On Fri, 21 Oct 2016, Imre Deak <imre.deak@xxxxxxxxx> wrote: > On Fri, 2016-10-21 at 00:40 +0300, Imre Deak wrote: >> On Thu, 2016-10-20 at 23:50 +0300, Jani Nikula wrote: >> > On Thu, 20 Oct 2016, Imre Deak <imre.deak@xxxxxxxxx> wrote: >> > > On Thu, 2016-10-20 at 22:24 +0300, Jani Nikula wrote: >> > > > On Thu, 20 Oct 2016, Jani Nikula <jani.nikula@xxxxxxxxx> wrote: >> > > > > On Thu, 20 Oct 2016, Imre Deak <imre.deak@xxxxxxxxx> wrote: >> > > > > > On my APL the LSPCON firmware resumes in PCON mode as >> > > > > > opposed to the >> > > > > > expected LS mode. It also appears to be in a state where >> > > > > > AUX DPCD reads >> > > > > > will succeed but return garbage recovering only after a few >> > > > > > hundreds of >> > > > > > milliseconds. After the recovery time DPCD reads will >> > > > > > result in the >> > > > > > correct values and things will continue to work. If I2C >> > > > > > over AUX is >> > > > > > attempted during this recovery time (implying an AUX write >> > > > > > transaction) >> > > > > > the firmware won't recover and will stay in this broken >> > > > > > state. >> > > > > > >> > > > > > As a workaround check if the firmware is in PCON state >> > > > > > after resume and >> > > > > > if so wait until the correct DPCD values are returned. For >> > > > > > this we >> > > > > > compare the branch descriptor with the one we cached during >> > > > > > init time. >> > > > > > If the firmware was in the LS state, we skip the w/a and >> > > > > > continue as >> > > > > > before. >> > > > > > >> > > > > > Cc: Shashank Sharma <shashank.sharma@xxxxxxxxx> >> > > > > > Cc: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx> >> > > > > > Cc: Jani Nikula <jani.nikula@xxxxxxxxx> >> > > > > > Signed-off-by: Imre Deak <imre.deak@xxxxxxxxx> >> > > > > > --- >> > > > > > drivers/gpu/drm/i915/intel_dp.c | 2 +- >> > > > > > drivers/gpu/drm/i915/intel_drv.h | 6 ++++- >> > > > > > drivers/gpu/drm/i915/intel_lspcon.c | 52 >> > > > > > ++++++++++++++++++++++++++++++------- >> > > > > > 3 files changed, 48 insertions(+), 12 deletions(-) >> > > > > > >> > > > > > diff --git a/drivers/gpu/drm/i915/intel_dp.c >> > > > > > b/drivers/gpu/drm/i915/intel_dp.c >> > > > > > index e90211e..ec031db 100644 >> > > > > > --- a/drivers/gpu/drm/i915/intel_dp.c >> > > > > > +++ b/drivers/gpu/drm/i915/intel_dp.c >> > > > > > @@ -3487,7 +3487,7 @@ intel_dp_link_down(struct intel_dp >> > > > > > *intel_dp) >> > > > > > intel_dp->DP = DP; >> > > > > > } >> > > > > > >> > > > > > -static bool >> > > > > > +bool >> > > > > > intel_dp_read_dpcd(struct intel_dp *intel_dp) >> > > > > > { >> > > > > > if (drm_dp_dpcd_read(&intel_dp->aux, 0x000, >> > > > > > intel_dp->dpcd, >> > > > > > diff --git a/drivers/gpu/drm/i915/intel_drv.h >> > > > > > b/drivers/gpu/drm/i915/intel_drv.h >> > > > > > index a35e241..9a2366e 100644 >> > > > > > --- a/drivers/gpu/drm/i915/intel_drv.h >> > > > > > +++ b/drivers/gpu/drm/i915/intel_drv.h >> > > > > > @@ -972,7 +972,9 @@ struct intel_dp { >> > > > > > struct intel_lspcon { >> > > > > > bool active; >> > > > > > enum drm_lspcon_mode mode; >> > > > > > - struct drm_dp_aux *aux; >> > > > > > + struct intel_dp *intel_dp; >> > > > > > + bool desc_valid; >> > > > > > + struct intel_dp_desc desc; >> > > > > >> > > > > I guess we could cache the desc in intel_dp directly. >> > > > > Independent of >> > > > > this patch. >> > > > > >> > > > > Also, I'm wondering if we could stick with just aux here, and >> > > > > read >> > > > > something else from dpcd instead. >> > > > > >> > > > > > }; >> > > > > > >> > > > > > struct intel_digital_port { >> > > > > > @@ -1469,6 +1471,8 @@ static inline unsigned int >> > > > > > intel_dp_unused_lane_mask(int lane_count) >> > > > > > } >> > > > > > >> > > > > > bool >> > > > > > +intel_dp_read_dpcd(struct intel_dp *intel_dp); >> > > > > > +bool >> > > > > > intel_dp_read_desc(struct intel_dp *intel_dp, struct >> > > > > > intel_dp_desc *desc); >> > > > > > void >> > > > > > intel_dp_print_desc(struct intel_dp *intel_dp, struct >> > > > > > intel_dp_desc *desc); >> > > > > > diff --git a/drivers/gpu/drm/i915/intel_lspcon.c >> > > > > > b/drivers/gpu/drm/i915/intel_lspcon.c >> > > > > > index d2c8cb2..54c6173 100644 >> > > > > > --- a/drivers/gpu/drm/i915/intel_lspcon.c >> > > > > > +++ b/drivers/gpu/drm/i915/intel_lspcon.c >> > > > > > @@ -30,7 +30,7 @@ >> > > > > > static enum drm_lspcon_mode lspcon_get_current_mode(struct >> > > > > > intel_lspcon *lspcon) >> > > > > > { >> > > > > > enum drm_lspcon_mode current_mode = >> > > > > > DRM_LSPCON_MODE_INVALID; >> > > > > > - struct i2c_adapter *adapter = &lspcon->aux->ddc; >> > > > > > + struct i2c_adapter *adapter = &lspcon->intel_dp- >> > > > > > >aux.ddc; >> > > > > > >> > > > > > if (drm_lspcon_get_mode(adapter, ¤t_mode)) >> > > > > > DRM_ERROR("Error reading LSPCON mode\n"); >> > > > > > @@ -45,7 +45,7 @@ static int lspcon_change_mode(struct >> > > > > > intel_lspcon *lspcon, >> > > > > > { >> > > > > > int err; >> > > > > > enum drm_lspcon_mode current_mode; >> > > > > > - struct i2c_adapter *adapter = &lspcon->aux->ddc; >> > > > > > + struct i2c_adapter *adapter = &lspcon->intel_dp- >> > > > > > >aux.ddc; >> > > > > > >> > > > > > err = drm_lspcon_get_mode(adapter, ¤t_mode); >> > > > > > if (err) { >> > > > > > @@ -72,7 +72,7 @@ static int lspcon_change_mode(struct >> > > > > > intel_lspcon *lspcon, >> > > > > > static bool lspcon_probe(struct intel_lspcon *lspcon) >> > > > > > { >> > > > > > enum drm_dp_dual_mode_type adaptor_type; >> > > > > > - struct i2c_adapter *adapter = &lspcon->aux->ddc; >> > > > > > + struct i2c_adapter *adapter = &lspcon->intel_dp- >> > > > > > >aux.ddc; >> > > > > > >> > > > > > /* Lets probe the adaptor and check its type */ >> > > > > > adaptor_type = drm_dp_dual_mode_detect(adapter); >> > > > > > @@ -89,8 +89,42 @@ static bool lspcon_probe(struct >> > > > > > intel_lspcon *lspcon) >> > > > > > return true; >> > > > > > } >> > > > > > >> > > > > > +static void lspcon_resume_in_pcon_wa(struct intel_lspcon >> > > > > > *lspcon) >> > > > > > +{ >> > > > > > + unsigned long start = jiffies; >> > > > > > + >> > > > > > + if (!lspcon->desc_valid) >> > > > > > + return; >> > > > > > + >> > > > > > + while (1) { >> > > > > > + struct intel_dp_desc desc; >> > > > > > + >> > > > > > + /* >> > > > > > + * The w/a only applies in PCON mode and >> > > > > > we don't expect any >> > > > > > + * AUX errors. >> > > > > > + */ >> > > > > > + if (!intel_dp_read_desc(lspcon->intel_dp, >> > > > > > &desc)) >> > > > > > + return; >> > > > > > + >> > > > > > + if (!memcmp(&lspcon->desc, &desc, >> > > > > > sizeof(desc))) { >> > > > > > + DRM_DEBUG_KMS("LSPCON recovering >> > > > > > in PCON mode after %u ms\n", >> > > > > > + jiffies_to_msecs(jif >> > > > > > fies - start)); >> > > > > > + return; >> > > > > > + } >> > > > > > + >> > > > > > + if (time_after(jiffies, start + >> > > > > > msecs_to_jiffies(1000))) >> > > > > > + break; >> > > > > > + >> > > > > > + msleep(10); >> > > > > > + } >> > > > > > + >> > > > > > + DRM_DEBUG_KMS("LSPCON DP descriptor mismatch after >> > > > > > resume\n"); >> > > > > > +} >> > > > > >> > > > > I think we need to go through the LSPCON spec once more >> > > > > before we merge >> > > > > this. Maybe we don't reset the LSPCON properly. Maybe we >> > > > > don't handle >> > > > > the resume properly. Maybe this isn't a "workaround" after >> > > > > all. >> > > > >> > > > Looks like this stuff is normal behaviour if the LSPCON has >> > > > been in low >> > > > power state, and we need to wake it up. In PCON mode, this is >> > > > done using >> > > > DP_SET_POWER. And while waking up, it'll issue AUX DEFER, as we >> > > > can see >> > > > from the logs. >> > > >> > > I did think about that, but waking the device via >> > > DP_SET_POWER(D0) >> > > didn't help. Doing any AUX write transactions actually gets the >> > > device >> > > into a stuck state for good (until next reboot on APL). >> > > >> > > More over, we don't get AUX DEFERs the AUX transactions succeed, >> > > it's >> > > only that reads will return corrupted values for the given time >> > > period. >> > >> > The bug at hand has native defers >> > https://bugs.freedesktop.org/show_bug.cgi?id=98353 >> >> For I2C-over-AUX I also get defers, but not for DPCD transactions for >> native AUX registers for instance DP_SET_POWER or the OUI, device-ID >> registers. I also tried the LS mode wake-up sequence via >> I2C-over-AUX (offset 0x20), but didn't help either. >> >> The bug above could be a different issue, although I did see this >> particular scenario getting fixed on the same machine see >> /archive/results/CI_IGT_test/Patchwork_2778/ > > I tried this now on the fi-skl-6700hq machine in the bug report and it > seems to be the same failure mode as the APL one. There is one > difference in that on the SKL machine once the firmware is in the stuck > state after suspend/resume, a warm reboot may not recover it, so during > the next boot the LSPCON probe will fail. I guess that during warm > reboot LSPCON may continue to be powered, so to fully reset it a power > cycle is needed. In any case this patch fixes resume on that machine > too, so I couldn't reproduce the problem with it even across multiple > reboots, whereas without it it's 100% reproducible. A quick thought, when we do module unload or suspend, perhaps we should put the LSPCON to the state in which we expect to find it on probe/resume? In case it doesn't get reset. BR, Jani. > > --Imre -- Jani Nikula, Intel Open Source Technology Center _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx