On Wed, Jan 20, 2016 at 11:49:51AM +0200, Mika Kuoppala wrote: > Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> writes: > > > On Tue, Jan 19, 2016 at 09:50:08PM +0200, Mika Kuoppala wrote: > >> Sometimes we get dmesg warnings claiming that DC6 was already > >> enabled prior to our enabling. Investigations using readback of > >> the written dc state value indicate that sometimes when we disable > >> the dc6, the write doesn't stick, or it does but for a very short time. > >> > >> This leads to state keeping confusion between firmware and driver, > >> driver thinking that dc6 is off and proceeds with modesets > >> while firmware still keeps/thinks dc6 is on. Evidence from logs > >> suggests that the dc6 is still on as flips start to timeout, > >> leading to eventual gpu hang. > >> > >> Further complication is that to recover the hang, we reset while > >> the DC6 is enabled. With this state on the reinit of the gpu side > >> won't come out clean and rings rehang right after the init. > >> > >> Harden the detection of this situation and immediately disable > >> DC6 if we disagree on state with the firmware. This should make it > >> less probable for driver to do end up in a wrong conclusion and > >> let things like modeset and gpu reset proceed with dc6 still > >> enabled. > >> > >> References: https://bugs.freedesktop.org/show_bug.cgi?id=93768 > >> Suggested-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > >> Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > >> Cc: Patrik Jakobsson <patrik.jakobsson@xxxxxxxxxxxxxxx> > >> Cc: Imre Deak <imre.deak@xxxxxxxxx> > >> Cc: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx> > >> Signed-off-by: Mika Kuoppala <mika.kuoppala@xxxxxxxxx> > >> --- > >> drivers/gpu/drm/i915/intel_runtime_pm.c | 31 +++++++++++++++++++++++++++---- > >> 1 file changed, 27 insertions(+), 4 deletions(-) > >> > >> diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c > >> index bbca527184d0..5a21453e38e1 100644 > >> --- a/drivers/gpu/drm/i915/intel_runtime_pm.c > >> +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c > >> @@ -470,9 +470,23 @@ static void gen9_set_dc_state_debugmask_memory_up( > >> } > >> } > >> > >> +static int gen9_dc_state_verified(struct drm_i915_private *dev_priv, > >> + u32 mask, u32 state) > >> +{ > >> + int i; > >> + > >> + /* Observe the dc state with handful of reads */ > >> + for (i = 0; i < 3; i++) { > >> + if (wait_for((I915_READ(DC_STATE_EN) & mask) == state, 100)) > >> + return false; > >> + } > > > > Why two loops of reads? > > > > I once saw the write stick but for a very short time, so with one wait_for we > might miss it. > > > Since this is equivalent to > > return wait_for((I915_READ(DC_STATE_EN) & mask) == state, 300) == 0; > > > > In this case > > > > WRITE(DC_STATE_EN, val & ~mask | state); > > if (wait_for((READ(DC_STATE_END) & mask) = state, 300) { > > /* oh noes */ > > } > > > > is more obvious (since we verify with a read of the same register, not > > some magic ack sequence). > > I could open code the 2 checks here, would be more obvious yes. > > Btw what led you to think that the state doesn't stick? Looking > at the code for correctness and then deducting that if that warn > happens, there is no other explanation? Yes. It has also been used in the past as a form of acknowledging state changes, though usually it is a separate bit (i.e. a request bit and a current state bit). -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx