Quoting Tvrtko Ursulin (2019-04-16 10:48:05) > > On 16/04/2019 09:10, Chris Wilson wrote: > > Sometimes the HW doesn't even play fair, and completely forgets about > > register writes. Skip verifying known troublemakers. > > > > References: https://bugs.freedesktop.org/show_bug.cgi?id=108954 > > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > > Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx> > > --- > > drivers/gpu/drm/i915/intel_workarounds.c | 40 ++++++++++++++----- > > .../gpu/drm/i915/intel_workarounds_types.h | 7 ++-- > > 2 files changed, 33 insertions(+), 14 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/intel_workarounds.c b/drivers/gpu/drm/i915/intel_workarounds.c > > index db99f2e676bb..ba58be05f58c 100644 > > --- a/drivers/gpu/drm/i915/intel_workarounds.c > > +++ b/drivers/gpu/drm/i915/intel_workarounds.c > > @@ -122,6 +122,7 @@ static void _wa_add(struct i915_wa_list *wal, const struct i915_wa *wa) > > wal->wa_count++; > > wa_->val |= wa->val; > > wa_->mask |= wa->mask; > > + wa_->read |= wa->read; > > return; > > } > > } > > @@ -146,9 +147,10 @@ wa_write_masked_or(struct i915_wa_list *wal, i915_reg_t reg, u32 mask, > > u32 val) > > { > > struct i915_wa wa = { > > - .reg = reg, > > + .reg = reg, > > .mask = mask, > > - .val = val > > + .val = val, > > + .read = mask, > > }; > > > > _wa_add(wal, &wa); > > @@ -172,6 +174,19 @@ wa_write_or(struct i915_wa_list *wal, i915_reg_t reg, u32 val) > > wa_write_masked_or(wal, reg, val, val); > > } > > > > +static void > > +ignore_wa_write_or(struct i915_wa_list *wal, i915_reg_t reg, u32 mask, u32 val) > > +{ > > + struct i915_wa wa = { > > + .reg = reg, > > + .mask = mask, > > + .val = val, > > + /* Bonkers HW, skip verifying */ > > + }; > > + > > + _wa_add(wal, &wa); > > +} > > + > > #define WA_SET_BIT_MASKED(addr, mask) \ > > wa_write_masked_or(wal, (addr), (mask), _MASKED_BIT_ENABLE(mask)) > > > > @@ -916,10 +931,11 @@ wal_get_fw_for_rmw(struct intel_uncore *uncore, const struct i915_wa_list *wal) > > static bool > > wa_verify(const struct i915_wa *wa, u32 cur, const char *name, const char *from) > > { > > - if ((cur ^ wa->val) & wa->mask) { > > + if ((cur ^ wa->val) & wa->read) { > > DRM_ERROR("%s workaround lost on %s! (%x=%x/%x, expected %x, mask=%x)\n", > > - name, from, i915_mmio_reg_offset(wa->reg), cur, > > - cur & wa->mask, wa->val, wa->mask); > > + name, from, i915_mmio_reg_offset(wa->reg), > > + cur, cur & wa->read, > > + wa->val, wa->mask); > > > > return false; > > } > > @@ -1122,9 +1138,10 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) > > _3D_CHICKEN3_AA_LINE_QUALITY_FIX_ENABLE); > > > > /* WaPipelineFlushCoherentLines:icl */ > > - wa_write_or(wal, > > - GEN8_L3SQCREG4, > > - GEN8_LQSC_FLUSH_COHERENT_LINES); > > + ignore_wa_write_or(wal, > > + GEN8_L3SQCREG4, > > + GEN8_LQSC_FLUSH_COHERENT_LINES, > > + GEN8_LQSC_FLUSH_COHERENT_LINES); > > > > /* > > * Wa_1405543622:icl > > @@ -1151,9 +1168,10 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) > > * Wa_1405733216:icl > > * Formerly known as WaDisableCleanEvicts > > */ > > - wa_write_or(wal, > > - GEN8_L3SQCREG4, > > - GEN11_LQSC_CLEAN_EVICT_DISABLE); > > + ignore_wa_write_or(wal, > > + GEN8_L3SQCREG4, > > + GEN11_LQSC_CLEAN_EVICT_DISABLE, > > + GEN11_LQSC_CLEAN_EVICT_DISABLE); > > > > /* WaForwardProgressSoftReset:icl */ > > wa_write_or(wal, > > diff --git a/drivers/gpu/drm/i915/intel_workarounds_types.h b/drivers/gpu/drm/i915/intel_workarounds_types.h > > index 30918da180ff..42ac1fb99572 100644 > > --- a/drivers/gpu/drm/i915/intel_workarounds_types.h > > +++ b/drivers/gpu/drm/i915/intel_workarounds_types.h > > @@ -12,9 +12,10 @@ > > #include "i915_reg.h" > > > > struct i915_wa { > > - i915_reg_t reg; > > - u32 mask; > > - u32 val; > > + i915_reg_t reg; > > + u32 mask; > > + u32 val; > > + u32 read; > > }; > > > > struct i915_wa_list { > > > > The sporadic nature of failures worries me here. What is better: > > a) Stop verifying from the driver and potentially lose a workaround > and/or never figure out what is actually going on. > > b) Have CI buglog track this as a known issue "forever"? Can't our > automated tooling handle this? Either way, it's NOTOURBUG. If it's not clear, my intent is that this is recognised as a HW issue. The problem with letting it reside in the test suite is that it is hard for humans to discern an expected failure from a new failure (and if it hard for us to tell at a glance, I fully expect a generic over-reaching catch all filter to be applied in cibuglog), so I anticipate that this test becomes useless if left to cibuglog -- plus it will remain forever in the issues.html. Hence marking up known failures, with one hopes an HSDES. -Chris _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx