On Thu, Jun 10, 2021 at 03:16:58PM +0530, Tejas Upadhyay wrote: > This reverts commit fb899dd8ea9c4ac5928b86946e6536790981adae. > > w/a on mentioned platforms not working as expected and causing > more harm on the RC6 flow. > > Fixes: fb899dd8ea9c ("drm/i915: Apply Wa_1406680159:icl,ehl as an engine workaround") > Cc: Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx> > Cc: Matt Roper <matthew.d.roper@xxxxxxxxx> > Signed-off-by: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@xxxxxxxxx> NAK. This patch does not do what it claims (it deletes the workaround completely rather than reverting the original patch) and also doesn't address the real issue here. As we've discussed before, this workaround is working properly. We've already confirmed that the register write does go through, the bit is set in hardware, and the relevant clock gating is disabled at the hardware level. The *only* thing that isn't working is the readback verification of the register bit, and this is a multicast steering issue rather than anything to do with this specific workaround (due to "luck" this just happens to be the only documented workaround that updates a register in the affected multicast region on this platform, but you'd see the same type of warnings if other workarounds in the future start touching the same general multicast region). When we write to a multicast register like this, all instances of the register that live behind the same MMIO address get updated. But when we read back those multicast registers, the value we get back always comes from one specific instance. For workarounds, all of the instances will have the same value, so it doesn't matter which instance we read back, as long as we don't read back from an instance that is fused off (if we do, we'll read back a 0). During driver init, we pick a register steering configuration that will make sure reads of multicase registers hit a non-fused-off instance and return a proper value. However on some platforms (including EHL/JSL), there's one more thing to consider. On these platforms, Render Power Gating (RPG) will modify the behavior when the hardware exits RC6 --- instead of all registers being powered back up and accessible when forcewake is grabbed, only a single slice/subslice will be powered up initially if there's no workload to be run (the hardware architects refer to this as the "minconfig"). This means that when reading from multicast registers while RPG is enabled, we need to not only steer toward a slice/subslice that isn't fused off, but we also need to steer toward the specific slice/subslice that is used as the minconfig (which I believe is always the lowest numbered one). At the point we apply and verify workarounds, RPG is not supposed to be active (i.e., we're not at the point of GT setup where we (re)enable it yet). However there seems to be some hardware misbehavior where in certain circumstances a previous write to the A210 control register to disable Render Power Gating did not take effect --- the A210 register value indicates that RPG should be off, but the hardware continues to operate as if it is on, and only minconfig instances of MCR registers get powered up to return valid values. We encounter this problem during certain reset or suspend/resume operations, and that's what causes the workaround warning message here. We're steering our read to a valid (non-fused-off) instance of the multicast register, but since it isn't the specific minconfig slice/subslice we still fail to read back the proper value. If we just want to fix the workaround warning message here, it should be a simple matter of making sure our default steering always targets the minconfig slice/subslice. I believe switching wa_init_mcr() from using fls() to ffs() would be sufficient --- we'd always pick the lowest possible slice/subslice in that case (i.e., the minconfig) instead of the highest. The real question to be answered is why the hardware continues to behave as if RPG is on, even after its been disabled via the POWERGATE_ENABLE register. This may be a hardware bug, or (more likely), there's some step that wasn't properly documented in the bspec so our driver isn't following it (the GT power management documentation in the bspec definitely isn't as clear and complete as we'd like it to be). Matt > --- > drivers/gpu/drm/i915/gt/intel_workarounds.c | 5 ----- > 1 file changed, 5 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c > index b62d1e31a645..fea7fde30d4a 100644 > --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c > +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c > @@ -1774,11 +1774,6 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) > wa_write_or(wal, UNSLICE_UNIT_LEVEL_CLKGATE2, > PSDUNIT_CLKGATE_DIS); > > - /* Wa_1406680159:icl,ehl */ > - wa_write_or(wal, > - SUBSLICE_UNIT_LEVEL_CLKGATE, > - GWUNIT_CLKGATE_DIS); > - > /* > * Wa_1408767742:icl[a2..forever],ehl[all] > * Wa_1605460711:icl[a0..c0] > -- > 2.31.1 > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@xxxxxxxxxxxxxxxxxxxxx > https://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Matt Roper Graphics Software Engineer VTT-OSGC Platform Enablement Intel Corporation (916) 356-2795 _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx