Quoting Daniele Ceraolo Spurio (2019-09-17 20:45:02) > > > On 9/17/2019 11:57 AM, Chris Wilson wrote: > > Quoting Daniele Ceraolo Spurio (2019-09-17 19:36:35) > >> > >> On 9/17/2019 12:57 AM, Chris Wilson wrote: > >>> Quoting Daniele Ceraolo Spurio (2019-09-16 22:41:04) > >>>> Our assumption that the we can ask the HW to lock the SFC even if not > >>>> currently in use does not match the HW commitment. The expectation from > >>>> the HW is that SW will not try to lock the SFC if the engine is not > >>>> using it and if we do that the behavior is undefined; on ICL the HW > >>>> ends up to returning the ack and ignoring our lock request, but this is > >>>> not guaranteed and we shouldn't expect it going forward. > >>>> > >>>> Reported-by: Owen Zhang <owen.zhang@xxxxxxxxx> > >>>> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@xxxxxxxxx> > >>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx> > >>>> --- > >>>> @@ -366,10 +368,13 @@ static u32 gen11_lock_sfc(struct intel_engine_cs *engine) > >>>> sfc_forced_lock_ack_bit, > >>>> sfc_forced_lock_ack_bit, > >>>> 1000, 0, NULL)) { > >>>> - DRM_DEBUG_DRIVER("Wait for SFC forced lock ack failed\n"); > >>>> + /* did we race the unlock? */ > >>>> + if (intel_uncore_read_fw(uncore, sfc_usage) & sfc_usage_bit) > >>>> + DRM_ERROR("Wait for SFC forced lock ack failed\n"); > >>> What's our plan if this *ERROR* is ever triggered? > >>> > >>> If it remains do nothing and check the logs on death, then it remains > >>> just a debug splat. If there is a plan to actually do something to > >>> handle the error, do it! > >>> -Chris > >> AFAIU the only thing we can do is escalate to full gpu reset. However, > >> the probability of this failing should be next to non-existent (only one > >> engine can use the SFC at any time so there is no lock contention), so > >> I'm not convinced the fallback is worth the effort. The error is still > >> useful IMO to catch unexpected behavior on new platforms, as it happened > >> in this case with the media team reporting seeing this message on gen12 > >> with the previous behavior. This said, I'm happy to add the extra logic > >> if you believe it is worth it. > > We've see this message on every icl run! > > -Chris > > I've never noticed it, which tests are hitting it? My understanding from > what the HW team said is that on ICL the ack will always come back (even > if it is not part of the "official" SW/HW interface) and the HW tweak > that stops that is a gen12 change. Something else might be wrong is this > is firing off in our ICL CI, also because I don't think we have any test > case that actually uses the SFC, do we? https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6911/fi-icl-u2/igt@i915_selftest@live_hangcheck.html All icl, live_hangcheck or live_reset, for as long as I can remember. -Chris _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx