Quoting Tvrtko Ursulin (2019-01-24 17:55:20) > > On 21/01/2019 22:21, Chris Wilson wrote: > > Missed breadcrumb detection is defunct due to the tight coupling with > > How it is defunct.. oh because there is no irq_count any more... could > you describe the transition from irq_count to irq_fired and then to > nothing briefly? We don't have an independent intel_wait to distinguish irq completions from dma_fence_signals. Everytime we call dma_fence_signal() we think we saw an interrupt, and we complete fences very often before we see interrupts... And then our test completely fails to setup the machine to detect the missed breadcrumb as the requests get completed by anything but the missed breadcrumb timer. > > dma_fence signaling and the myriad ways we may signal fences from > > everywhere but from an interrupt, i.e. we frequently signal a fence > > before we even see its interrupt. This means that even if we miss an > > interrupt for a fence, it still is signaled before our breadcrumb > > hangcheck fires, so simplify the breadcrumb hangchecking by moving it > > into the GPU hangcheck and forgo fake interrupts. > > > > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > > --- > > drivers/gpu/drm/i915/i915_debugfs.c | 93 ----------- > > drivers/gpu/drm/i915/i915_gpu_error.c | 2 - > > drivers/gpu/drm/i915/i915_gpu_error.h | 5 - > > drivers/gpu/drm/i915/intel_breadcrumbs.c | 147 +----------------- > > drivers/gpu/drm/i915/intel_hangcheck.c | 2 + > > drivers/gpu/drm/i915/intel_ringbuffer.h | 5 - > > .../gpu/drm/i915/selftests/igt_live_test.c | 7 - > > 7 files changed, 5 insertions(+), 256 deletions(-) > > With this balance of insertions and deletions I cannot decide if this is > easy or hard to review. > > IGT uses intel_detect_and_clear_missed_interrupts a lot. What is the > plan there? They are defunct. They no longer detect anything useful from the previous patch, this just makes it official. igt has been prepped for the loss of the debugfs. Without this patch we get false positives from i915_missed_interrupt instead. I've tried and failed to replace the detection in an acceptable manner, without a separate irq completion tracker it seems hopeless. > > diff --git a/drivers/gpu/drm/i915/intel_hangcheck.c b/drivers/gpu/drm/i915/intel_hangcheck.c > > index 5662d6fed523..a219c796e56d 100644 > > --- a/drivers/gpu/drm/i915/intel_hangcheck.c > > +++ b/drivers/gpu/drm/i915/intel_hangcheck.c > > @@ -275,6 +275,8 @@ static void i915_hangcheck_elapsed(struct work_struct *work) > > for_each_engine(engine, dev_priv, id) { > > struct hangcheck hc; > > > > + intel_engine_signal_breadcrumbs(engine); > > + > > Sounds fine. So only downside is detecting missed interrupts gets > slower. But in practice they don't happen often? In practice, no missed interrupts. *touch wood* That was why fixing gen5-gen7 beforehand was so important. Having a signal here and in retire_work, means that in the worst case everything updates once a second. Enough for users to be able to complain. But more than likely every frame update will flush the earlier requests anyway, hence why we couldn't detect missed breadcrumbs in igt in the first place. -Chris _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx