Re: [PATCH v2 1/3] drm/i915: Drop racy markup of missed-irqs from idle-worker

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 21/07/16 12:04, Chris Wilson wrote:
On Thu, Jul 21, 2016 at 11:28:02AM +0100, Tvrtko Ursulin wrote:
On 21/07/16 11:10, Chris Wilson wrote:
On Thu, Jul 21, 2016 at 10:58:05AM +0100, Tvrtko Ursulin wrote:

On 21/07/16 07:57, Chris Wilson wrote:
During the idle-worker we disable the hangcheck and so kick any waiters
that should have been completed (since the GPU is now idle). Unlike the
hangcheck, we do not take any care to avoid the race between the irq
handler and ourselves, and so it is possible for us to declare a missed
interrupt even as the bottom-half is being scheduled to run. Let's
ignore this race to stop a potential false-positive error.

If the bottom half is scheduled to run then then..

References: https://bugs.freedesktop.org/show_bug.cgi?id=96974
Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Cc: Joonas Lahtinen <joonas.lahtinen@xxxxxxxxxxxxxxx>
Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx>
---
  drivers/gpu/drm/i915/i915_gem.c | 7 +++----
  1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 40047eb48826..9e826585edb2 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2706,10 +2706,9 @@ i915_gem_idle_work_handler(struct work_struct *work)
  	rearm_hangcheck = false;

  	stuck_engines = intel_kick_waiters(dev_priv);

... this will not return a stucked engine since the there is a bh
task assigned all until the bh exits.

It reports if it wakes up a waiter on any engine. If the bh is already
running, we cannot know if it has missed the seqno update. If it isn't
running yet, we cannot know if it is about to be run.

Oh I read the logic as completely opposite than what it is.

Since the idle worker runs 100ms after last retirement, that would
mean a really slow waiter or what?

It is dubious. But the idle worker runs 100ms after the first time we
detect all engines are idle and may be running as we detect all engines
are idle again. The only thing for sure is that in some cases that bdw-u
is reaching the idle-worker with an unwoken engine (and that there is
a race here in declaring it as a missed interrupt). I wasn't that
concerned about the race because of that 100ms delay where eveything
should have been idle, but on reflection that 100ms is not guarranteed.

Would canceling the idle worker be to expensive?

Either way, looks OK to me.

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>

Regards,

Tvrtko



_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx




[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]
  Powered by Linux