Re: [PATCH 2/2] tests/gem_eio: Resilience against "hanging too fast"

On Thu, Nov 26, 2015 at 12:34:35PM +0100, Daniel Vetter wrote:
> Since $debugfs/i915_wedged restores a wedged gpu by using a normal gpu
> hang we need to be careful to not run into the "hanging too fast
> check":
> 
> - don't restore the ban period, but instead keep it at 0.
> - make sure we idle the gpu fully before hanging it again (wait
>   subtest missed that).
> 
> With this gem_eio now works reliably even when I don't run the
> subtests individually.
> 
> Of course it's a bit fishy that the default ctx gets blamed for
> essentially doing nothing, but until that's figured out in upstream
> it's better to make the test work for now.

This used to be reliable. And just disabling all banning in the kernel
forever more is silly.

During igt_post_hang_ring:
1. we wait upon the hanging batch
 - this returns when hangcheck fires
2. we reset the ban period to normal
 - this takes mutex_lock_interruptible and so must wait for the reset
   handler to run before it can make the change
 - ergo the hanging batch never triggers a ban for itself
 - (a subsequent non-simulated GPU hang may still trigger the ban)
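
To make the ordering above concrete, here is a toy model of it. This is only
a sketch and not the i915 or igt code: Context, reset_handler, simulated_hang
and the NORMAL_BAN_PERIOD value are hypothetical names and numbers picked for
illustration, and the locking is modelled simply by running the steps in the
order the mutex would enforce.

```python
from dataclasses import dataclass
from typing import Optional

NORMAL_BAN_PERIOD = 12  # seconds; illustrative value, an assumption


@dataclass
class Context:
    ban_period: int = NORMAL_BAN_PERIOD
    last_hang: Optional[int] = None  # time of the previous hang blamed on us
    banned: bool = False


def reset_handler(ctx: Context, now: int) -> None:
    """Blame ctx for a hang at `now`, using the ban period in force right now."""
    if ctx.ban_period and ctx.last_hang is not None:
        if now - ctx.last_hang <= ctx.ban_period:
            ctx.banned = True
    ctx.last_hang = now


def simulated_hang(ctx: Context, now: int) -> None:
    """One simulated hang, in the order described in the list above."""
    ctx.ban_period = 0                  # test disables banning, then hangs
    reset_handler(ctx, now)             # reset runs first, ban period still 0
    ctx.ban_period = NORMAL_BAN_PERIOD  # restore is serialised behind the reset
```

In this model, calling simulated_hang() twice in quick succession leaves
ctx.banned false, because each reset is judged with a ban period of 0. Only a
later call to reset_handler() with the normal period restored (standing in
for a non-simulated hang) sets ctx.banned.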

Nak.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx



