On Tue, Jun 26, 2012 at 12:52 AM, Chris Wilson <chris at chris-wilson.co.uk> wrote:
> On Mon, 25 Jun 2012 23:48:01 +0200, Daniel Vetter <daniel at ffwll.ch> wrote:
>> So essentially I still fail to see the upside of your proposed ductape
>> ... In either case I guess a walk to the reset button is inevitable
>> every once in a while ;-)
>
> A false positive for declaring a GPU wedged in a situation that should
> have never occurred in the first place is a recoverable and minor
> inconvenience compared to locking the display and possibly the system up.
>
> An alternative is to incorporate the deadlock detection into
> i915_mutex_lock_interruptible() and make it report -EIO if it waits
> longer than 10s, f.e., for the reset to complete. Then the only danger
> are the few paths that do not perform the error checking lock.

I kinda like this idea - all unconditional mutex_lockers would deadlock
in the same way as i915_reset, but if we've managed to sprinkle our
special reset-aware trylock code in all the right places, at least
userspace should get the -EIO eventually and do something sensible.

I guess if someone is indeed hogging dev->struct_mutex somehow (which
/should/ be the only thing preventing i915_reset from doing its job)
there's not much userspace could actually do - it would inevitably die
on the next gtt pagefault. But I guess we can eke out a bit more
survivability in corner cases.

I'll see what this looks like in actual code tomorrow.

Thanks, Daniel
-- 
Daniel Vetter
daniel.vetter at ffwll.ch - +41 (0) 79 364 57 48 - http://blog.ffwll.ch
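
For illustration only, the "give up and report -EIO after a timeout" idea
discussed above can be sketched in userspace with POSIX threads. This is an
analogy, not i915 code: lock_or_eio() and LOCK_TIMEOUT_MS are hypothetical
names, and the real i915_mutex_lock_interruptible() would also have to handle
signals and the reset state, which this sketch ignores.

```c
#define _POSIX_C_SOURCE 200809L /* for clock_gettime() and pthread_mutex_timedlock() */

#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical default, matching the 10s figure floated in the thread. */
#define LOCK_TIMEOUT_MS 10000L

/*
 * Userspace analogy, not kernel code: try to take `lock`, giving up after
 * `timeout_ms` milliseconds. Returns 0 with the lock held, or -EIO if the
 * lock could not be taken in time (or on any other locking error).
 */
int lock_or_eio(pthread_mutex_t *lock, long timeout_ms)
{
	struct timespec deadline;

	/* pthread_mutex_timedlock() expects an absolute CLOCK_REALTIME deadline. */
	if (clock_gettime(CLOCK_REALTIME, &deadline) != 0)
		return -EIO;

	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		/* Carry nanosecond overflow into the seconds field. */
		deadline.tv_sec += 1;
		deadline.tv_nsec -= 1000000000L;
	}

	/* 0 means the lock is held; ETIMEDOUT or any other error maps to -EIO. */
	if (pthread_mutex_timedlock(lock, &deadline) != 0)
		return -EIO;

	return 0;
}
```

A caller would use lock_or_eio(&some_mutex, LOCK_TIMEOUT_MS) in place of an
unconditional lock and propagate -EIO upwards. As noted in the mail, any path
that still takes the lock unconditionally gets no benefit from this.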