On Fri, Jun 19, 2015 at 05:30:45PM +0100, Chris Wilson wrote:
> On Thu, Jun 18, 2015 at 04:58:06PM +0200, Daniel Vetter wrote:
> > On Thu, Jun 18, 2015 at 12:42:55PM +0100, Chris Wilson wrote:
> > > I understand the merit in trying the reset a few times before giving up,
> > > it would just need a bit of restructuring to try the reset before
> > > clearing gem state (trivial) and requeueing the hangcheck. I am just
> > > wary of feature creep before we get stuck into TDR, which promises to
> > > change how we think about resets entirely.
> >
> > My maintainer concern here is always that we should err on the side of not
> > killing the machine. If the reset failed, or if the gpu reinit failed then
> > marking the gpu as wedged has historically been the safe option. The
> > system will still run, display mostly works and there's a reasonable
> > chance you can gather debug data.
>
> One thing to bear in mind here is that with this particular "don't
> reset if not ready" logic, repeating the attempt at reset after another
> hangcheck is equivalent to just using a slower hangcheck (more or less,
> a couple of writes to one register difference). So it is no more likely
> to hang the machine than the original GPU hang.
>
> We can differentiate the cases here, between say EBUSY, ENODEV, and EIO,
> from the actual reset request to determine which we want to retry
> (i.e. EBUSY).

Tbh I don't want to make the reset code too clever with multiple fallback
paths - it's really tricky code and as-is already suffers from imo
insufficient test coverage and too many bugs. Once we've decided that the
gpu is dead and return -EIO this should be a terminal state. Developers can
always manually unwedge through debugfs, but for users it's imo paramount
that we don't automatically run some little-tested path and take down their
box in the process.
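To make the two positions above concrete, here is a minimal sketch in plain
C. It is not i915 code: the struct, the enum and all three function names
are hypothetical and exist only for illustration. It assumes the reset
request reports failure as a negative errno, shows the "retry only on
-EBUSY" classification from the quoted mail, and shows the "any failure
wedges the gpu, and wedged is terminal until someone unwedges by hand"
policy argued for here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the gpu's health; not the driver's real state. */
enum gpu_state { GPU_OK, GPU_HUNG, GPU_WEDGED };

struct gpu_model {
	enum gpu_state state;
};

/*
 * Classification from the quoted mail: only -EBUSY ("not ready yet")
 * would be worth another attempt; -ENODEV and -EIO would not.
 */
bool reset_error_is_retryable(int err)
{
	return err == -EBUSY;
}

/*
 * Conservative policy: a failed reset marks the gpu as wedged, and a
 * wedged gpu ignores every later reset result. No automatic fallback.
 */
void handle_reset_result(struct gpu_model *gpu, int err)
{
	assert(gpu != NULL);

	if (gpu->state == GPU_WEDGED)
		return;		/* terminal until manually unwedged */

	gpu->state = (err == 0) ? GPU_OK : GPU_WEDGED;
}

/* Stand-in for a developer unwedging by hand (e.g. through debugfs). */
void manual_unwedge(struct gpu_model *gpu)
{
	assert(gpu != NULL);
	gpu->state = GPU_OK;
}
```

Note that in this sketch handle_reset_result() never consults
reset_error_is_retryable(); wiring the two together is exactly the extra
fallback path being argued against.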
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx