On Thu, Nov 15, 2012 at 4:17 PM, Daniel Vetter <daniel.vetter at ffwll.ch> wrote:
> We have two important transitions of the wedged state in the current
> code:
>
> - 0 -> 1: This means a hang has been detected, and signals to everyone
>   that they please get off any locks, so that the reset work item can
>   do its job.
>
> - 1 -> 0: The reset handler has completed.
>
> Now the last transition mixes up two states: "Reset completed and
> successful" and "Reset failed". To distinguish these two we do some
> tricks with the reset completion, but I simply could not convince
> myself that this doesn't race under odd circumstances.
>
> Hence split this up, and add a new terminal state indicating that the
> hw is gone for good.
>
> Also add explicit #defines for both states, update comments.
>
> v2: Split out the reset handling bugfix for the throttle ioctl.
>
> v3: s/tmp/wedged/ suggested by Chris Wilson. Also fix up a rebase
> error which prevented this patch from actually compiling.
>
> v4: To unify the wedged state with the reset counter, keep the
> reset-in-progress state just as a flag. The terminally-wedged state is
> now denoted with a big number.
>
> v5: Add a comment to the reset_counter special values explaining that
> WEDGED & RESET_IN_PROGRESS needs to be true for the code to be
> correct.
>
> v6: Fixup logic errors introduced with the wedged+reset_counter
> unification. Since WEDGED implies reset-in-progress (in a way we're
> terminally stuck in the dead-but-reset-not-completed state), we need
> to ensure that we check for this everywhere. The specific bug was in
> wait_for_error, which would simply have timed out.
>
> v7: Extract an inline i915_reset_in_progress helper to make the code
> more readable. Also annotate the reset-in-progress case with an
> unlikely, to help the compiler optimize the fastpath. Do the same for
> the terminally wedged case with i915_terminally_wedged.
>
> Signed-Off-by: Daniel Vetter <daniel.vetter at ffwll.ch>

Right, so the usage of a wait queue makes the code quite a bit more
understandable; I had to scratch my head for quite a bit with the
x->done poking.

I think I'd have loved to see the "completion -> wait_queue + 2 reset
states" and the "gpu_error.wedge -> gpu_error.reset_counter" transitions
as two different patches (well, I did because of the 2 versions sent, so
all is good).

Reviewed-by: Damien Lespiau <damien.lespiau at intel.com>

-- 
Damien
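
For readers following the thread without the diff at hand, the state
encoding described in the v4-v7 notes of the quoted commit message can be
sketched in user-space C roughly as below. This is only an illustration
under stated assumptions, not the patch itself: the macro names, their
values, the struct layout and the signal_hang()/finish_reset() helpers are
hypothetical, and the "increment on successful reset" step is an assumed
way of combining the flag with the counter. Only the ideas (a
reset-in-progress flag, a big number for the terminal state, and WEDGED
implying reset-in-progress) come from the commit message.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical names and values, chosen for illustration only. */
#define RESET_IN_PROGRESS_FLAG 1u          /* low bit: a reset is pending */
#define WEDGED                 0xffffffffu /* "big number": hw is gone for good */

/* v5 invariant: the wedged value must also read as reset-in-progress. */
static_assert((WEDGED & RESET_IN_PROGRESS_FLAG) != 0,
              "WEDGED must imply reset-in-progress");

struct gpu_error {
    atomic_uint reset_counter; /* even: idle, odd: reset pending */
};

/* True while a reset is pending, and also once terminally wedged. */
static inline bool reset_in_progress(struct gpu_error *e)
{
    return (atomic_load(&e->reset_counter) & RESET_IN_PROGRESS_FLAG) != 0;
}

/* True only in the terminal "reset failed" state. */
static inline bool terminally_wedged(struct gpu_error *e)
{
    return atomic_load(&e->reset_counter) == WEDGED;
}

/* Hang detected: set the flag so waiters know to drop their locks. */
static inline void signal_hang(struct gpu_error *e)
{
    atomic_fetch_or(&e->reset_counter, RESET_IN_PROGRESS_FLAG);
}

/* Reset handler finished (assumed to be called only after signal_hang). */
static inline void finish_reset(struct gpu_error *e, bool ok)
{
    if (ok)
        /* Odd -> even: clears the flag and bumps the reset count. */
        atomic_fetch_add(&e->reset_counter, 1u);
    else
        /* Terminal state: never leaves reset-in-progress again. */
        atomic_store(&e->reset_counter, WEDGED);
}
```

With this encoding a waiter has to test terminally_wedged() in addition to
reset_in_progress(); otherwise it would wait for a reset completion that
never comes, which matches the wait_for_error timeout described in v6.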