On Tue, Jan 19, 2016 at 03:04:09PM +0000, Arun Siluvery wrote: > On 19/01/2016 14:13, Chris Wilson wrote: > >On Tue, Jan 19, 2016 at 03:04:40PM +0100, Daniel Vetter wrote: > >>On Tue, Jan 19, 2016 at 01:48:05PM +0000, Chris Wilson wrote: > >>>On Tue, Jan 19, 2016 at 01:09:28PM +0100, Daniel Vetter wrote: > >>>>On Thu, Jan 14, 2016 at 10:49:45AM +0000, Arun Siluvery wrote: > >>>>>Pending reset requests are cleared before suspending, they should be picked up > >>>>>after resume when new work is submitted. > >>>>> > >>>>>This is originally added as part of TDR patches for Gen8 from Tomas Elf which > >>>>>are under review, as suggested by Chris this is extracted as a separate patch > >>>>>as it can be useful now. > >>>>> > >>>>>Cc: Mika Kuoppala <mika.kuoppala@xxxxxxxxx> > >>>>>Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > >>>>>Signed-off-by: Arun Siluvery <arun.siluvery@xxxxxxxxxxxxxxx> > >>>> > >>>>Pulling in the discussion we had from irc: Imo the right approach is to > >>>>simply wait for gpu reset to finish it's job. Since that could in turn > >>>>lead to a dead gpu (if we're unlucky and init_hw failed) we'd need to do > >>>>that in a loop around gem_idle. And drop dev->struct_mutex in-between. > >>>>E.g. > >>>> > >>>>while (busy) { > >>>> mutex_lock(); > >>>> gpu_idle(); > >>>> mutex_unlock(); > >>>> > >>>> flush_work(reset_work); > >>>>} > >>> > >>>Where does the requirement for gpu_idle come from? If there is a global > >>>reset in progress, it cannot queue a request to flush the work and > >>>waiting on the old results will be skipped. So just wait for the global > >>>reset to complete, i.e. flush_work(). > >> > >>Yes, but the global reset might in turn leave a wrecked gpu behind, or at > >>least a non-idle one. Hence another gpu_idle on top, to make sure. If we > >>change init_hw() of engines to be synchronous then we should have at least > >>a WARN_ON(not_idle_but_i_expected_so()); in there ... > > gpu_error.work is removed in b8d24a06568368076ebd5a858a011699a97bfa42, we git sha1 from your private tree are meaningless in the public. Either link to some git weburl or mailing lists archive link. Thanks, Daniel > are doing reset in hangcheck work itself so I think there is no need to > flush work. > > while (i915_reset_in_progress(gpu_error) && > !i915_terminally_wedged(gpu_error)) { > int ret; > > mutex_lock(&dev->struct_mutex); > ret = i915_gpu_idle(dev); > if (ret) > DRM_ERROR("GPU is in inconsistent state after reset\n"); > mutex_unlock(&dev->struct_mutex); > } > > If the reset is successful we are idle before suspend otherwise in a wedged > state. is this ok? > > regards > Arun > > > > >Does it matter on suspend? We test on resume if the GPU is usable, but > >if we wanted to test on suspend then we should do > > > >flush_work(); > >if (i915_terminally_wedged()) > > /* oh noes */; > >-Chris > > > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx