On 09/10/2015 08:41, Chris Wilson wrote:
On Thu, Oct 08, 2015 at 07:31:38PM +0100, Tomas Elf wrote:
Error state capture is dependent on i915_gem_active_request() and
i915_gem_obj_is_pinned(). Since there is no synchronization between error state
capture and the driver state we at least need to use safe list iterators in
these functions to alleviate the problem of request/vma lists changing during
error state capture.
Does not make it safe.
-Chris
I know it doesn't make it safe, I didn't say this patch makes it safe.
Maybe I was unclear but I chose the word "alleviate" rather than "solve"
to indicate that there are still problems but this change reduces the
problem scope and makes crashes less likely to occur. I also used the
formulation "at least" to indicate that we're not solving everything but
we can do certain things to improve things somewhat.
The problems I've been seeing has been that the list state changes
during iteration and that the error capture tries to read elements that
are no longer part of the list - not that elements that the error
capture code is dereferencing are deallocated by the driver or whatever.
Using a safe iterator helps with that very particular problem. Or maybe
I guess I've just been incredibly lucky for the last 2 months when I've
been running this code as I've been able to get 12+ hours of stability
during my tests instead of less than one hour in between crashes that
was the case before I introduced these changes.
Thanks,
Tomas
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx