Re: [PATCH 3/8] drm/i915: Cope with request list state change during error state capture

Daniel Vetter <daniel@xxxxxxxx> · Tue, 13 Oct 2015 13:40:34 +0200



On Fri, Oct 09, 2015 at 12:45:07PM +0100, Tomas Elf wrote:
> On 09/10/2015 09:28, Daniel Vetter wrote:
> >On Thu, Oct 08, 2015 at 07:31:35PM +0100, Tomas Elf wrote:
> >>Since we're not synchronizing the ring request list during error state capture
> >>the request list state might change between the time the corresponding error
> >>request list was allocated and dimensioned to the time when the ring request
> >>list is actually captured into the error state. If this happens, throw a
> >>WARNING and do early exit and be aware that the captured error state might not
> >>be fully reliable.
> >
> >Please don't throw a WARNING since this is expected to occasionally
> >happen. DRM_DEBUG_DRIVER is enough imo.
> >-Daniel
> 
> I still don't see how it could happen without leading to reads of
> unallocated memory. The error state request list has been allocated to a
> certain size equal to num_requests and this loop seems to assume that the
> error state request list maintains the same size as the driver request list,
> which is not the case - leading to crashes, which is how I happened to
> notice it.
> 
> I can obviously remove the warning but are you saying we shouldn't even take
> action if it happens? Such as early exit?

Just remove the WARNING since this is a perfectly expected outcome when
error state capture races against request adding. If you want put a
DRM_DEBUG or similar in there, but WARN_ON for stuff that we expect might
happen isn't good - i915.ko is too noisy already as-is.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx