Re: [PATCH] drm/i915/breadcrumbs: Drop request reference for the signaler thread

Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> · Wed, 24 Jan 2018 14:44:01 +0000

Quoting Tvrtko Ursulin (2018-01-24 13:09:37)
> 
> On 22/01/2018 15:41, Chris Wilson wrote:
> > If we remember to cancel the signaler on a request when retiring it
> > (after we know that the request has been signaled), we do not need to
> > carry an additional request in the signaler itself. This prevents an
> > issue whereby the signaler threads may be delayed and hold on to
> > thousands of request references, causing severe memory fragmentation and
> > premature oom (most noticeable on 32b snb due to the limited GFP_KERNEL
> > and frequent use of inter-engine fences).
> 
> What is starving the signaler thread, which is set to SCHED_FIFO, and 
> can't be tasklets on SNB?

Interrupts. MI_USER_INTERRUPT to be precise, but we have to check all
the other sources on snb as well.

> Before I actually start reviewing the code, which I'd rather avoid :) :
> 
> Is it just not able to process enough requests in its time-slice 
> (need_resched) so is falling behind? It would be surprising since I 
> would expect it to be much lighter wait processing there, per request, 
> than on the submission paths.

The conclusion is a bit odd, but more or less it's just a pathological
case: the interrupts and the rt signaler task contend for one cpu while
submission proceeds on another, so the signaler falls behind. Making the
signaler lighter was the intention of the rest of the series, but this
patch by itself prevents the runaway references.
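
To illustrate the reference pile-up, here is a toy userspace model. It is
only a sketch and not the i915 code: every name in it (struct request,
enable_signaling, cancel_signaling, simulate_starved_signaler, ...) is
hypothetical, and it assumes the signaler never gets to run while requests
are submitted and retired. With signaler_holds_ref set, each retired
request stays pinned by the signaler's reference; with it clear, retire
cancels the signaling and the request is freed immediately.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model only; all names are hypothetical, none are i915 API. */
struct request {
	int refcount;          /* request is freed when this hits zero */
	int signaling;         /* currently on the signaler's list? */
	struct request *next;  /* link in the signaler's list */
};

static struct request *signal_list; /* requests awaiting the signaler */
static long live_requests;          /* allocated but not yet freed */

static struct request *request_alloc(void)
{
	struct request *rq = calloc(1, sizeof(*rq));

	assert(rq);
	rq->refcount = 1; /* the submitter's reference */
	live_requests++;
	return rq;
}

static void request_put(struct request *rq)
{
	assert(rq->refcount > 0);
	if (--rq->refcount == 0) {
		assert(!rq->signaling); /* never free while still listed */
		free(rq);
		live_requests--;
	}
}

static void enable_signaling(struct request *rq, int signaler_holds_ref)
{
	if (signaler_holds_ref)
		rq->refcount++; /* old scheme: signaler pins the request */
	rq->signaling = 1;
	rq->next = signal_list;
	signal_list = rq;
}

static void cancel_signaling(struct request *rq)
{
	struct request **p;

	for (p = &signal_list; *p; p = &(*p)->next) {
		if (*p == rq) {
			*p = rq->next;
			rq->signaling = 0;
			return;
		}
	}
}

static void request_retire(struct request *rq, int signaler_holds_ref)
{
	/* new scheme: unlink from the signaler before the final put */
	if (!signaler_holds_ref)
		cancel_signaling(rq);
	request_put(rq);
}

/* The signaler finally gets cpu time and drains its list. */
static void signaler_run(int signaler_holds_ref)
{
	while (signal_list) {
		struct request *rq = signal_list;

		signal_list = rq->next;
		rq->signaling = 0;
		if (signaler_holds_ref)
			request_put(rq);
	}
}

/*
 * Submit and retire n requests while the signaler is starved; returns
 * how many requests are still pinned in memory before it finally runs.
 */
long simulate_starved_signaler(int n, int signaler_holds_ref)
{
	long pinned;
	int i;

	for (i = 0; i < n; i++) {
		struct request *rq = request_alloc();

		enable_signaling(rq, signaler_holds_ref);
		request_retire(rq, signaler_holds_ref);
	}
	pinned = live_requests;
	signaler_run(signaler_holds_ref); /* drain so runs stay independent */
	assert(live_requests == 0);
	return pinned;
}
```

In this model simulate_starved_signaler(1000, 1) leaves 1000 requests
pinned until the signaler runs, while simulate_starved_signaler(1000, 0)
leaves none.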
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx