Quoting Chris Wilson (2018-01-24 14:44:01)
> Quoting Tvrtko Ursulin (2018-01-24 13:09:37)
> > 
> > On 22/01/2018 15:41, Chris Wilson wrote:
> > > If we remember to cancel the signaler on a request when retiring it
> > > (after we know that the request has been signaled), we do not need to
> > > carry an additional request in the signaler itself. This prevents an
> > > issue whereby the signaler threads may be delayed and hold on to
> > > thousands of request references, causing severe memory fragmentation and
> > > premature oom (most noticeable on 32b snb due to the limited GFP_KERNEL
> > > and frequent use of inter-engine fences).
> > 
> > What is starving the signaler thread, which is set to SCHED_FIFO, and
> > can't be tasklets on SNB?
> 
> Interrupts. MI_USER_INTERRUPT to be precise, but we have to check all
> the other sources on snb as well.
> 
> > Before I actually start reviewing the code, which I'd rather avoid :) :
> > 
> > Is it just not able to process enough requests in its time-slice
> > (need_resched) so is falling behind? It would be surprising since I
> > would expect it to be much lighter wait processing there, per request,
> > than on the submission paths.
> 
> The conclusion is a bit odd, but more or less it's just a pathological
> case where interrupts + rt task are contending for one cpu with
> submission proceeding on another. Making the signaler lighter was the
> intention of the rest of the series, but this patch by itself prevents
> the runaway references.

Whilst I'm thinking of this, when I hit oom on snb, there were ~3 million
requests allocated (/proc/slabinfo) but only ~3 in-flight. Tracing the
request references gave the clue that the only outstanding ones were in
the signaler. There were only 2 sources of references, one for the active
request and one for the signaler, and we could account for the active
references, knowing that those requests were being retired.
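
To make the accounting concrete, here is a minimal user-space sketch of the
two schemes. It is not the i915 code: struct request, the helper names and
the live_requests counter (a stand-in for the slabinfo count) are all
hypothetical, and the signaler is assumed to be completely starved, so it
never gets to drop whatever it holds.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a request; not the i915 structure. */
struct request {
	int refcount;     /* live references to this request */
	bool signaled;    /* set once the request has completed */
	bool on_signaler; /* queued on the (starved) signaler */
};

/* Requests allocated but not yet freed (stand-in for /proc/slabinfo). */
static int live_requests;

static void request_init(struct request *rq)
{
	rq->refcount = 1; /* the reference held for the active request */
	rq->signaled = false;
	rq->on_signaler = false;
	live_requests++;
}

static void request_put(struct request *rq)
{
	assert(rq->refcount > 0);
	if (--rq->refcount == 0)
		live_requests--; /* last reference gone: request is freed */
}

/* Old scheme (assumed): the signaler takes its own reference. */
static void signaler_enable_with_ref(struct request *rq)
{
	rq->refcount++;
	rq->on_signaler = true;
}

/* New scheme (assumed): the signaler borrows the active reference. */
static void signaler_enable_borrowed(struct request *rq)
{
	rq->on_signaler = true;
}

/* Retire a signaled request, optionally cancelling the signaler first. */
static void request_retire(struct request *rq, bool cancel_signaler)
{
	rq->signaled = true;
	if (cancel_signaler)
		rq->on_signaler = false; /* signaler no longer sees rq */
	request_put(rq); /* drop the active reference */
}

/*
 * Submit and retire n requests while the signaler never runs.
 * Returns how many requests are still allocated afterwards.
 */
int leaked_while_signaler_starved(int n, bool cancel_on_retire)
{
	struct request rq;
	int i;

	live_requests = 0;
	for (i = 0; i < n; i++) {
		request_init(&rq);
		if (cancel_on_retire)
			signaler_enable_borrowed(&rq);
		else
			signaler_enable_with_ref(&rq);
		request_retire(&rq, cancel_on_retire);
	}
	return live_requests;
}
```

In this model, leaked_while_signaler_starved(n, false) leaves every retired
request pinned by the signaler's reference, whereas passing true leaves none
allocated, which is the shape of what slabinfo showed.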
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx