On Mon, Feb 01, 2016 at 11:00:08AM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
>
> Where objects are shared across contexts and heavy rendering
> is in progress, the execlist retired request queue will grow
> unbounded until the GPU is idle enough for the retire worker
> to run and call intel_execlists_retire_requests.
>
> With some workloads, for example gem_close_race, that never
> happens, causing the shared object VMA list to grow to epic
> proportions, which in turn causes retirement call sites to
> spend linearly more and more time walking the obj->vma_list.
>
> The end result is the above mentioned test case taking ten
> minutes to complete and using up more than a GiB of RAM just
> for the VMA objects.
>
> If we instead trigger the execlist housekeeping a bit more
> often, obj->vma_list will be kept in check by virtue of
> context cleanup running and zapping the inactive VMAs.
>
> This makes the test case an order of magnitude faster and
> brings memory use back to normal.
>
> This also makes the code more self-contained, since the
> intel_execlists_retire_requests call site is now in a more
> appropriate place and implementation leakage is somewhat
> reduced.

However, this then causes a perf regression, since we unpin the
contexts too frequently and do not have any mitigation in place yet.
-Chris

--
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx