On Thu, Feb 04, 2016 at 01:30:30PM +0000, Tvrtko Ursulin wrote:
>
> On 04/02/16 12:40, Chris Wilson wrote:
> >On Thu, Feb 04, 2016 at 12:25:24PM +0000, Tvrtko Ursulin wrote:
> >>From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
> >>
> >>In execlists mode internal house keeping of the discarded
> >>requests (and so contexts and VMAs) relies solely on the retire
> >>worker, which can be prevented from running by just being
> >>unlucky when busy clients are hammering on the big lock.
> >>
> >>Prime example is the gem_close_race IGT, which due to this
> >>effect causes internal lists to grow to epic proportions, with
> >>a consequence of object VMA traversal growing exponentially
> >>and resulting in tens of minutes test runtime. Memory use is
> >>also very high and a limiting factor on some platforms.
> >>
> >>Since we do not want to run this internal house keeping more
> >>frequently, due to concerns that it may affect performance, and
> >>the scenario being statistically not very likely in real
> >>workloads, one possible workaround is to run it when new
> >>client handles are opened.
> >>
> >>This will solve the issues with this particular test case,
> >>making it complete in tens of seconds instead of tens of
> >>minutes, and will not add any run-time penalty to running
> >>clients.
> >>
> >>It can only slightly slow down new client startup, but on a
> >>realistically loaded system we are expecting this to be not
> >>significant. Even with heavy rendering in progress we can have
> >>perhaps up to several thousands of requests pending retirement,
> >>which, with a typical retirement cost of 80ns to 1us per
> >>request, is not significant.
> >>
> >>Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
> >>Testcase: igt/gem_close_race/gem-close-race
> >>Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> >
> >Still doesn't fix actual workloads where this is demonstrably bad, which
> >can be demonstrated with a single fd.
>
> Which are those?

OglDrvCtx and clones.

> >The most effective treatment I found is moving the retire-requests from
> >execbuf (which exists for similar reasons) to get-pages.
> >
> >http://cgit.freedesktop.org/~ickle/linux-2.6/commit/?h=breadcrumbs&id=75f4e53f1c9141ba2dd8847396a1bcc8dbeecd55
>
> I struggle to understand how it is OK to stall get pages or even the
> object close when you objected to those in the past?

Benchmarks. Taking a hit here avoids situations that end up invoking the
shrinker.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre