Re: [PATCH] drm/i915: Mitigate retirement starvation a bit

On 04/02/16 12:40, Chris Wilson wrote:
On Thu, Feb 04, 2016 at 12:25:24PM +0000, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>

In execlists mode, internal housekeeping of discarded requests
(and so of contexts and VMAs) relies solely on the retire
worker, which can be prevented from running simply by being
unlucky when busy clients are hammering on the big lock.

A prime example is the gem_close_race IGT, where this effect
causes internal lists to grow to epic proportions, with the
consequence that object VMA traversal cost grows exponentially,
resulting in tens of minutes of test runtime. Memory use is
also very high and a limiting factor on some platforms.

Since we do not want to run this internal housekeeping more
frequently, due to concerns that it may affect performance, and
since the scenario is statistically not very likely in real
workloads, one possible workaround is to run it when new
client handles are opened.

This will solve the issues with this particular test case,
making it complete in tens of seconds instead of tens of
minutes, and will not add any run-time penalty to running
clients.

It can only slightly slow down new client startup, but on a
realistically loaded system we expect this not to be
significant. Even with heavy rendering in progress we can have
perhaps up to several thousand requests pending retirement,
which, with a typical retirement cost of 80ns to 1us per
request, is not significant.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
Testcase: igt/gem_close_race/gem-close-race
Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>

Still doesn't fix actual workloads where this is demonstrably bad, which
can be demonstrated with a single fd.

Which are those?

The most effective treatment I found is moving the retire-requests from
execbuf (which exists for similar reasons) to get-pages.

http://cgit.freedesktop.org/~ickle/linux-2.6/commit/?h=breadcrumbs&id=75f4e53f1c9141ba2dd8847396a1bcc8dbeecd55

I struggle to understand how it is OK to stall get-pages, or even object close, when you objected to those in the past?

Regards,

Tvrtko