On 08/04/2022 16:10, Daniel Vetter wrote:
On Fri, 8 Apr 2022 at 12:29, Tvrtko Ursulin
<tvrtko.ursulin@xxxxxxxxxxxxxxx> wrote:
On 08/04/2022 10:50, Dave Airlie wrote:
On Fri, 8 Apr 2022 at 18:25, Tvrtko Ursulin
<tvrtko.ursulin@xxxxxxxxxxxxxxx> wrote:
On 08/04/2022 08:58, Daniel Vetter wrote:
On Thu, Apr 07, 2022 at 04:16:27PM +0100, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
Inherit submitter nice at point of request submission to account for long
running processes getting either externally or self re-niced.
This accounts for the current processing landscape where computational
pipelines are composed of CPU and GPU parts working in tandem.
Nice value will only apply to requests which originate from user contexts
and have default context priority. This is to avoid disturbing any
application made choices of low and high (batch processing and latency
sensitive compositing). In this case nice value adjusts the effective
priority in the narrow band of -19 to +20 around
I915_CONTEXT_DEFAULT_PRIORITY.
This means that userspace using the context priority uapi directly has a
wider range of possible adjustments (in practice that only applies to
execlists platforms - with GuC there are only three priority buckets), but
in all cases nice adjustment has the expected effect: positive nice
lowering the scheduling priority and negative nice raising it.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
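As a rough illustration of the rule in the commit message above, here is a
minimal userspace sketch in Python. It is not the patch itself. The function
name, its parameters, and the assumption that the adjustment is a plain
subtraction of nice from the default priority are all hypothetical; the
commit message only states the resulting band (-19 to +20) and the direction
of the effect.

```python
# Hypothetical model of the nice-to-priority rule described above.
# I915_CONTEXT_DEFAULT_PRIORITY is given as zero elsewhere in this thread.
I915_CONTEXT_DEFAULT_PRIORITY = 0


def effective_priority(context_priority, nice, from_user_context=True):
    """Return the priority a request would be submitted with.

    Assumption: the adjustment is simply "default minus nice", which
    reproduces the -19..+20 band quoted in the commit message.
    """
    if not -20 <= nice <= 19:
        raise ValueError("nice must be within -20..19")

    # Contexts that chose a non-default priority, and requests that do not
    # originate from user contexts, are left alone.
    if not from_user_context:
        return context_priority
    if context_priority != I915_CONTEXT_DEFAULT_PRIORITY:
        return context_priority

    # Positive nice lowers the priority, negative nice raises it.
    return I915_CONTEXT_DEFAULT_PRIORITY - nice
```

Under these assumptions, effective_priority(0, 10) gives a slightly lowered
priority, while a context that explicitly asked for priority 512 keeps 512
regardless of the submitter's nice value.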
I don't think adding any more fancy features to i915-scheduler makes
sense, at least not before we've cut over to drm/sched.
Why do you think so?
Drm/sched has at least low/normal/high priority and surely we will keep
the i915 context priority ABI.
Then this patch is not touching the i915 scheduler at all, nor is it
fancy.
The cover letter explains how it implements the same approach as the IO
scheduler. And it explains the reasoning and benefits. It provides a user
experience benefit today, which can easily be preserved.
Won't this cause uAPI divergence between execlists and GuC? If something
is niced to -15 or -18 with execlists and the same with GuC, it won't get
the same sort of result, will it?
Not sure what you consider new ABI divergence but the general problem
space of execlists vs GuC priority handling is not related to this patch.
It 100% is.
Mesa only uses 3 priority levels, which means the 1k execlist levels
(or whatever it was) nonsense has not left the barn and we can get it
back in.
This here bakes it in forever as implicit uapi.
Could you please explain what exactly you see baking into uapi? The fact
that the user gets the ability to control GPU time distribution? The
granularity of it, as observed in, say, the difference between nice 5 and
nice 6? Something else?
I maintain the uapi did not in any case provide any statements on the
latter, so I still don't see a problem there.
Regards,
Tvrtko
The existing GEM context ABI has a range of -1023 to +1023 for user
priorities, while GuC maps that to low/normal/high only.
I915_CONTEXT_DEFAULT_PRIORITY is zero, which maps to GuC normal. Negatives
map to GuC low and positives to high. Drm/sched is, as I understand it,
similar or the same.
So any userspace using the existing uapi can already observe differences
between GuC and execlists. With your example of -15 vs -18 I mean.
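To make the bucketing described above concrete, here is a small Python
sketch. The function name and the string labels are hypothetical; only the
-1023 to +1023 range and the sign-based mapping come from the description
in this thread.

```python
def guc_bucket(user_priority):
    """Map a GEM context user priority to a GuC bucket (illustrative only).

    Mirrors the description in the thread: zero is normal, negatives are
    low, positives are high.
    """
    if not -1023 <= user_priority <= 1023:
        raise ValueError("user priority must be within -1023..1023")
    if user_priority < 0:
        return "low"
    if user_priority > 0:
        return "high"
    return "normal"
```

With this mapping, priorities -15 and -18 land in the same GuC bucket,
whereas execlists would treat them as two distinct levels.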
I don't think anyone considered that a problem because execution order
based on priority is not a hard guarantee. Neither is proportionality of
timeslicing. Otherwise GuC would already be breaking the ABI.
With this patch it simply allows external control - whereas before only
applications could change their priorities, now users can influence the
priority of the ones which did not bother to set a non-default priority.
In the case of GuC, if the user says "nice 10 churn-my-dataset-on-gpu &&
run-my-game", the former gets low prio and the latter gets normal. I don't
see any issues there. It is the same as if the "churn-my-dataset-on-gpu"
command implemented a command line switch which passed context priority to
i915 via the existing GEM context param ioctl.
I've described the exact experiments in both modes in the cover letter,
and they show it works. (Ignoring the GuC scheduling quirk where
apparently low-vs-normal timeslices worse than normal-vs-high.)
GuC is not breaking anything because the _real_ uapi only has 3 levels
(plus one for kernel stuff on top).
-Daniel