On Wed, May 25, 2022 at 2:46 AM Tvrtko Ursulin
<tvrtko.ursulin@xxxxxxxxxxxxxxx> wrote:
>
>
> On 24/05/2022 15:50, Rob Clark wrote:
> > On Tue, May 24, 2022 at 6:45 AM Tvrtko Ursulin
> > <tvrtko.ursulin@xxxxxxxxxxxxxxx> wrote:
> >>
> >>
> >> On 23/05/2022 23:53, Rob Clark wrote:
> >>> On Mon, May 23, 2022 at 7:45 AM Tvrtko Ursulin
> >>> <tvrtko.ursulin@xxxxxxxxxxxxxxx> wrote:
> >>>>
> >>>>
> >>>> Hi Rob,
> >>>>
> >>>> On 28/07/2021 02:06, Rob Clark wrote:
> >>>>> From: Rob Clark <robdclark@xxxxxxxxxxxx>
> >>>>>
> >>>>> The drm/scheduler provides additional prioritization on top of that
> >>>>> provided by however many ringbuffers (each with their own priority
> >>>>> level) are supported on a given generation. Expose the additional
> >>>>> levels of priority to userspace and map the userspace priority back
> >>>>> to ring (first level of priority) and scheduler priority (additional
> >>>>> priority levels within the ring).
> >>>>>
> >>>>> Signed-off-by: Rob Clark <robdclark@xxxxxxxxxxxx>
> >>>>> Acked-by: Christian König <christian.koenig@xxxxxxx>
> >>>>> ---
> >>>>>  drivers/gpu/drm/msm/adreno/adreno_gpu.c |  4 +-
> >>>>>  drivers/gpu/drm/msm/msm_gem_submit.c    |  4 +-
> >>>>>  drivers/gpu/drm/msm/msm_gpu.h           | 58 ++++++++++++++++++++++++-
> >>>>>  drivers/gpu/drm/msm/msm_submitqueue.c   | 35 +++++++--------
> >>>>>  include/uapi/drm/msm_drm.h              | 14 +++++-
> >>>>>  5 files changed, 88 insertions(+), 27 deletions(-)
> >>>>>
> >>>>> diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> >>>>> index bad4809b68ef..748665232d29 100644
> >>>>> --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> >>>>> +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> >>>>> @@ -261,8 +261,8 @@ int adreno_get_param(struct msm_gpu *gpu, uint32_t param, uint64_t *value)
> >>>>>                         return ret;
> >>>>>                 }
> >>>>>                 return -EINVAL;
> >>>>> -       case MSM_PARAM_NR_RINGS:
> >>>>> -               *value = gpu->nr_rings;
> >>>>> +       case MSM_PARAM_PRIORITIES:
> >>>>> +               *value = gpu->nr_rings * NR_SCHED_PRIORITIES;
> >>>>>                 return 0;
> >>>>>         case MSM_PARAM_PP_PGTABLE:
> >>>>>                 *value = 0;
> >>>>> diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c
> >>>>> index 450efe59abb5..c2ecec5b11c4 100644
> >>>>> --- a/drivers/gpu/drm/msm/msm_gem_submit.c
> >>>>> +++ b/drivers/gpu/drm/msm/msm_gem_submit.c
> >>>>> @@ -59,7 +59,7 @@ static struct msm_gem_submit *submit_create(struct drm_device *dev,
> >>>>>         submit->gpu = gpu;
> >>>>>         submit->cmd = (void *)&submit->bos[nr_bos];
> >>>>>         submit->queue = queue;
> >>>>> -       submit->ring = gpu->rb[queue->prio];
> >>>>> +       submit->ring = gpu->rb[queue->ring_nr];
> >>>>>         submit->fault_dumped = false;
> >>>>>
> >>>>>         INIT_LIST_HEAD(&submit->node);
> >>>>> @@ -749,7 +749,7 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data,
> >>>>>         /* Get a unique identifier for the submission for logging purposes */
> >>>>>         submitid = atomic_inc_return(&ident) - 1;
> >>>>>
> >>>>> -       ring = gpu->rb[queue->prio];
> >>>>> +       ring = gpu->rb[queue->ring_nr];
> >>>>>         trace_msm_gpu_submit(pid_nr(pid), ring->id, submitid,
> >>>>>                 args->nr_bos, args->nr_cmds);
> >>>>>
> >>>>> diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
> >>>>> index b912cacaecc0..0e4b45bff2e6 100644
> >>>>> --- a/drivers/gpu/drm/msm/msm_gpu.h
> >>>>> +++ b/drivers/gpu/drm/msm/msm_gpu.h
> >>>>> @@ -250,6 +250,59 @@ struct msm_gpu_perfcntr {
> >>>>>         const char *name;
> >>>>>  };
> >>>>>
> >>>>> +/*
> >>>>> + * The number of priority levels provided by drm gpu scheduler.  The
> >>>>> + * DRM_SCHED_PRIORITY_KERNEL priority level is treated specially in some
> >>>>> + * cases, so we don't use it (no need for kernel generated jobs).
> >>>>> + */
> >>>>> +#define NR_SCHED_PRIORITIES (1 + DRM_SCHED_PRIORITY_HIGH - DRM_SCHED_PRIORITY_MIN)
> >>>>> +
> >>>>> +/**
> >>>>> + * msm_gpu_convert_priority - Map userspace priority to ring # and sched priority
> >>>>> + *
> >>>>> + * @gpu:        the gpu instance
> >>>>> + * @prio:       the userspace priority level
> >>>>> + * @ring_nr:    [out] the ringbuffer the userspace priority maps to
> >>>>> + * @sched_prio: [out] the gpu scheduler priority level which the userspace
> >>>>> + *              priority maps to
> >>>>> + *
> >>>>> + * With drm/scheduler providing its own level of prioritization, our total
> >>>>> + * number of available priority levels is (nr_rings * NR_SCHED_PRIORITIES).
> >>>>> + * Each ring is associated with its own scheduler instance. However, our
> >>>>> + * UABI is that lower numerical values are higher priority. So mapping the
> >>>>> + * single userspace priority level into ring_nr and sched_prio takes some
> >>>>> + * care. The userspace provided priority (when a submitqueue is created)
> >>>>> + * is mapped to ring nr and scheduler priority as such:
> >>>>> + *
> >>>>> + *   ring_nr    = userspace_prio / NR_SCHED_PRIORITIES
> >>>>> + *   sched_prio = NR_SCHED_PRIORITIES -
> >>>>> + *                (userspace_prio % NR_SCHED_PRIORITIES) - 1
> >>>>> + *
> >>>>> + * This allows generations without preemption (nr_rings==1) to have some
> >>>>> + * amount of prioritization, and provides more priority levels for gens
> >>>>> + * that do have preemption.
> >>>>
> >>>> I am exploring how different drivers handle priority levels and this
> >>>> caught my eye.
> >>>>
> >>>> Is the implication of the last paragraphs that on hw with nr_rings > 1,
> >>>> ring + 1 preempts ring?
> >>>
> >>> Other way around, at least from the uabi standpoint. Ie. ring[0]
> >>> preempts ring[1]
> >>
> >> Ah yes, I figured it out from the comments but then confused myself when
> >> writing the email.
> >>
> >>>> If so I am wondering, does the "spreading" of user visible priorities
> >>>> by NR_SCHED_PRIORITIES create non-preemptable levels within every
> >>>> "bucket", or how does that work?
> >>>
> >>> So, preemption is possible between any priority level before run_job()
> >>> gets called, which writes the job into the ringbuffer. After that
> >>
> >> Hmm how? Before run_job() the jobs are not runnable, sitting in the
> >> scheduler queues, right?
> >
> > I mean, if prio[0]+prio[1]+prio[2] map to a single ring, submit A on
> > prio[1] could be executed after submit B on prio[2] provided that
> > run_job(submitA) hasn't happened yet. So I guess it isn't "really"
> > preemption because the submit hasn't started running on the GPU yet.
> > But rather just scheduling according to priority.
> >
> >>> point, you only have "bucket" level preemption, because
> >>> NR_SCHED_PRIORITIES levels of priority get mapped to a single FIFO
> >>> ringbuffer.
> >>
> >> Right, and you have one GPU with four rings, which means you expose 12
> >> priority levels to userspace, did I get that right?
> >
> > Correct
> >
> >> If so how do you convey in the ABI that not all these priority levels
> >> are equal? Like userspace can submit at prio 4 and expect prio 3 to
> >> preempt, as would prio 2 preempt prio 3. While actual behaviour will not
> >> match - 3 will not preempt 4.
> >
> > It isn't really exposed to userspace, but perhaps it should be..
> > Userspace just knows that, to the extent possible, the kernel will try
> > to execute prio 3 before prio 4.
> >
> >> Also, does your userspace stack (EGL/Vulkan) use the priorities? I had a
> >> quick peek in Mesa but did not spot it - although I am not really at
> >> home there yet so maybe I missed it.
> >
> > Yes, there is an EGL extension:
> >
> > https://www.khronos.org/registry/EGL/extensions/IMG/EGL_IMG_context_priority.txt
> >
> > It is pretty limited, it only exposes three priority levels.
>
> Right, is that wired up on msm? And if it is, or could be, how do/would
> you map the three priority levels for GPUs which expose 3 priority
> levels versus the one which exposes 12?

We don't yet, but probably should, expose a cap to indicate to userspace
the # of hw rings vs # of levels of sched priority.

> Is it doable properly without leaking the drm/sched-internal
> implementation detail of three priority levels? Or if you went the other
> way and only exposed up to max 3 levels, then you lose one priority
> level your hardware supports, which is also not good.
>
> It is all quite interesting because your hardware is completely
> different from ours in this respect. In our case i915 decides when to
> preempt, hardware has no concept of priority (*).

It is really pretty much all in firmware.. a6xx is the first gen that
could do actual (non-cooperative) preemption (but that isn't implemented
yet in the upstream driver).

BR,
-R

> Regards,
>
> Tvrtko
>
> (*) Almost no concept of priority in hardware - we do have it on new
> GPUs and only on a subset of engine classes where render and compute
> share the EUs. But I think it's way different from Adreno.
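
As an illustration of the mapping discussed in the thread, the standalone
sketch below (not the driver code itself) applies the ring_nr/sched_prio
formulas from the msm_gpu_convert_priority() comment to the configuration
mentioned above: 4 rings with 3 scheduler priorities each, i.e. 12
userspace levels. The nr_rings value and the hard-coded
NR_SCHED_PRIORITIES constant are assumptions made only for demonstration.

  #include <stdio.h>

  #define NR_SCHED_PRIORITIES 3  /* assumed: LOW/NORMAL/HIGH, no KERNEL level */

  int main(void)
  {
          unsigned int nr_rings = 4;  /* hypothetical GPU with 4 ringbuffers */
          unsigned int prio;

          for (prio = 0; prio < nr_rings * NR_SCHED_PRIORITIES; prio++) {
                  /* formulas from the msm_gpu_convert_priority() comment */
                  unsigned int ring_nr = prio / NR_SCHED_PRIORITIES;
                  unsigned int sched_prio = NR_SCHED_PRIORITIES -
                                            (prio % NR_SCHED_PRIORITIES) - 1;

                  printf("userspace prio %2u -> ring_nr %u, sched_prio %u\n",
                         prio, ring_nr, sched_prio);
          }

          return 0;
  }

Running it makes the "bucket" behaviour visible: userspace prios 0-2 all
land on ring 0 (differing only in scheduler priority), 3-5 on ring 1, and
so on, which is why prio 3 cannot preempt prio 4 once both have been
written into the same ring.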