[PATCH 2/2] drm/amd/sched: add schuduling policy

deathsimple@xxxxxxxxxxx (Christian König) · Thu, 16 Mar 2017 17:01:56 +0100

Am 16.03.2017 um 16:31 schrieb Andres Rodriguez:
>
>
> On 2017-03-16 11:13 AM, Andres Rodriguez wrote:
>>
>>
>> On 2017-03-16 05:15 AM, zhoucm1 wrote:
>>>
>>>
>>> On 2017å¹´03æ??16æ?¥ 17:10, Christian KÃ¶nig wrote:
>>>> Am 16.03.2017 um 10:00 schrieb Chunming Zhou:
>>>>> if high priority rq is full, then process with low priority could be
>>>>> starve.
>>>>> Add policy for this problem, the high proiority can ahead of next
>>>>> priority queue,
>>>>> the ratio is 2 : 1.
>>>>>
>>>>> Change-Id: I58f4a6b9cdce8689b18dd8e83dd6e2cf5f99d5fb
>>>>> Signed-off-by: Chunming Zhou <David1.Zhou at amd.com>
>>>>
>>>> Well, the idea behind the high priority queues is to actually starve
>>>> the low priority queues to a certain amount.
>>>>
>>>> At least for the kernel queue that is really desired.
>>> Yes, agree, but we certainly don't want  low priority queue is totally
>>> dead, which doesn't have chance to run if high priority queue is always
>>> full and busy.
>>> If without Andres changes, it doesn't matter. But after Andres changes
>>> upstream, we need scheduling policy.
>>>
>>> Regards,
>>> David Zhou
>>
>> Hey David,
>>
>> The desired behavior is actually starvation. This is why allocating a
>> high priority context is locked behind root privileges at the moment.

Thanks for the quick response, and yes that's what I was expecting as 
requirement for that feature as well.

>>
>> High priority work includes many different features intended for the
>> safety of the user wearing the headset. These include:
>>     - stopping the user from tripping over furniture
>>     - preventing motion sickness
>>
>> We cannot compromise these safety features in order to run non-critical
>> tasks.
>>
>> The current approach also has the disadvantage of heavily interleaving
>> work. This is going to have two undesired side effects. First, there
>> will be a performance overhead from having the queue context switch so
>> often.
>>
>> Second, this approach improves concurrency of work in a ring with mixed
>> priority work, but actually decreases overall system concurrency. When a
>> ring's HW queue has any high priority work committed, the whole ring
>> will be executing at high priority. So interleaving the work will result
>> in extending the time a ring spends in high priority mode. This
>> effectively extends time that a ring might spend with CU's reserved
>> which will have a performance impact on other rings.
>>
>> The best approach here is to make sure we don't map high priority work
>> and regular priority work to the same ring. This is why the LRU policy
>> for userspace ring ids to kernel ring ids was introduced. This policy
>> provides the guarantee as a side effect.
>
> Wanted to correct myself here. The LRU policy doesn't actually provide 
> a guarantee. It approximates it.

What David is perfectly right with is that this has the potential for 
undesired side effects.

But I completely agree that deadline or fair queue scheduling is tricky 
to implement even when it is desired.

So letting a submission dominate all other is perfectly ok for me as 
long as it is limited to a certain set of processes somehow.

Christian.

>
> Regards,
> Andres
>
>> But further work there could be
>> useful to actually check context priorities when doing the ring mapping.
>>
>> By placing work on different rings, we let the hardware actually
>> dispatch work according to wave parameters like VGPR usage, etc. Trying
>> to deduce all this information in the GPU scheduler will get very
>> complicated.
>>
>> This is NAK'd by me.
>>
>> Regards,
>> Andres
>>
>>
>>>>
>>>> Christian.
>>>>
>>>>> ---
>>>>>   drivers/gpu/drm/amd/scheduler/gpu_scheduler.c | 26
>>>>> +++++++++++++++++++++++---
>>>>>   drivers/gpu/drm/amd/scheduler/gpu_scheduler.h |  2 ++
>>>>>   2 files changed, 25 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
>>>>> b/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
>>>>> index 0f439dd..4637b6f 100644
>>>>> --- a/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
>>>>> +++ b/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
>>>>> @@ -35,11 +35,16 @@
>>>>>   static void amd_sched_process_job(struct fence *f, struct fence_cb
>>>>> *cb);
>>>>>     /* Initialize a given run queue struct */
>>>>> -static void amd_sched_rq_init(struct amd_sched_rq *rq)
>>>>> +static void amd_sched_rq_init(struct amd_gpu_scheduler *sched, enum
>>>>> +                  amd_sched_priority pri)
>>>>>   {
>>>>> +    struct amd_sched_rq *rq = &sched->sched_rq[pri];
>>>>> +
>>>>>       spin_lock_init(&rq->lock);
>>>>>       INIT_LIST_HEAD(&rq->entities);
>>>>>       rq->current_entity = NULL;
>>>>> +    rq->wait_base = pri * 2;
>>>>> +    rq->wait = rq->wait_base;
>>>>>   }
>>>>>     static void amd_sched_rq_add_entity(struct amd_sched_rq *rq,
>>>>> @@ -494,17 +499,32 @@ static void amd_sched_wakeup(struct
>>>>> amd_gpu_scheduler *sched)
>>>>>   {
>>>>>       struct amd_sched_entity *entity;
>>>>>       int i;
>>>>> +    bool skip;
>>>>>         if (!amd_sched_ready(sched))
>>>>>           return NULL;
>>>>>   +retry:
>>>>> +    skip = false;
>>>>>       /* Kernel run queue has higher priority than normal run queue*/
>>>>>       for (i = AMD_SCHED_PRIORITY_MAX - 1; i >=
>>>>> AMD_SCHED_PRIORITY_MIN; i--) {
>>>>> +        if ((i > AMD_SCHED_PRIORITY_MIN) &&
>>>>> +            (sched->sched_rq[i - 1].wait >=
>>>>> sched->sched_rq[i].wait_base)) {
>>>>> +            sched->sched_rq[i - 1].wait = sched->sched_rq[i -
>>>>> 1].wait_base;
>>>>> +            skip = true;
>>>>> +            continue;
>>>>> +        }
>>>>>           entity = amd_sched_rq_select_entity(&sched->sched_rq[i]);
>>>>> -        if (entity)
>>>>> +        if (entity) {
>>>>> +            if (i > AMD_SCHED_PRIORITY_MIN)
>>>>> +                sched->sched_rq[i - 1].wait++;
>>>>>               break;
>>>>> +        }
>>>>>       }
>>>>>   +    if (!entity && skip)
>>>>> +        goto retry;
>>>>> +
>>>>>       return entity;
>>>>>   }
>>>>>   @@ -608,7 +628,7 @@ int amd_sched_init(struct amd_gpu_scheduler
>>>>> *sched,
>>>>>       sched->name = name;
>>>>>       sched->timeout = timeout;
>>>>>       for (i = AMD_SCHED_PRIORITY_MIN; i < AMD_SCHED_PRIORITY_MAX; 
>>>>> i++)
>>>>> -        amd_sched_rq_init(&sched->sched_rq[i]);
>>>>> +        amd_sched_rq_init(sched, i);
>>>>> init_waitqueue_head(&sched->wake_up_worker);
>>>>>       init_waitqueue_head(&sched->job_scheduled);
>>>>> diff --git a/drivers/gpu/drm/amd/scheduler/gpu_scheduler.h
>>>>> b/drivers/gpu/drm/amd/scheduler/gpu_scheduler.h
>>>>> index 99f0240..4caed30 100644
>>>>> --- a/drivers/gpu/drm/amd/scheduler/gpu_scheduler.h
>>>>> +++ b/drivers/gpu/drm/amd/scheduler/gpu_scheduler.h
>>>>> @@ -64,6 +64,8 @@ struct amd_sched_rq {
>>>>>       spinlock_t        lock;
>>>>>       struct list_head    entities;
>>>>>       struct amd_sched_entity    *current_entity;
>>>>> +    int wait_base;
>>>>> +    int wait;
>>>>>   };
>>>>>     struct amd_sched_fence {
>>>>
>>>>
>>>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx