On 2017-03-16 11:13 AM, Andres Rodriguez wrote:
>
>
> On 2017-03-16 05:15 AM, zhoucm1 wrote:
>>
>>
>> On 2017-03-16 17:10, Christian König wrote:
>>> On 16.03.2017 at 10:00, Chunming Zhou wrote:
>>>> If the high priority run queue is full, processes with low priority
>>>> could be starved. Add a policy for this problem: the high priority
>>>> queue can run ahead of the next priority queue at a ratio of 2:1.
>>>>
>>>> Change-Id: I58f4a6b9cdce8689b18dd8e83dd6e2cf5f99d5fb
>>>> Signed-off-by: Chunming Zhou <David1.Zhou at amd.com>
>>>
>>> Well, the idea behind the high priority queues is to actually starve
>>> the low priority queues to a certain extent.
>>>
>>> At least for the kernel queue that is really desired.
>> Yes, agreed, but we certainly don't want the low priority queue to be
>> totally dead, with no chance to run while the high priority queue is
>> always full and busy.
>> Without Andres' changes this wouldn't matter, but now that his changes
>> are upstream we need a scheduling policy.
>>
>> Regards,
>> David Zhou
>
> Hey David,
>
> The desired behavior is actually starvation. This is why allocating a
> high priority context is locked behind root privileges at the moment.
>
> High priority work includes many different features intended for the
> safety of the user wearing the headset. These include:
> - stopping the user from tripping over furniture
> - preventing motion sickness
>
> We cannot compromise these safety features in order to run non-critical
> tasks.
>
> The current approach also has the disadvantage of heavily interleaving
> work. This is going to have two undesired side effects. First, there
> will be a performance overhead from having the queue context switch so
> often.
>
> Second, this approach improves concurrency of work in a ring with mixed
> priority work, but actually decreases overall system concurrency. When a
> ring's HW queue has any high priority work committed, the whole ring
> will be executing at high priority. So interleaving the work will
> extend the time a ring spends in high priority mode. This effectively
> extends the time a ring might spend with CUs reserved, which will have
> a performance impact on other rings.
>
> The best approach here is to make sure we don't map high priority work
> and regular priority work to the same ring. This is why the LRU policy
> for mapping userspace ring ids to kernel ring ids was introduced. This
> policy provides the guarantee as a side effect.

Wanted to correct myself here. The LRU policy doesn't actually provide a
guarantee. It approximates one.

Regards,
Andres

> But further work there could be useful to actually check context
> priorities when doing the ring mapping.
>
> By placing work on different rings, we let the hardware actually
> dispatch work according to wave parameters like VGPR usage, etc. Trying
> to deduce all this information in the GPU scheduler will get very
> complicated.
>
> This is NAK'd by me.
>
> Regards,
> Andres
>
>
>>>
>>> Christian.
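The policy in the patch quoted below boils down to one wait counter per
run queue. Here is a minimal standalone simulation of that selection
loop (plain C; struct rq, select_rq and the jobs field are simplified
stand-ins, not the driver's types):

	/*
	 * Standalone simulation of the 2:1 wait-counter policy below.
	 * All names here are illustrative stand-ins, not the driver's API.
	 */
	#include <stdbool.h>
	#include <stdio.h>

	#define PRIO_MIN 0
	#define PRIO_MAX 3	/* normal, high, kernel */

	struct rq {
		int wait_base;	/* skip threshold for the queue above: pri * 2 */
		int wait;	/* how often the queue above was picked over us */
		int jobs;	/* pending jobs, stand-in for the entity list */
	};

	/* Walk queues from highest to lowest priority, but skip a queue
	 * once its lower neighbour has waited past the skip threshold. */
	static int select_rq(struct rq *rq)
	{
		bool skip;
		int i;

	retry:
		skip = false;
		for (i = PRIO_MAX - 1; i >= PRIO_MIN; i--) {
			if (i > PRIO_MIN && rq[i - 1].wait >= rq[i].wait_base) {
				rq[i - 1].wait = rq[i - 1].wait_base;
				skip = true;
				continue;
			}
			if (rq[i].jobs > 0) {
				if (i > PRIO_MIN)
					rq[i - 1].wait++;
				rq[i].jobs--;
				return i;
			}
		}
		if (skip)
			goto retry;	/* nothing runnable below the skipped queues */
		return -1;
	}

	int main(void)
	{
		struct rq rq[PRIO_MAX];
		int i;

		for (i = PRIO_MIN; i < PRIO_MAX; i++) {
			rq[i].wait_base = i * 2;
			rq[i].wait = rq[i].wait_base;
			rq[i].jobs = 10;	/* every queue saturated */
		}
		for (i = 0; i < 12; i++)
			printf("picked rq %d\n", select_rq(rq));
		return 0;
	}

With every queue saturated, this picks queues in a pattern of roughly
2, 2, 1, 2, 2, 1, ..., which is where the 2:1 ratio in the commit
message comes from.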
>>>
>>>> ---
>>>>   drivers/gpu/drm/amd/scheduler/gpu_scheduler.c | 26 +++++++++++++++++++++++---
>>>>   drivers/gpu/drm/amd/scheduler/gpu_scheduler.h |  2 ++
>>>>   2 files changed, 25 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c b/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
>>>> index 0f439dd..4637b6f 100644
>>>> --- a/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
>>>> +++ b/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
>>>> @@ -35,11 +35,16 @@
>>>>   static void amd_sched_process_job(struct fence *f, struct fence_cb *cb);
>>>>
>>>>   /* Initialize a given run queue struct */
>>>> -static void amd_sched_rq_init(struct amd_sched_rq *rq)
>>>> +static void amd_sched_rq_init(struct amd_gpu_scheduler *sched, enum
>>>> +			      amd_sched_priority pri)
>>>>   {
>>>> +	struct amd_sched_rq *rq = &sched->sched_rq[pri];
>>>> +
>>>>   	spin_lock_init(&rq->lock);
>>>>   	INIT_LIST_HEAD(&rq->entities);
>>>>   	rq->current_entity = NULL;
>>>> +	rq->wait_base = pri * 2;
>>>> +	rq->wait = rq->wait_base;
>>>>   }
>>>>
>>>>   static void amd_sched_rq_add_entity(struct amd_sched_rq *rq,
>>>> @@ -494,17 +499,32 @@ static void amd_sched_wakeup(struct amd_gpu_scheduler *sched)
>>>>   {
>>>>   	struct amd_sched_entity *entity;
>>>>   	int i;
>>>> +	bool skip;
>>>>
>>>>   	if (!amd_sched_ready(sched))
>>>>   		return NULL;
>>>>
>>>> +retry:
>>>> +	skip = false;
>>>>   	/* Kernel run queue has higher priority than normal run queue*/
>>>>   	for (i = AMD_SCHED_PRIORITY_MAX - 1; i >= AMD_SCHED_PRIORITY_MIN; i--) {
>>>> +		if ((i > AMD_SCHED_PRIORITY_MIN) &&
>>>> +		    (sched->sched_rq[i - 1].wait >= sched->sched_rq[i].wait_base)) {
>>>> +			sched->sched_rq[i - 1].wait = sched->sched_rq[i - 1].wait_base;
>>>> +			skip = true;
>>>> +			continue;
>>>> +		}
>>>>   		entity = amd_sched_rq_select_entity(&sched->sched_rq[i]);
>>>> -		if (entity)
>>>> +		if (entity) {
>>>> +			if (i > AMD_SCHED_PRIORITY_MIN)
>>>> +				sched->sched_rq[i - 1].wait++;
>>>>   			break;
>>>> +		}
>>>>   	}
>>>>
>>>> +	if (!entity && skip)
>>>> +		goto retry;
>>>> +
>>>>   	return entity;
>>>>   }
>>>>
>>>> @@ -608,7 +628,7 @@ int amd_sched_init(struct amd_gpu_scheduler *sched,
>>>>   	sched->name = name;
>>>>   	sched->timeout = timeout;
>>>>   	for (i = AMD_SCHED_PRIORITY_MIN; i < AMD_SCHED_PRIORITY_MAX; i++)
>>>> -		amd_sched_rq_init(&sched->sched_rq[i]);
>>>> +		amd_sched_rq_init(sched, i);
>>>>
>>>>   	init_waitqueue_head(&sched->wake_up_worker);
>>>>   	init_waitqueue_head(&sched->job_scheduled);
>>>> diff --git a/drivers/gpu/drm/amd/scheduler/gpu_scheduler.h b/drivers/gpu/drm/amd/scheduler/gpu_scheduler.h
>>>> index 99f0240..4caed30 100644
>>>> --- a/drivers/gpu/drm/amd/scheduler/gpu_scheduler.h
>>>> +++ b/drivers/gpu/drm/amd/scheduler/gpu_scheduler.h
>>>> @@ -64,6 +64,8 @@ struct amd_sched_rq {
>>>>   	spinlock_t lock;
>>>>   	struct list_head entities;
>>>>   	struct amd_sched_entity *current_entity;
>>>> +	int wait_base;
>>>> +	int wait;
>>>>   };
>>>>
>>>>   struct amd_sched_fence {
>>>
>>>
>>
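For comparison, a rough sketch of an LRU ring mapping like the one
Andres describes above (ring_lru_init, ring_lru_get and the fixed order
array are hypothetical simplifications, not the amdgpu implementation):

	/*
	 * Rough sketch of an LRU mapping from userspace ring ids to kernel
	 * rings. Names and structure are hypothetical simplifications.
	 */
	#define NUM_RINGS 4

	struct ring_lru {
		int order[NUM_RINGS];	/* kernel ring ids, LRU first */
	};

	static void ring_lru_init(struct ring_lru *lru)
	{
		int i;

		for (i = 0; i < NUM_RINGS; i++)
			lru->order[i] = i;
	}

	/* Hand out the least recently used ring and mark it most recently
	 * used, so consecutive new mappings tend to land on different
	 * rings. */
	static int ring_lru_get(struct ring_lru *lru)
	{
		int ring = lru->order[0];
		int i;

		for (i = 0; i < NUM_RINGS - 1; i++)
			lru->order[i] = lru->order[i + 1];
		lru->order[NUM_RINGS - 1] = ring;
		return ring;
	}

This also shows why the policy only approximates the separation Andres
wants: once more contexts are active than there are rings, the least
recently used ring may already carry work of a different priority.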