[PATCH 2/7] drm/amdgpu: use scheduler load balancing for SDMA CS

ckoenig.leichtzumerken@xxxxxxxxx (Christian König) · Thu, 2 Aug 2018 12:09:30 +0200

On 02.08.2018 at 07:50, Zhang, Jerry (Junwei) wrote:
> On 08/01/2018 07:31 PM, Christian König wrote:
>> Start to use the scheduler load balancing for userspace SDMA
>> command submissions.
>>
>
> In this case, each SDMA entity could use the rqs of all SDMA instances,
> and the UMD will not specify a ring id.
> If so, we could abstract a set of rings for each type of IP, associated
> with the rqs of that IP's instances.

That's what my follow-up patch set does.

> Accordingly, libdrm would need to be updated as well, but it may be more
> user-friendly, since the ring id no longer matters when submitting commands.

No, libdrm and userspace should and must stay as they are. Userspace 
should not notice that we move jobs to another ring in the kernel.
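
To illustrate the idea, here is a toy sketch in plain C, not the kernel code. All names (toy_rq, toy_entity, toy_entity_push_job) are hypothetical stand-ins, and the "fewest queued jobs wins" policy is an assumption made for the example; the real scheduler's selection criteria may differ. The point is only that the submitter talks to one entity and never learns which run queue the job landed on.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the kernel's drm_sched types. */
struct toy_rq {
	const char *name;	/* e.g. "sdma0" */
	unsigned queued_jobs;	/* jobs currently waiting on this run queue */
};

struct toy_entity {
	struct toy_rq **rq_list;	/* run queues this entity may use */
	unsigned num_rqs;
	struct toy_rq *rq;		/* run queue selected for the last job */
};

/* Attach a list of candidate run queues to the entity.
 * The array is not copied here, so it must outlive the entity. */
static void toy_entity_init(struct toy_entity *e,
			    struct toy_rq **rq_list, unsigned num_rqs)
{
	assert(num_rqs > 0);
	e->rq_list = rq_list;
	e->num_rqs = num_rqs;
	e->rq = rq_list[0];
}

/* Assumed policy: pick the run queue with the fewest queued jobs;
 * on a tie the earlier entry in the list wins. */
static struct toy_rq *toy_entity_select_rq(struct toy_entity *e)
{
	struct toy_rq *best = e->rq_list[0];
	unsigned i;

	for (i = 1; i < e->num_rqs; i++)
		if (e->rq_list[i]->queued_jobs < best->queued_jobs)
			best = e->rq_list[i];

	e->rq = best;
	return best;
}

/* Submit one job through the entity; the caller never names a ring. */
static struct toy_rq *toy_entity_push_job(struct toy_entity *e)
{
	struct toy_rq *rq = toy_entity_select_rq(e);

	rq->queued_jobs++;
	return rq;
}
```

With two toy run queues where the first already holds three jobs and the second is empty, toy_entity_push_job() keeps choosing the second one until both are equally loaded, while the caller keeps using the same entity throughout.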

>
> And will it interfere with ctx->ring's settings, like the sequence?

No, take a look at the patch. There is no ctx->ring any more.

Regards,
Christian.

> E.g. a command is submitted to SDMA0, but SDMA0 is busy and SDMA1 is idle,
> so the command is pushed to the SDMA1 rq, but the sequence is actually
> updated for ctx->ring[SDMA0].
>
>
> Regards,
> Jerry
>
>> Signed-off-by: Christian König <christian.koenig at amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 25 +++++++++++++++++++++----
>>   1 file changed, 21 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>> index df6965761046..59046f68975a 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>> @@ -48,7 +48,8 @@ static int amdgpu_ctx_init(struct amdgpu_device *adev,
>>                  struct drm_file *filp,
>>                  struct amdgpu_ctx *ctx)
>>   {
>> -    unsigned i, j;
>> +    struct drm_sched_rq *sdma_rqs[AMDGPU_MAX_RINGS];
>> +    unsigned i, j, num_sdma_rqs;
>>       int r;
>>
>>       if (priority < 0 || priority >= DRM_SCHED_PRIORITY_MAX)
>> @@ -80,18 +81,34 @@ static int amdgpu_ctx_init(struct amdgpu_device 
>> *adev,
>>       ctx->init_priority = priority;
>>       ctx->override_priority = DRM_SCHED_PRIORITY_UNSET;
>>
>> -    /* create context entity for each ring */
>> +    num_sdma_rqs = 0;
>>       for (i = 0; i < adev->num_rings; i++) {
>>           struct amdgpu_ring *ring = adev->rings[i];
>>           struct drm_sched_rq *rq;
>>
>>           rq = &ring->sched.sched_rq[priority];
>> +        if (ring->funcs->type == AMDGPU_RING_TYPE_SDMA)
>> +            sdma_rqs[num_sdma_rqs++] = rq;
>> +    }
>> +
>> +    /* create context entity for each ring */
>> +    for (i = 0; i < adev->num_rings; i++) {
>> +        struct amdgpu_ring *ring = adev->rings[i];
>>
>>           if (ring == &adev->gfx.kiq.ring)
>>               continue;
>>
>> -        r = drm_sched_entity_init(&ctx->rings[i].entity,
>> -                      &rq, 1, &ctx->guilty);
>> +        if (ring->funcs->type == AMDGPU_RING_TYPE_SDMA) {
>> +            r = drm_sched_entity_init(&ctx->rings[i].entity,
>> +                          sdma_rqs, num_sdma_rqs,
>> +                          &ctx->guilty);
>> +        } else {
>> +            struct drm_sched_rq *rq;
>> +
>> +            rq = &ring->sched.sched_rq[priority];
>> +            r = drm_sched_entity_init(&ctx->rings[i].entity,
>> +                          &rq, 1, &ctx->guilty);
>> +        }
>>           if (r)
>>               goto failed;
>>       }
>>