This change is Reviewed-by: Felix Kuehling <Felix.Kuehling at amd.com>

On 2017-08-22 09:52 PM, Alex Deucher wrote:
> KIQ doesn't really use the GPU scheduler.  The base
> drivers generally use the KIQ ring directly rather than
> submitting IBs.  However, amdgpu_sched_hw_submission
> (which defaults to 2) limits the number of outstanding
> fences to 2.  KFD uses the KIQ for TLB flushes and the
> 2 fence limit hurts performance when there are several KFD
> processes running.
>
> v2: move some expressions to one line
>     change KIQ sched_hw_submission to at least 16
>
> Signed-off-by: Alex Deucher <alexander.deucher at amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 16 ++++++++++++----
>  1 file changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> index 6c5646b..7c251ff 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> @@ -170,6 +170,16 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct amdgpu_ring *ring,
>  		     unsigned irq_type)
>  {
>  	int r;
> +	int sched_hw_submission = amdgpu_sched_hw_submission;
> +
> +	/* Set the hw submission limit higher for KIQ because
> +	 * it's used for a number of gfx/compute tasks by both
> +	 * KFD and KGD which may have outstanding fences and
> +	 * it doesn't really use the gpu scheduler anyway;
> +	 * KIQ tasks get submitted directly to the ring.
> +	 */
> +	if (ring->funcs->type == AMDGPU_RING_TYPE_KIQ)
> +		sched_hw_submission = max(sched_hw_submission, 16);
>
>  	if (ring->adev == NULL) {
>  		if (adev->num_rings >= AMDGPU_MAX_RINGS)
> @@ -178,8 +188,7 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct amdgpu_ring *ring,
>  		ring->adev = adev;
>  		ring->idx = adev->num_rings++;
>  		adev->rings[ring->idx] = ring;
> -		r = amdgpu_fence_driver_init_ring(ring,
> -			amdgpu_sched_hw_submission);
> +		r = amdgpu_fence_driver_init_ring(ring, sched_hw_submission);
>  		if (r)
>  			return r;
>  	}
> @@ -218,8 +227,7 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct amdgpu_ring *ring,
>  		return r;
>  	}
>
> -	ring->ring_size = roundup_pow_of_two(max_dw * 4 *
> -					     amdgpu_sched_hw_submission);
> +	ring->ring_size = roundup_pow_of_two(max_dw * 4 * sched_hw_submission);
>
>  	ring->buf_mask = (ring->ring_size / 4) - 1;
>  	ring->ptr_mask = ring->funcs->support_64bit_ptrs ?
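
For anyone following along, here is a rough userspace sketch of what the per-ring limit does to the ring size. It is not driver code: enum ring_type, effective_hw_submission(), ring_size_bytes() and roundup_pow_of_two_ul() are hypothetical stand-ins I made up for illustration, and the max_dw value used below is an assumed number, not one taken from any real ring. Only the clamp to at least 16 for KIQ and the max_dw * 4 * sched_hw_submission formula come from the patch itself.

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel definitions; not the real amdgpu types. */
enum ring_type { RING_TYPE_GFX, RING_TYPE_COMPUTE, RING_TYPE_KIQ };

/* Userspace replacement for the kernel's roundup_pow_of_two(). */
static unsigned long roundup_pow_of_two_ul(unsigned long n)
{
	unsigned long p = 1;

	assert(n != 0);
	while (p < n)
		p <<= 1;
	return p;
}

/*
 * Mirrors the clamp in the patch: KIQ gets at least 16 submission
 * slots, every other ring keeps the module parameter unchanged.
 */
static int effective_hw_submission(enum ring_type type, int module_param)
{
	if (type == RING_TYPE_KIQ && module_param < 16)
		return 16;
	return module_param;
}

/* Ring size in bytes: max_dw dwords of 4 bytes per submission slot. */
static unsigned long ring_size_bytes(unsigned int max_dw, int hw_submission)
{
	return roundup_pow_of_two_ul((unsigned long)max_dw * 4 * hw_submission);
}
```

Assuming max_dw = 1024 purely as an example, the default limit of 2 gives ring_size_bytes(1024, 2) = 8192 bytes, while a KIQ ring clamped to 16 gives ring_size_bytes(1024, 16) = 65536 bytes. So the cost of the higher fence limit is a larger KIQ ring buffer, and other rings are unaffected.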