On 22.08.2017 at 23:34, Jay Cornwall wrote:
> On Tue, Aug 22, 2017, at 16:17, Felix Kuehling wrote:
>> Thanks Alex!
>>
>> Jay, do you think this is enough? This bumps the number of concurrent
>> operations on KIQ to 4 by default.
> I'm not sure what the best number is. Up to 8 KFD processes is common
> (beyond that performance drops off due to VMID availability) but I'm not
> sure how often they would need to submit to KIQ concurrently. If it's
> not expensive I'd just bump it up to say 16.

Well, we allocate an array of pointers as a ring buffer for the fences.

So I would say let's set this to 256, because
256 * number_of_entries_per_hw_submission * number_of_bytes_for_a_pointer = 4096.

This way we use up exactly one page for the fence array.

Regards,
Christian.

>
> The performance problem isn't that bad since all the KIQ requests are
> serialized but the dmesg spam is not nice. Perhaps lowering the severity
> of the 'rcu slot is busy' message would address that as well?
>
>> Regards,
>> Felix
>>
>>
>> On 2017-08-22 04:49 PM, Alex Deucher wrote:
>>> KIQ doesn't really use the GPU scheduler. The base
>>> drivers generally use the KIQ ring directly rather than
>>> submitting IBs. However, amdgpu_sched_hw_submission
>>> (which defaults to 2) limits the number of outstanding
>>> fences to 2. KFD uses the KIQ for TLB flushes and the
>>> 2 fence limit hurts performance when there are several KFD
>>> processes running.
>>>
>>> Signed-off-by: Alex Deucher <alexander.deucher at amd.com>
>>> ---
>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 14 ++++++++++++--
>>>  1 file changed, 12 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>> index 6c5646b..f39b851 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>> @@ -170,6 +170,16 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct amdgpu_ring *ring,
>>>  		     unsigned irq_type)
>>>  {
>>>  	int r;
>>> +	int sched_hw_submission = amdgpu_sched_hw_submission;
>>> +
>>> +	/* Set the hw submission limit higher for KIQ because
>>> +	 * it's used for a number of gfx/compute tasks by both
>>> +	 * KFD and KGD which may have outstanding fences and
>>> +	 * it doesn't really use the gpu scheduler anyway;
>>> +	 * KIQ tasks get submitted directly to the ring.
>>> +	 */
>>> +	if (ring->funcs->type == AMDGPU_RING_TYPE_KIQ)
>>> +		sched_hw_submission *= 2;
>>>
>>>  	if (ring->adev == NULL) {
>>>  		if (adev->num_rings >= AMDGPU_MAX_RINGS)
>>> @@ -179,7 +189,7 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct amdgpu_ring *ring,
>>>  		ring->idx = adev->num_rings++;
>>>  		adev->rings[ring->idx] = ring;
>>>  		r = amdgpu_fence_driver_init_ring(ring,
>>> -			amdgpu_sched_hw_submission);
>>> +			sched_hw_submission);
>>>  		if (r)
>>>  			return r;
>>>  	}
>>> @@ -219,7 +229,7 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct amdgpu_ring *ring,
>>>  	}
>>>
>>>  	ring->ring_size = roundup_pow_of_two(max_dw * 4 *
>>> -					     amdgpu_sched_hw_submission);
>>> +					     sched_hw_submission);
>>>
>>>  	ring->buf_mask = (ring->ring_size / 4) - 1;
>>>  	ring->ptr_mask = ring->funcs->support_64bit_ptrs ?
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
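
For anyone who wants to check the page-size arithmetic in Christian's reply, here is a minimal sketch in plain userspace C. It is not driver code: fence_array_bytes() and FENCE_SLOTS_PER_SUBMISSION are hypothetical names made up for this illustration, the factor of 2 slots per hardware submission is only inferred from the numbers in the mail (4096 / 256 / 8), and the 8-byte pointer size assumes a 64-bit build.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption inferred from the mail's arithmetic, not read from the driver:
 * the fence array holds two pointer slots per hardware submission. */
#define FENCE_SLOTS_PER_SUBMISSION 2u

/* Hypothetical helper: bytes taken by the fence pointer array for a given
 * hardware submission limit and pointer size. */
static size_t fence_array_bytes(unsigned int num_hw_submission,
				size_t pointer_size)
{
	return (size_t)num_hw_submission * FENCE_SLOTS_PER_SUBMISSION *
	       pointer_size;
}
```

Under these assumptions, fence_array_bytes(256, 8) comes to 4096 bytes, i.e. one 4 KiB page, while the limit of 4 that the patch gives KIQ by default comes to 64 bytes.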