[PATCH] drm/amdgpu: set sched_hw_submission higher for KIQ

On 22.08.2017 at 23:34, Jay Cornwall wrote:
> On Tue, Aug 22, 2017, at 16:17, Felix Kuehling wrote:
>> Thanks Alex!
>>
>> Jay, do you think this is enough? This bumps the number of concurrent
>> operations on KIQ to 4 by default.
> I'm not sure what the best number is. Up to 8 KFD processes is common
> (beyond that performance drops off due to VMID availability) but I'm not
> sure how often they would need to submit to KIQ concurrently. If it's
> not expensive I'd just bump it up to say 16.

Well, we allocate an array of pointers as a ring buffer for the fences.

So I would say let's set this to 256, because
256 * number_of_entries_per_hw_submission * number_of_bytes_for_a_pointer = 4096.

This way we use up exactly one page for the fence array.
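
To make the arithmetic concrete, here is a minimal sketch of the calculation. It is not the driver code: the helper names are made up for illustration, and it assumes two fence slots per hardware submission, 8-byte pointers (64-bit build) and a 4096-byte page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: the fence array holds two slots per hardware submission. */
#define FENCE_SLOTS_PER_HW_SUBMISSION 2u
/* Assumption: 4 KiB pages. */
#define ASSUMED_PAGE_SIZE 4096u

/* Hypothetical helper: bytes taken by the array of fence pointers. */
static size_t fence_array_bytes(unsigned int num_hw_submission)
{
	assert(num_hw_submission > 0);
	return (size_t)num_hw_submission * FENCE_SLOTS_PER_HW_SUBMISSION *
	       sizeof(void *);
}

/* Hypothetical helper: true if the fence array fits in a single page. */
static bool fence_array_fits_one_page(unsigned int num_hw_submission)
{
	return fence_array_bytes(num_hw_submission) <= ASSUMED_PAGE_SIZE;
}
```

Under those assumptions, fence_array_bytes(256) comes to 4096 bytes, while the suggested value of 16 would only need 256 bytes.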

Regards,
Christian.

>
> The performance problem isn't that bad since all the KIQ requests are
> serialized but the dmesg spam is not nice. Perhaps lowering the severity
> of the 'rcu slot is busy' message would address that as well?
>
>> Regards,
>>    Felix
>>
>>
>> On 2017-08-22 04:49 PM, Alex Deucher wrote:
>>> KIQ doesn't really use the GPU scheduler.  The base
>>> drivers generally use the KIQ ring directly rather than
>>> submitting IBs.  However, amdgpu_sched_hw_submission
>>> (which defaults to 2) limits the number of outstanding
>>> fences to 2.  KFD uses the KIQ for TLB flushes and the
>>> 2 fence limit hurts performance when there are several KFD
>>> processes running.
>>>
>>> Signed-off-by: Alex Deucher <alexander.deucher at amd.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 14 ++++++++++++--
>>>   1 file changed, 12 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>> index 6c5646b..f39b851 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>> @@ -170,6 +170,16 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct amdgpu_ring *ring,
>>>   		     unsigned irq_type)
>>>   {
>>>   	int r;
>>> +	int sched_hw_submission = amdgpu_sched_hw_submission;
>>> +
>>> +	/* Set the hw submission limit higher for KIQ because
>>> +	 * it's used for a number of gfx/compute tasks by both
>>> +	 * KFD and KGD which may have outstanding fences and
>>> +	 * it doesn't really use the gpu scheduler anyway;
>>> +	 * KIQ tasks get submitted directly to the ring.
>>> +	 */
>>> +	if (ring->funcs->type == AMDGPU_RING_TYPE_KIQ)
>>> +		sched_hw_submission *= 2;
>>>   
>>>   	if (ring->adev == NULL) {
>>>   		if (adev->num_rings >= AMDGPU_MAX_RINGS)
>>> @@ -179,7 +189,7 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct amdgpu_ring *ring,
>>>   		ring->idx = adev->num_rings++;
>>>   		adev->rings[ring->idx] = ring;
>>>   		r = amdgpu_fence_driver_init_ring(ring,
>>> -			amdgpu_sched_hw_submission);
>>> +						  sched_hw_submission);
>>>   		if (r)
>>>   			return r;
>>>   	}
>>> @@ -219,7 +229,7 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct amdgpu_ring *ring,
>>>   	}
>>>   
>>>   	ring->ring_size = roundup_pow_of_two(max_dw * 4 *
>>> -					     amdgpu_sched_hw_submission);
>>> +					     sched_hw_submission);
>>>   
>>>   	ring->buf_mask = (ring->ring_size / 4) - 1;
>>>   	ring->ptr_mask = ring->funcs->support_64bit_ptrs ?
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx



