Add support for high priority scheduling in amdgpu

andresx7@xxxxxxxxx (Andres Rodriguez) · Wed, 1 Mar 2017 11:37:12 -0500



On 3/1/2017 11:14 AM, Bridgman, John wrote:
> In patch "drm/amdgpu: implement ring set_priority for gfx_v8 compute" can you remind me why you are only passing pipe and not queue to vi_srbm_select() ?
>
> +static void gfx_v8_0_ring_set_priority_compute(struct amdgpu_ring *ring,
> +					       int priority)
> +{
> +	struct amdgpu_device *adev = ring->adev;
> +
> +	if (ring->hw_ip != AMDGPU_HW_IP_COMPUTE)
> +		return;
> +
> +	mutex_lock(&adev->srbm_mutex);
> +	vi_srbm_select(adev, ring->me, ring->pipe, 0, 0);

That's a dumb mistake on my part. I probably got lucky because I was
hitting queue 0 and also rebooting between tests.
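
To illustrate why the posted call only behaved correctly on queue 0, here is a minimal standalone sketch. It is not the actual fix from the series: the mock_* types and helpers are hypothetical stand-ins for the kernel structures, and only the argument order (me, pipe, queue, vmid) is meant to mirror vi_srbm_select(). It assumes the ring carries its own queue index alongside me and pipe.

```c
#include <stdint.h>

/* Hypothetical stand-in for the SRBM bank selection state. */
struct mock_srbm_state {
	uint32_t me, pipe, queue, vmid;
};

/* Hypothetical stand-in for the ring; only the relevant fields. */
struct mock_ring {
	uint32_t me, pipe, queue;
};

/* Mirrors the argument order of vi_srbm_select(adev, me, pipe, queue, vmid). */
static void mock_srbm_select(struct mock_srbm_state *s, uint32_t me,
			     uint32_t pipe, uint32_t queue, uint32_t vmid)
{
	s->me = me;
	s->pipe = pipe;
	s->queue = queue;
	s->vmid = vmid;
}

/* As posted: the queue argument is hard-coded to 0. */
static void select_as_posted(struct mock_srbm_state *s,
			     const struct mock_ring *ring)
{
	mock_srbm_select(s, ring->me, ring->pipe, 0, 0);
}

/* Assumed correction: pass the ring's own queue index. */
static void select_with_queue(struct mock_srbm_state *s,
			      const struct mock_ring *ring)
{
	mock_srbm_select(s, ring->me, ring->pipe, ring->queue, 0);
}

/* Returns 1 when the selected bank is the one belonging to this ring. */
static int targets_ring(const struct mock_srbm_state *s,
			const struct mock_ring *ring)
{
	return s->me == ring->me && s->pipe == ring->pipe &&
	       s->queue == ring->queue;
}
```

With a ring on queue 0 both variants select the right bank, which is consistent with the tests passing by luck. With any other queue, only the second variant does.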

Regards,
Andres

>
>
>> -----Original Message-----
>> From: amd-gfx [mailto:amd-gfx-bounces at lists.freedesktop.org] On Behalf Of
>> Andres Rodriguez
>> Sent: Tuesday, February 28, 2017 5:14 PM
>> To: amd-gfx at lists.freedesktop.org
>> Subject: Add support for high priority scheduling in amdgpu
>>
>> This patch series introduces a mechanism that allows users with sufficient
>> privileges to categorize their work as "high priority". A userspace app can
>> create a high priority amdgpu context, where any work submitted to this
>> context will receive preferential treatment over any other work.
>>
>> High priority contexts will be scheduled ahead of other contexts by the sw gpu
>> scheduler. This functionality is generic for all HW blocks.
>>
>> Optionally, a ring can implement a set_priority() function that allows
>> programming HW specific features to elevate a ring's priority.
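
A rough standalone sketch of the two layers described above, under stated assumptions: every sketch_* name is hypothetical and is not taken from the series or from the amdgpu/scheduler code. It shows a software pick that prefers the highest priority level with pending work, and an optional per-ring set_priority() hook that is skipped when a ring does not provide one.

```c
#include <stddef.h>

enum sketch_priority {
	SKETCH_PRIO_NORMAL = 0,
	SKETCH_PRIO_HIGH = 1,
	SKETCH_PRIO_COUNT
};

struct sketch_ring;

struct sketch_ring_funcs {
	/* Optional hook; NULL when the ring has no HW priority support. */
	void (*set_priority)(struct sketch_ring *ring, int priority);
};

struct sketch_ring {
	const struct sketch_ring_funcs *funcs;
	int hw_priority;                /* last value programmed by the hook */
	int pending[SKETCH_PRIO_COUNT]; /* jobs queued at each priority level */
};

/* Example hook: records the priority instead of touching real registers. */
static void sketch_hw_set_priority(struct sketch_ring *ring, int priority)
{
	ring->hw_priority = priority;
}

/* Software side: highest priority level with pending work, or -1 if idle. */
static int sketch_pick_next(const struct sketch_ring *ring)
{
	for (int p = SKETCH_PRIO_COUNT - 1; p >= 0; p--)
		if (ring->pending[p] > 0)
			return p;
	return -1;
}

/* Hardware side: call the hook only if present. Returns 1 if it was called. */
static int sketch_ring_set_priority(struct sketch_ring *ring, int priority)
{
	if (!ring->funcs || !ring->funcs->set_priority)
		return 0;
	ring->funcs->set_priority(ring, priority);
	return 1;
}
```

Keeping the hook optional means the software ordering still applies to every ring, while only rings that implement the hook get the extra hardware-level boost.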
>>
>> This patch series implements set_priority() for gfx8 compute rings. It takes
>> advantage of SPI scheduling and CU reservation to provide improved frame
>> latencies for high priority contexts.
>>
>> For compute + compute scenarios we get near perfect scheduling latency. E.g.
>> one high priority ComputeParticles + one low priority ComputeParticles:
>>     - High priority ComputeParticles: 2.0-2.6 ms/frame
>>     - Regular ComputeParticles: 35.2-68.5 ms/frame
>>
>> For compute + gfx scenarios the high priority compute application does
>> experience some latency variance. However, the variance has smaller bounds
>> and a smaller deviation than without high priority scheduling.
>>
>> Following is a graph of the frame time experienced by a high priority compute
>> app in 4 different scenarios to exemplify the compute + gfx latency variance:
>>     - ComputeParticles: this scenario involves running the compute particles
>>       sample on its own.
>>     - +SSAO: Previous scenario with the addition of running the ssao sample
>>       application that clogs the GFX ring with constant work.
>>     - +SPI Priority: Previous scenario with the addition of SPI priority
>>       programming for compute rings.
>>     - +CU Reserve: Previous scenario with the addition of dynamic CU
>>       reservation for compute rings.
>>
>> Graph link:
>> https://plot.ly/~lostgoat/9/
>>
>> As seen above, high priority contexts for compute allow us to schedule work
>> with enhanced confidence of completion latency under high GPU loads. This
>> property will be important for VR reprojection workloads.
>>
>> Note: The first part of this series is a resend of "Change queue/pipe split
>> between amdkfd and amdgpu" with the following changes:
>>     - Fixed kfdtest on Kaveri due to shift overflow. Refer to: "drm/amdkfd: allow
>>       split HQD on per-queue granularity v3"
>>     - Used Felix's suggestions for a simplified HQD programming sequence
>>     - Added a workaround for a Tonga HW bug during HQD programming
>>
>> This series is also available at:
>> https://github.com/lostgoat/linux/tree/wip-high-priority