Re: [RFC 0/5] Add capacity key to fdinfo

On Tue, Apr 30, 2024 at 1:27 PM Tvrtko Ursulin <tursulin@xxxxxxxxxx> wrote:
>
> From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxx>
>
> I have noticed AMD GPUs can have more than one "engine" (ring?) of the same type
> but amdgpu is not reporting that in fdinfo using the capacity engine tag.
>
> This series is therefore an attempt to improve that, but only an RFC since it is
> quite likely I got stuff wrong on the first attempt. Or if not wrong it may not
> be very beneficial in AMDs case.
>
> So I tried to figure out how to count and store the number of instances of an
> "engine" type and spotted that could perhaps be used in more than one place in
> the driver. I was more than a little bit confused by the ip_instance and uapi
> rings, then how rings are selected to context entities internally. Anyway..
> hopefully it is a simple enough series to easily spot any such large misses.
>
> End result should be that, assuming two "engine" instances, one fully loaded and
> one idle will only report client using 50% of that engine type.

That would only be true if there are multiple instantiations of the IP
on the chip, which in most cases is not true.  In most cases there is
one instance of the IP that can be fed from multiple rings.  E.g. for
graphics and compute, all of the rings ultimately feed into the same
compute units on the chip.  So if you have a gfx ring and a compute
ring, you can schedule work to them asynchronously, but ultimately
whether they execute serially or in parallel depends on the actual
shader code in the command buffers and the extent to which it can
utilize the available compute units in the shader cores.
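
For reference, the arithmetic behind the 50% figure can be sketched as a
userspace fdinfo reader might do it: take two samples of the
drm-engine-<keystr> busy time, divide the delta by the elapsed wall time,
and then divide by the drm-engine-capacity-<keystr> value.  The helper
name and its parameters below are hypothetical and not taken from any
existing tool.  The result is only meaningful if the capacity really
counts independent hardware instances, which is the caveat above.

```c
#include <stdint.h>

/*
 * Hypothetical helper: percentage utilisation of one engine type for one
 * client, from two fdinfo samples.
 *
 * busy_ns_prev/busy_ns_now: drm-engine-<keystr> values from the two samples
 * elapsed_ns:               wall-clock time between the two samples
 * capacity:                 drm-engine-capacity-<keystr>, or 0 if the key
 *                           was absent (treated as 1, the documented
 *                           default when the tag is missing)
 */
static double engine_util_percent(uint64_t busy_ns_prev, uint64_t busy_ns_now,
				  uint64_t elapsed_ns, unsigned int capacity)
{
	if (elapsed_ns == 0)
		return 0.0;
	if (capacity == 0)
		capacity = 1;

	return 100.0 * (double)(busy_ns_now - busy_ns_prev) /
	       ((double)elapsed_ns * (double)capacity);
}
```

With capacity 2, a client that kept one ring busy for the whole sampling
period comes out at 50%; with the key absent the same samples come out at
100%.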

As for the UAPI portion of this, we generally expose a limited number
of rings to user space and then we use the GPU scheduler to load
balance between all of the available rings of a type to try and
extract as much parallelism as we can.

Alex


>
> Tvrtko Ursulin (5):
>   drm/amdgpu: Cache number of rings per hw ip type
>   drm/amdgpu: Use cached number of rings from the AMDGPU_INFO_HW_IP_INFO
>     ioctl
>   drm/amdgpu: Skip not present rings in amdgpu_ctx_mgr_usage
>   drm/amdgpu: Show engine capacity in fdinfo
>   drm/amdgpu: Only show VRAM in fdinfo if it exists
>
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c    |  3 ++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 14 +++++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c | 39 +++++++++-----
>  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c    | 62 +++-------------------
>  5 files changed, 49 insertions(+), 70 deletions(-)
>
> --
> 2.44.0
