Re: [PATCH 0/4] Ring padding CPU optimisation and some RFC bits

On 08.10.24 at 17:05, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxx>

I've noticed the hardware ring padding optimisations have landed so I decided
to respin the CPU side optimisations.

First two patches are simply adding ring fill helpers which deal with reducing
the CPU cost of emitting hundreds of nops from the for-amdgpu_ring_write loops.
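
To illustrate the fill-helper idea, here is a minimal userspace sketch. It is
not the code from the patches: struct pad_ring, pad_ring_write() and
pad_ring_fill() are hypothetical names, and the ring is assumed to have a
power-of-two size in dwords. The point is only that a fill can be done in at
most two linear runs with a single write pointer update, instead of one masked
store per nop.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy ring, not the real struct amdgpu_ring. */
struct pad_ring {
	uint32_t *buf;     /* backing storage of size_dw dwords */
	uint32_t size_dw;  /* ring size in dwords, must be a power of two */
	uint32_t wptr;     /* write offset in dwords, always < size_dw */
};

/* Baseline: one masked store per dword, as in a per-nop write loop. */
static inline void pad_ring_write(struct pad_ring *ring, uint32_t v)
{
	ring->buf[ring->wptr] = v;
	ring->wptr = (ring->wptr + 1) & (ring->size_dw - 1);
}

/* Fill helper: store count copies of v in at most two linear runs. */
static inline void pad_ring_fill(struct pad_ring *ring, uint32_t v,
				 uint32_t count)
{
	uint32_t first, i;

	assert(count <= ring->size_dw);

	/* Dwords that fit before the end of the buffer. */
	first = ring->size_dw - ring->wptr;
	if (first > count)
		first = count;

	for (i = 0; i < first; i++)
		ring->buf[ring->wptr + i] = v;

	/* The remainder, if any, wraps to the start of the buffer. */
	for (i = 0; i < count - first; i++)
		ring->buf[i] = v;

	/* Single write pointer update for the whole fill. */
	ring->wptr = (ring->wptr + count) & (ring->size_dw - 1);
}
```

Both functions leave the ring in the same state for the same input, so the
fill variant could replace a loop of single writes without changing what ends
up in the buffer.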

If you are receptive to the idea, please double-check that I preserved the
endianness behaviour as is.

I'm pretty sure that this was broken before or at least uses HW features which are not guaranteed to work any more.

Sunil has already committed a set which does mostly the same as this one. The only things missing are the improvements for the IB patching and a bunch of things I've been working on recently.

I'm going to send those out in a minute. It would be cool if you could run some performance analysis on those patches as well, since you already seem to have the setup for that.

Thanks,
Christian.


The last two patches are new and RFC. Both are incomplete conversions to two
new helpers intended to deal with an often repeated pattern of:

-               amdgpu_ring_write(ring, lower_32_bits(addr));
-               amdgpu_ring_write(ring, upper_32_bits(addr));
+               amdgpu_ring_write_addr(ring, addr);
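
For readers without the patch at hand, a helper of this shape could look
roughly like the userspace sketch below. The names addr_ring,
addr_ring_write() and addr_ring_write_addr() are hypothetical stand-ins, and
the real helper in the series may differ; the sketch only shows the low dword
being emitted first, followed by the high dword, as in the two lines being
replaced.

```c
#include <stdint.h>

/* Hypothetical toy ring, not the real struct amdgpu_ring. */
struct addr_ring {
	uint32_t *buf;     /* backing storage of size_dw dwords */
	uint32_t size_dw;  /* ring size in dwords, must be a power of two */
	uint32_t wptr;     /* write offset in dwords, always < size_dw */
};

/* Emit a single dword and advance the write pointer. */
static inline void addr_ring_write(struct addr_ring *ring, uint32_t v)
{
	ring->buf[ring->wptr] = v;
	ring->wptr = (ring->wptr + 1) & (ring->size_dw - 1);
}

/* Emit a 64-bit address as two dwords: low half first, then high half. */
static inline void addr_ring_write_addr(struct addr_ring *ring, uint64_t addr)
{
	addr_ring_write(ring, (uint32_t)addr);          /* lower 32 bits */
	addr_ring_write(ring, (uint32_t)(addr >> 32));  /* upper 32 bits */
}
```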

Last patch is the most uncertain one where there seems to be some magic bit
used only on big endian. It has no name so I couldn't figure out what it was
about.

-       amdgpu_ring_write(ring,
-#ifdef __BIG_ENDIAN
-                               (2 << 0) |
-#endif
-                               lower_32_bits(ib->gpu_addr));
-       amdgpu_ring_write(ring, upper_32_bits(ib->gpu_addr));
+       amdgpu_ring_write_addr_xbe(ring, ib->gpu_addr);
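
What the removed lines compute for the first dword can be modelled in
isolation as below. This is a sketch under assumptions: ib_addr_lo_dw() is a
hypothetical name, the compile-time #ifdef is replaced by a runtime flag so
that both cases can be exercised on one machine, and the address is assumed to
be at least dword aligned so that the low two bits are free. The meaning of
the bit itself is not something this sketch claims to know.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the first dword written for an IB address.
 * big_endian stands in for the __BIG_ENDIAN build-time check.
 */
static inline uint32_t ib_addr_lo_dw(uint64_t gpu_addr, bool big_endian)
{
	uint32_t lo = (uint32_t)gpu_addr;  /* lower 32 bits of the address */

	/* Assumption: low two bits are zero, so OR-ing a flag in is lossless. */
	assert((gpu_addr & 0x3) == 0);

	if (big_endian)
		lo |= 2u << 0;  /* the unnamed bit from the original code */

	return lo;
}
```

The second dword is unaffected by the flag and is simply the upper 32 bits of
the address.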

Anyway, both patterns have a lot of users, so reductions in source code and
binary size aside, the main question is whether these kinds of helpers improve
readability or make it worse.

(Note that the _xbe name in the last patch is just a placeholder.)

Cc: Christian König <ckoenig.leichtzumerken@xxxxxxxxx>
Cc: Sunil Khatri <sunil.khatri@xxxxxxx>

Tvrtko Ursulin (4):
   drm/amdgpu: More efficient ring padding
   drm/amdgpu: More more efficient ring padding
   drm/amdgpu: Add and use amdgpu_ring_write_addr() helper
   drm/amdgpu: Document the magic big endian bit

  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c |  19 ++++-
  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 101 +++++++++++++++++++++++
  drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c  |   6 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_vpe.c  |  25 +++---
  drivers/gpu/drm/amd/amdgpu/cik_sdma.c    |  27 +++---
  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c   |  66 +++++----------
  drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c   |  60 +++++---------
  drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c   |  45 ++++------
  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c    |  63 +++++---------
  drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c  |  48 ++++-------
  drivers/gpu/drm/amd/amdgpu/jpeg_v1_0.c   |   8 +-
  drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.c   |   8 +-
  drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c |   8 +-
  drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c   |  16 ++--
  drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c   |  16 ++--
  drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c   |  16 ++--
  drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c |  16 ++--
  drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c   |  16 ++--
  drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c   |  16 ++--
  drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c   |  16 ++--
  drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c   |  16 ++--
  drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c    |   7 +-
  drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c    |   7 +-
  drivers/gpu/drm/amd/amdgpu/uvd_v5_0.c    |   7 +-
  drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c    |   7 +-
  drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c    |   9 +-
  drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c    |   8 +-
  drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c    |   7 +-
  28 files changed, 319 insertions(+), 345 deletions(-)




