Re: [PATCH 0/2] Fit one IB struct amdgpu_job into a 512 byte slab

On 24/02/2025 12:06, Tvrtko Ursulin wrote:
A lot of the workloads create jobs with just one IB and if we re-order some
struct members we can stop that allocation spilling into the 1k SLAB bucket.

Before:

   sizeof(struct amdgpu_job) + sizeof(struct amdgpu_ib) = 480 + 40 = 520

After:

   sizeof(struct amdgpu_job) + sizeof(struct amdgpu_ib) = 472 + 32 = 504

It is not a huge gain in the big picture but every little helps.
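
To illustrate where such holes come from, here is a minimal sketch with made-up structs (these are not the real struct amdgpu_ib or struct amdgpu_job layouts). On typical 64-bit ABIs the compiler pads a 32-bit member so that the following 64-bit member stays naturally aligned; grouping the small members closes the gap. Running pahole on the built object shows the actual holes.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout with holes: where uint64_t needs 8-byte alignment,
 * 4 bytes of padding follow 'a' and 4 more trail 'c'.
 */
struct demo_holey {
	uint32_t a;
	uint64_t b;
	uint32_t c;
};

/* Same members, reordered so the two 32-bit fields share one 8-byte slot. */
struct demo_reordered {
	uint64_t b;
	uint32_t a;
	uint32_t c;
};

/* Reordering can only keep or reduce the size, never grow it here. */
_Static_assert(sizeof(struct demo_reordered) <= sizeof(struct demo_holey),
	       "reordering must not grow the struct");
```

The exact sizes depend on the ABI, so the sketch only asserts that the reordered layout is no larger.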

FWIW it is also quite* possible to make two IB jobs fit into 512 by converting booleans to flags and shrinking some fields:

            /* size: 448, cachelines: 7, members: 24 */
            /* forced alignments: 1 */

So 448 + 2 * 32 = 512!

That avoids spilling _any_ submissions, for example from Cyberpunk 2077, into the 1k SLAB bucket.
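
As a rough sketch of why the 512 boundary matters: assuming the allocator rounds a request up to the next power-of-two bucket in this size range (the helper below is hypothetical and ignores the non-power-of-two 96 and 192 byte caches), the sizes quoted in this thread land as follows.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not a kernel API: smallest power-of-two bucket
 * that can hold 'size' bytes. The 8-byte minimum is an assumption made
 * for illustration only.
 */
static size_t demo_bucket_for(size_t size)
{
	size_t bucket = 8;

	assert(size > 0);
	while (bucket < size)
		bucket <<= 1;

	return bucket;
}
```

With this model, 520 bytes (one IB before the series) lands in the 1024 bucket, 504 bytes (one IB after) lands in 512, and a 448-byte job plus two 32-byte IBs is exactly 512.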

*) I said "quite" because after I converted the booleans to flags (which required a u16 for 9 flags) and shrank vmid and num_ibs to u8 and job_run_counter to u16 (all of which seems completely fine), I still needed just a tiny bit extra. So I shrank gws_size to u16. Being a size in pages, that should also easily be large enough.
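
For reference, the bool-to-flags conversion looks roughly like the sketch below. All names are made up for illustration and do not match the actual amdgpu_job members; the point is only that nine one-byte booleans collapse into a single u16.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Before (hypothetical): one bool per property. */
struct demo_job_bools {
	bool flag[9];
};

/* After (hypothetical): nine properties packed into one 16-bit field. */
struct demo_job_flags {
	uint16_t flags;
};

#define DEMO_JOB_FLAG(n) ((uint16_t)(1u << (n)))

/* Set or clear flag number 'n' (0..15). */
static inline void demo_set_flag(struct demo_job_flags *job, unsigned int n, bool on)
{
	assert(n < 16);
	if (on)
		job->flags |= DEMO_JOB_FLAG(n);
	else
		job->flags &= (uint16_t)~DEMO_JOB_FLAG(n);
}

/* Return true if flag number 'n' (0..15) is set. */
static inline bool demo_test_flag(const struct demo_job_flags *job, unsigned int n)
{
	assert(n < 16);
	return job->flags & DEMO_JOB_FLAG(n);
}
```

The ninth flag is what pushes the field from u8 to u16, which matches the count mentioned above.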

Regards,

Tvrtko

Tvrtko Ursulin (2):
   drm/amdgpu: Remove hole from struct amdgpu_ib
   drm/amdgpu: Reduce holes in struct amdgpu_job

  drivers/gpu/drm/amd/amdgpu/amdgpu_job.h  | 19 ++++++++-----------
  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h |  4 ++--
  2 files changed, 10 insertions(+), 13 deletions(-)




