[PATCH 1/3] drm/amdpgu: Micro-optimise amdgpu_ring_commit

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxx>

For some value of optimisation we can replace the division with an
bitwise and. And it even shrinks the code. Before:

     6c9:       53                      push   %rbx
     6ca:       4c 8b 47 08             mov    0x8(%rdi),%r8
     6ce:       31 d2                   xor    %edx,%edx
     6d0:       48 89 fb                mov    %rdi,%rbx
     6d3:       8b 87 c8 05 00 00       mov    0x5c8(%rdi),%eax
     6d9:       41 8b 48 04             mov    0x4(%r8),%ecx
     6dd:       f7 d0                   not    %eax
     6df:       21 c8                   and    %ecx,%eax
     6e1:       83 c1 01                add    $0x1,%ecx
     6e4:       83 c0 01                add    $0x1,%eax
     6e7:       f7 f1                   div    %ecx
     6e9:       89 d6                   mov    %edx,%esi
     6eb:       41 ff 90 88 00 00 00    call   *0x88(%r8)

After:

     6c9:       53                      push   %rbx
     6ca:       48 8b 57 08             mov    0x8(%rdi),%rdx
     6ce:       48 89 fb                mov    %rdi,%rbx
     6d1:       8b 87 c8 05 00 00       mov    0x5c8(%rdi),%eax
     6d7:       8b 72 04                mov    0x4(%rdx),%esi
     6da:       f7 d0                   not    %eax
     6dc:       21 f0                   and    %esi,%eax
     6de:       83 c0 01                add    $0x1,%eax
     6e1:       21 c6                   and    %eax,%esi
     6e3:       ff 92 88 00 00 00       call   *0x88(%rdx)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxx>
Cc: Christian König <ckoenig.leichtzumerken@xxxxxxxxx>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
index ad49cecb20b8..1a1963318424 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
@@ -144,7 +144,7 @@ void amdgpu_ring_commit(struct amdgpu_ring *ring)
 	/* We pad to match fetch size */
 	count = ring->funcs->align_mask + 1 -
 		(ring->wptr & ring->funcs->align_mask);
-	count %= ring->funcs->align_mask + 1;
+	count &= ring->funcs->align_mask;
 	ring->funcs->insert_nop(ring, count);
 
 	mb();
-- 
2.44.0




[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux