Re: [PATCH 5/8] drm/amdgpu: use the new cursor in amdgpu_ttm_access_memory

Christian König <ckoenig.leichtzumerken@xxxxxxxxx> · Sat, 20 Mar 2021 10:08:04 +0100

Yeah, Nirmoy already stumbled over the correct fix.

The size check is off by one. Patch to fix this should be pushed on Monday.

Regards,
Christian.

Am 19.03.21 um 22:23 schrieb Felix Kuehling:
This is causing a deadlock in amdgpu_ttm_access_memory during the 
PtraceAccess test in kfdtest. Unfortunately it doesn't get flagged by 
LOCKDEP. See the kernel log snippet below. I don't have a good 
explanation what's going on other than maybe some data structure 
corruption.

With this patch reverted the PtraceAccess test still fails, but it 
doesn't hang any more. If I also revert "use new cursor in 
amdgpu_ttm_io_mem_pfn" (which is used via amdgpu_find_mm_node in 
amdgpu_ttm_access_memory), Ptrace access starts working correctly. 
That tells me that there is some fundamental bug in the resource 
cursor implementation that's breaking several users.

Regards,
  Felix


[  129.446085] watchdog: BUG: soft lockup - CPU#8 stuck for 22s! 
[kfdtest:3588]
[  129.455379] Modules linked in: ip6table_filter ip6_tables 
iptable_filter amdgpu x86_pkg_temp_thermal drm_ttm_helper ttm iommu_v2 
gpu_sched ip_tables x_tables
[  129.455428] irq event stamp: 75294000
[  129.455432] hardirqs last  enabled at (75293999): 
[<ffffffffa9dcfd7d>] _raw_spin_unlock_irqrestore+0x2d/0x40
[  129.455447] hardirqs last disabled at (75294000): 
[<ffffffffa9dbef8a>] sysvec_apic_timer_interrupt+0xa/0xa0
[  129.455457] softirqs last  enabled at (75184000): 
[<ffffffffaa000306>] __do_softirq+0x306/0x429
[  129.455467] softirqs last disabled at (75183995): 
[<ffffffffa9e00f8f>] asm_call_irq_on_stack+0xf/0x20
[  129.455477] CPU: 8 PID: 3588 Comm: kfdtest Not tainted 
5.11.0-kfd-fkuehlin #194
[  129.455485] Hardware name: ASUS All Series/X99-E WS/USB 3.1, BIOS 
3201 06/17/2016
[  129.455490] RIP: 0010:_raw_spin_lock_irqsave+0xb/0x50
[  129.455498] Code: d2 31 f6 e8 e7 e9 31 ff 48 89 df 58 5b e9 7d 32 
32 ff 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 
fd 53 9c <5b> fa f6 c7 02 74 05 e8 59 06 3e ff 65 ff 05 92 73 24 56 48 
8d 7d
[  129.455505] RSP: 0018:ffffa3eb407f3c58 EFLAGS: 00000246
[  129.455513] RAX: ffff96466e2010a0 RBX: ffff96466e200000 RCX: 
0000000000000000
[  129.455519] RDX: ffffa3eb407f3e70 RSI: 00000000019ffff0 RDI: 
ffff96466e2010a0
[  129.455524] RBP: ffff96466e2010a0 R08: 0000000000000000 R09: 
0000000000000001
[  129.455528] R10: ffffa3eb407f3c60 R11: ffff96466e2010b8 R12: 
00000000019ffff0
[  129.455533] R13: 00000000019ffff0 R14: ffffa3eb407f3e70 R15: 
00000000019ffff0
[  129.455538] FS:  00007f5aad0f0740(0000) GS:ffff96467fc00000(0000) 
knlGS:0000000000000000
[  129.455544] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  129.455549] CR2: 0000563ea76ad0f0 CR3: 00000007c6e92005 CR4: 
00000000001706e0
[  129.455554] Call Trace:
[  129.455563]  amdgpu_device_vram_access+0xc1/0x200 [amdgpu]
[  129.455820]  ? _raw_spin_unlock_irqrestore+0x2d/0x40
[  129.455834]  amdgpu_ttm_access_memory+0x29e/0x320 [amdgpu]
[  129.456063]  ttm_bo_vm_access+0x1c8/0x3a0 [ttm]
[  129.456089]  __access_remote_vm+0x289/0x390
[  129.456112]  ptrace_access_vm+0x98/0xc0
[  129.456127]  generic_ptrace_peekdata+0x31/0x80
[  129.456138]  ptrace_request+0x13b/0x5d0
[  129.456155]  arch_ptrace+0x24f/0x2f0
[  129.456165]  __x64_sys_ptrace+0xc9/0x140
[  129.456177]  do_syscall_64+0x2d/0x40
[  129.456185]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  129.456194] RIP: 0033:0x7f5aab861a3f
[  129.456199] Code: 48 89 44 24 18 48 8d 44 24 30 c7 44 24 10 18 00 
00 00 8b 70 08 48 8b 50 10 48 89 44 24 20 4c 0f 43 50 18 b8 65 00 00 
00 0f 05 <48> 3d 00 f0 ff ff 77 41 48 85 c0 78 06 41 83 f8 02 76 1e 48 
8b 4c
[  129.456205] RSP: 002b:00007ffd27b68750 EFLAGS: 00000293 ORIG_RAX: 
0000000000000065
[  129.456214] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 
00007f5aab861a3f
[  129.456219] RDX: 00007f5aab3ffff0 RSI: 0000000000000dfa RDI: 
0000000000000002
[  129.456224] RBP: 00007ffd27b68870 R08: 0000000000000001 R09: 
0000000000000000
[  129.456228] R10: 00007ffd27b68758 R11: 0000000000000293 R12: 
0000563ea764e2aa
[  129.456233] R13: 0000000000000000 R14: 0000000000000021 R15: 
0000000000000000

On 2021-03-08 8:40 a.m., Christian König wrote:
Separate the drm_mm_node walking from the actual handling.

Signed-off-by: Christian König <christian.koenig@xxxxxxx>
Acked-by: Oak Zeng <Oak.Zeng@xxxxxxx>
Tested-by: Nirmoy Das <nirmoy.das@xxxxxxx>
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 67 +++++++------------------
  1 file changed, 18 insertions(+), 49 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 517611b709fa..2cbe4ace591f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -178,26 +178,6 @@ static int amdgpu_verify_access(struct 
ttm_buffer_object *bo, struct file *filp)
                        filp->private_data);
  }
  -/**
- * amdgpu_find_mm_node - Helper function finds the drm_mm_node 
corresponding to
- * @offset. It also modifies the offset to be within the drm_mm_node 
returned
- *
- * @mem: The region where the bo resides.
- * @offset: The offset that drm_mm_node is used for finding.
- *
- */
-static struct drm_mm_node *amdgpu_find_mm_node(struct ttm_resource 
*mem,
-                           uint64_t *offset)
-{
-    struct drm_mm_node *mm_node = mem->mm_node;
-
-    while (*offset >= (mm_node->size << PAGE_SHIFT)) {
-        *offset -= (mm_node->size << PAGE_SHIFT);
-        ++mm_node;
-    }
-    return mm_node;
-}
-
  /**
   * amdgpu_ttm_map_buffer - Map memory into the GART windows
   * @bo: buffer object to map
@@ -1478,41 +1458,36 @@ static bool 
amdgpu_ttm_bo_eviction_valuable(struct ttm_buffer_object *bo,
   * access for debugging purposes.
   */
  static int amdgpu_ttm_access_memory(struct ttm_buffer_object *bo,
-                    unsigned long offset,
-                    void *buf, int len, int write)
+                    unsigned long offset, void *buf, int len,
+                    int write)
  {
      struct amdgpu_bo *abo = ttm_to_amdgpu_bo(bo);
      struct amdgpu_device *adev = amdgpu_ttm_adev(abo->tbo.bdev);
-    struct drm_mm_node *nodes;
+    struct amdgpu_res_cursor cursor;
+    unsigned long flags;
      uint32_t value = 0;
      int ret = 0;
-    uint64_t pos;
-    unsigned long flags;
        if (bo->mem.mem_type != TTM_PL_VRAM)
          return -EIO;
  -    pos = offset;
-    nodes = amdgpu_find_mm_node(&abo->tbo.mem, &pos);
-    pos += (nodes->start << PAGE_SHIFT);
-
-    while (len && pos < adev->gmc.mc_vram_size) {
-        uint64_t aligned_pos = pos & ~(uint64_t)3;
-        uint64_t bytes = 4 - (pos & 3);
-        uint32_t shift = (pos & 3) * 8;
+    amdgpu_res_first(&bo->mem, offset, len, &cursor);
+    while (cursor.remaining) {
+        uint64_t aligned_pos = cursor.start & ~(uint64_t)3;
+        uint64_t bytes = 4 - (cursor.start & 3);
+        uint32_t shift = (cursor.start & 3) * 8;
          uint32_t mask = 0xffffffff << shift;
  -        if (len < bytes) {
-            mask &= 0xffffffff >> (bytes - len) * 8;
-            bytes = len;
+        if (cursor.size < bytes) {
+            mask &= 0xffffffff >> (bytes - cursor.size) * 8;
+            bytes = cursor.size;
          }
            if (mask != 0xffffffff) {
              spin_lock_irqsave(&adev->mmio_idx_lock, flags);
              WREG32_NO_KIQ(mmMM_INDEX, ((uint32_t)aligned_pos) | 
0x80000000);
              WREG32_NO_KIQ(mmMM_INDEX_HI, aligned_pos >> 31);
-            if (!write || mask != 0xffffffff)
-                value = RREG32_NO_KIQ(mmMM_DATA);
+            value = RREG32_NO_KIQ(mmMM_DATA);
              if (write) {
                  value &= ~mask;
                  value |= (*(uint32_t *)buf << shift) & mask;
@@ -1524,21 +1499,15 @@ static int amdgpu_ttm_access_memory(struct 
ttm_buffer_object *bo,
                  memcpy(buf, &value, bytes);
              }
          } else {
-            bytes = (nodes->start + nodes->size) << PAGE_SHIFT;
-            bytes = min(bytes - pos, (uint64_t)len & ~0x3ull);
-
-            amdgpu_device_vram_access(adev, pos, (uint32_t *)buf,
-                          bytes, write);
+            bytes = cursor.size & 0x3ull;
+            amdgpu_device_vram_access(adev, cursor.start,
+                          (uint32_t *)buf, bytes,
+                          write);
          }
            ret += bytes;
          buf = (uint8_t *)buf + bytes;
-        pos += bytes;
-        len -= bytes;
-        if (pos >= (nodes->start + nodes->size) << PAGE_SHIFT) {
-            ++nodes;
-            pos = (nodes->start << PAGE_SHIFT);
-        }
+        amdgpu_res_next(&cursor, bytes);
      }
        return ret;

_______________________________________________
amd-gfx mailing list
amd-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/amd-gfx