[Public]

> -----Original Message-----
> From: Kuehling, Felix <Felix.Kuehling@xxxxxxx>
> Sent: Friday, August 9, 2024 7:49 PM
> To: amd-gfx@xxxxxxxxxxxxxxxxxxxxx; Zhang, GuoQing (Sam)
> <GuoQing.Zhang@xxxxxxx>; Kim, Jonathan <Jonathan.Kim@xxxxxxx>
> Subject: Re: [PATCH] drm/amdgpu: fix
> KFDMemoryTest.PtraceAccessInvisibleVram fail on SRIOV
>
>
> On 2024-08-07 04:36, Samuel Zhang wrote:
> > Ptrace access to a VRAM bo first tries SDMA access in
> > amdgpu_ttm_access_memory_sdma(); if that fails, it falls back to
> > MMIO access.
> >
> > Since ptrace only accesses 8 bytes at a time and
> > amdgpu_ttm_access_memory_sdma() only allows accesses of exactly
> > PAGE_SIZE bytes, the SDMA path returns failure.
> > On SRIOV, MMIO access also fails, as MM_INDEX/MM_DATA register
> > writes are blocked for security reasons.
> >
> > The fix is to change the len check in
> > amdgpu_ttm_access_memory_sdma() so that len in (0, PAGE_SIZE] is
> > allowed. This will not only fix the ptrace test case on SRIOV, but
> > also improve the access performance when the access length is
> > < PAGE_SIZE.
> > Support for len > PAGE_SIZE is not needed, as larger accesses are
> > broken into chunks of at most PAGE_SIZE in mem_rw().
>
> I'm not convinced that using SDMA for small accesses is the best
> solution for all cases. For example, on large-BAR GPUs we should fall
> back to access through the FB BAR before we use indirect register
> access. That may still perform better than SDMA, especially for very
> small accesses like the 4 bytes typical for ptrace accesses. Maybe this
> needs an SRIOV-VF-specific condition if MMIO register access is not an
> option there.
>
> @Jonathan Kim, can you chime in as well?

Relaxing length checks only under SRIOV mode is probably a good idea.
SDMA enqueue latency impacting performance for sub-page copy sizes has
been observed in the past.
Plus MMIO is stable even if SDMA is dead.
I know we have fallbacks anyways in the general case, but it's probably
better not to prod a defunct SDMA if we don't have to.

Thanks,

Jon

>
> Thanks,
>    Felix
>
>
> >
> > Signed-off-by: Samuel Zhang <guoqing.zhang@xxxxxxx>
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 4 ++--
> >   1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > index 5daa05e23ddf..a6e90eada367 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > @@ -1486,7 +1486,7 @@ static int amdgpu_ttm_access_memory_sdma(struct ttm_buffer_object *bo,
> >  	unsigned int num_dw;
> >  	int r, idx;
> >
> > -	if (len != PAGE_SIZE)
> > +	if (len > PAGE_SIZE)
> >  		return -EINVAL;
> >
> >  	if (!adev->mman.sdma_access_ptr)
> > @@ -1514,7 +1514,7 @@ static int amdgpu_ttm_access_memory_sdma(struct ttm_buffer_object *bo,
> >  		swap(src_addr, dst_addr);
> >
> >  	amdgpu_emit_copy_buffer(adev, &job->ibs[0], src_addr, dst_addr,
> > -				PAGE_SIZE, 0);
> > +				len, 0);
> >
> >  	amdgpu_ring_pad_ib(adev->mman.buffer_funcs_ring, &job->ibs[0]);
> >  	WARN_ON(job->ibs[0].length_dw > num_dw);
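
For illustration only, the SRIOV-specific condition suggested in the
replies could look roughly like the standalone sketch below. This is not
driver code: SKETCH_PAGE_SIZE, enum access_path, sdma_len_ok() and
pick_path() are hypothetical names made up for this example, the page
size is assumed to be 4096, and the FB BAR path mentioned for large-BAR
GPUs is left out. It also rejects len == 0, following the (0, PAGE_SIZE]
range stated in the commit message rather than the posted check.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed page size, for illustration only. */
#define SKETCH_PAGE_SIZE 4096u

/* Hypothetical access paths; these names do not exist in amdgpu. */
enum access_path { PATH_SDMA, PATH_MMIO, PATH_NONE };

/*
 * Length rule for the SDMA path: an SRIOV VF accepts any length in
 * (0, PAGE_SIZE], everything else keeps the full-page-only rule.
 */
static bool sdma_len_ok(size_t len, bool is_sriov_vf)
{
	if (len == 0 || len > SKETCH_PAGE_SIZE)
		return false;

	return is_sriov_vf || len == SKETCH_PAGE_SIZE;
}

/*
 * Pick an access path. On a VF the MM_INDEX/MM_DATA writes are blocked,
 * so there is no MMIO fallback if SDMA cannot be used.
 */
static enum access_path pick_path(size_t len, bool is_sriov_vf,
				  bool sdma_usable)
{
	if (sdma_usable && sdma_len_ok(len, is_sriov_vf))
		return PATH_SDMA;

	return is_sriov_vf ? PATH_NONE : PATH_MMIO;
}
```

With this rule an 8-byte ptrace access on a VF would go through SDMA,
while the same access on bare metal would still go to MMIO and never
touch SDMA.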