RE: [PATCH] drm/amdgpu: fix KFDMemoryTest.PtraceAccessInvisibleVram fail on SRIOV

[Public]

> -----Original Message-----
> From: Kuehling, Felix <Felix.Kuehling@xxxxxxx>
> Sent: Friday, August 9, 2024 7:49 PM
> To: amd-gfx@xxxxxxxxxxxxxxxxxxxxx; Zhang, GuoQing (Sam)
> <GuoQing.Zhang@xxxxxxx>; Kim, Jonathan <Jonathan.Kim@xxxxxxx>
> Subject: Re: [PATCH] drm/amdgpu: fix
> KFDMemoryTest.PtraceAccessInvisibleVram fail on SRIOV
>
>
> On 2024-08-07 04:36, Samuel Zhang wrote:
> > Ptrace access to a VRAM bo first tries SDMA access in
> > amdgpu_ttm_access_memory_sdma(); if that fails, it falls back to MMIO
> > access.
> >
> > Since ptrace only accesses 8 bytes at a time and
> > amdgpu_ttm_access_memory_sdma() only allows accesses of exactly
> > PAGE_SIZE bytes, the SDMA path returns failure.
> > On SRIOV, MMIO access also fails because MM_INDEX/MM_DATA register
> > writes are blocked for security reasons.
> >
> > The fix is to change the len check in amdgpu_ttm_access_memory_sdma() so
> > that any len in (0, PAGE_SIZE] is allowed. This will not only fix the
> > ptrace test case on SRIOV, but also improve the access performance when
> > the access length is < PAGE_SIZE.
> > Support for len > PAGE_SIZE is not needed, as larger accesses are broken
> > into chunks of at most PAGE_SIZE in mem_rw().
>
> I'm not convinced that using SDMA for small accesses is the best
> solution for all cases. For example, on large-BAR GPUs we should fall
> back to access through the FB BAR before we use indirect register
> access. That may still perform better than SDMA, especially for very
> small accesses like the 4 bytes typical of ptrace accesses. Maybe this
> needs an SRIOV-VF-specific condition if MMIO register access is not an
> option there.
>
> @Jonathan Kim, can you chime in as well?

Relaxing the length check only in SRIOV mode is probably a good idea.
SDMA enqueue latency has been observed to hurt performance for sub-page
copy sizes in the past.
Plus, MMIO keeps working even if SDMA is dead.
I know we have fallbacks anyway in the general case, but it's probably
better not to prod a defunct SDMA if we don't have to.
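
A minimal user-space sketch of that SRIOV-only relaxation, not the actual
driver change. The helper name sdma_access_len_check(), the is_sriov_vf
parameter, and the 4 KiB SKETCH_PAGE_SIZE are assumptions made for
illustration; in the driver the condition would presumably be keyed off
amdgpu_sriov_vf(adev) inside amdgpu_ttm_access_memory_sdma(). Note that the
sketch rejects len == 0 to match the (0, PAGE_SIZE] range in the commit
message, which the posted `len > PAGE_SIZE` check does not do.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u /* assumption: 4 KiB pages */

/*
 * Hypothetical helper, not a function in amdgpu_ttm.c.
 * Returns 0 if the SDMA path may be tried, -EINVAL otherwise.
 */
static int sdma_access_len_check(size_t len, bool is_sriov_vf)
{
	if (is_sriov_vf) {
		/* VF: the MMIO fallback is blocked, so accept (0, PAGE_SIZE]. */
		if (len == 0 || len > SKETCH_PAGE_SIZE)
			return -EINVAL;
		return 0;
	}

	/* Everything else: keep the existing full-page-only rule. */
	return len == SKETCH_PAGE_SIZE ? 0 : -EINVAL;
}
```

With this shape, an 8-byte ptrace access on a VF goes to SDMA, while the same
access on bare metal still gets -EINVAL from the SDMA path and takes the
existing fallbacks.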

Thanks,

Jon

>
> Thanks,
>    Felix
>
>
> >
> > Signed-off-by: Samuel Zhang <guoqing.zhang@xxxxxxx>
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 4 ++--
> >   1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > index 5daa05e23ddf..a6e90eada367 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > @@ -1486,7 +1486,7 @@ static int amdgpu_ttm_access_memory_sdma(struct ttm_buffer_object *bo,
> >     unsigned int num_dw;
> >     int r, idx;
> >
> > -   if (len != PAGE_SIZE)
> > +   if (len > PAGE_SIZE)
> >             return -EINVAL;
> >
> >     if (!adev->mman.sdma_access_ptr)
> > @@ -1514,7 +1514,7 @@ static int amdgpu_ttm_access_memory_sdma(struct ttm_buffer_object *bo,
> >             swap(src_addr, dst_addr);
> >
> >     amdgpu_emit_copy_buffer(adev, &job->ibs[0], src_addr, dst_addr,
> > -                           PAGE_SIZE, 0);
> > +                           len, 0);
> >
> >     amdgpu_ring_pad_ib(adev->mman.buffer_funcs_ring, &job->ibs[0]);
> >     WARN_ON(job->ibs[0].length_dw > num_dw);



