RE: [PATCH] drm/amdgpu: fix AGP addressing when GART is not at 0

[AMD Official Use Only - General]

Yes, it fixes the KFDTest regressions introduced by commit b93ed51c32ca ("drm/amdgpu: fix AGP init order"),

e.g. the KFDMemoryTest.MemoryRegister failure:

The fault address 0x00008084575a6000 was wrongly calculated as (gart_start + AGP aperture MC address).

[   46.662856] amdgpu 0000:c2:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:169 vmid:0 pasid:0, for process  pid 0 thread  pid 0)
[   46.662890] amdgpu 0000:c2:00.0: amdgpu:   in page starting at address 0x00008084575a6000 from client 10
[   46.662909] amdgpu 0000:c2:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00040B52
[   46.662923] amdgpu 0000:c2:00.0: amdgpu:      Faulty UTCL2 client ID: CPC (0x5)
[   46.662936] amdgpu 0000:c2:00.0: amdgpu:      MORE_FAULTS: 0x0
[   46.662947] amdgpu 0000:c2:00.0: amdgpu:      WALKER_ERROR: 0x1
[   46.662957] amdgpu 0000:c2:00.0: amdgpu:      PERMISSION_FAULTS: 0x5
[   46.662968] amdgpu 0000:c2:00.0: amdgpu:      MAPPING_ERROR: 0x1
[   46.662979] amdgpu 0000:c2:00.0: amdgpu:      RW: 0x1


-----Original Message-----
From: Alex Deucher <alexdeucher@xxxxxxxxx>
Sent: Thursday, November 16, 2023 10:26 PM
To: Zhang, Yifan <Yifan1.Zhang@xxxxxxx>
Cc: Koenig, Christian <Christian.Koenig@xxxxxxx>; Deucher, Alexander <Alexander.Deucher@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx; Zhang, Jesse(Jie) <Jesse.Zhang@xxxxxxx>
Subject: Re: [PATCH] drm/amdgpu: fix AGP addressing when GART is not at 0

On Thu, Nov 16, 2023 at 4:37 AM Zhang, Yifan <Yifan1.Zhang@xxxxxxx> wrote:
>
> [AMD Official Use Only - General]
>
> Ping... this patch still does not seem to be merged.
>

Can you confirm it fixes the AGP issues you saw?

Alex

> Best Regards,
> Yifan
>
> -----Original Message-----
> From: Alex Deucher <alexdeucher@xxxxxxxxx>
> Sent: Monday, November 13, 2023 2:13 AM
> To: Koenig, Christian <Christian.Koenig@xxxxxxx>
> Cc: Deucher, Alexander <Alexander.Deucher@xxxxxxx>;
> amd-gfx@xxxxxxxxxxxxxxxxxxxxx; Zhang, Yifan <Yifan1.Zhang@xxxxxxx>;
> Zhang, Jesse(Jie) <Jesse.Zhang@xxxxxxx>
> Subject: Re: [PATCH] drm/amdgpu: fix AGP addressing when GART is not
> at 0
>
> On Sat, Nov 11, 2023 at 2:17 AM Christian König <christian.koenig@xxxxxxx> wrote:
> >
> > Am 10.11.23 um 15:47 schrieb Alex Deucher:
> > > This worked by luck if the GART aperture ended up at 0.  When we
> > > ended up moving GART on some chips, the GART aperture ended up
> > > > offsetting the AGP address since the resource->start is a GART
> > > offset, not an MC address.  Fix this by moving the AGP address
> > > setup into amdgpu_bo_gpu_offset_no_check().
> > >
> > > Reported-by: Jesse Zhang <Jesse.Zhang@xxxxxxx>
> > > Reported-by: Yifan Zhang <yifan1.zhang@xxxxxxx>
> > > Signed-off-by: Alex Deucher <alexander.deucher@xxxxxxx>
> > > Cc: christian.koenig@xxxxxxx
> > > ---
> > >   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 10 +++++++---
> > >   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c    |  4 +---
> > >   2 files changed, 8 insertions(+), 6 deletions(-)
> > >
> > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > > index cef920a93924..1b3e97522838 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > > @@ -1527,10 +1527,14 @@ u64 amdgpu_bo_gpu_offset(struct amdgpu_bo *bo)
> > >   u64 amdgpu_bo_gpu_offset_no_check(struct amdgpu_bo *bo)
> > >   {
> > >       struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
> > > -     uint64_t offset;
> > > +     uint64_t offset, addr;
> > >
> > > -     offset = (bo->tbo.resource->start << PAGE_SHIFT) +
> > > -              amdgpu_ttm_domain_start(adev, bo->tbo.resource->mem_type);
> > > +     addr = amdgpu_gmc_agp_addr(&bo->tbo);
> >
> > IIRC you must check bo->tbo.resource->mem_type before calling
> > amdgpu_gmc_agp_addr().
>
> Yes, this was fixed in v2.
>
> Alex
>
> >
> > Regards,
> > Christian.
> >
> > > +     if (addr != AMDGPU_BO_INVALID_OFFSET)
> > > +             offset = addr;
> > > +     else
> > > +             offset = (bo->tbo.resource->start << PAGE_SHIFT) +
> > > > +                     amdgpu_ttm_domain_start(adev, bo->tbo.resource->mem_type);
> > >
> > >       return amdgpu_gmc_sign_extend(offset);
> > >   }
> > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > > index 05991c5c8ddb..ab4a762aed5b 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > > @@ -959,10 +959,8 @@ int amdgpu_ttm_alloc_gart(struct ttm_buffer_object *bo)
> > >               return 0;
> > >
> > >       addr = amdgpu_gmc_agp_addr(bo);
> > > -     if (addr != AMDGPU_BO_INVALID_OFFSET) {
> > > -             bo->resource->start = addr >> PAGE_SHIFT;
> > > +     if (addr != AMDGPU_BO_INVALID_OFFSET)
> > >               return 0;
> > > -     }
> > >
> > >       /* allocate GART space */
> > >       placement.num_placement = 1;
> >
