Re: [PATCH] drm/amdgpu: cache in more vm fault information

Alex Deucher <alexdeucher@xxxxxxxxx> · Wed, 6 Mar 2024 10:49:17 -0500

On Wed, Mar 6, 2024 at 10:32 AM Alex Deucher <alexdeucher@xxxxxxxxx> wrote:
>
> On Wed, Mar 6, 2024 at 10:13 AM Khatri, Sunil <sukhatri@xxxxxxx> wrote:
> >
> >
> > On 3/6/2024 8:34 PM, Christian König wrote:
> > > Am 06.03.24 um 15:29 schrieb Alex Deucher:
> > >> On Wed, Mar 6, 2024 at 8:04 AM Khatri, Sunil <sukhatri@xxxxxxx> wrote:
> > >>>
> > >>> On 3/6/2024 6:12 PM, Christian König wrote:
> > >>>> Am 06.03.24 um 11:40 schrieb Khatri, Sunil:
> > >>>>> On 3/6/2024 3:37 PM, Christian König wrote:
> > >>>>>> Am 06.03.24 um 10:04 schrieb Sunil Khatri:
> > >>>>>>> When a page fault interrupt is raised, there
> > >>>>>>> is a lot more information that is useful for
> > >>>>>>> developers to analyse the page fault.
> > >>>>>> Well, actually that information is not that interesting because
> > >>>>>> it is hw generation specific.
> > >>>>>>
> > >>>>>> You should probably rather use the decoded strings here, e.g. hub,
> > >>>>>> client, xcc_id, node_id etc...
> > >>>>>>
> > >>>>>> See gmc_v9_0_process_interrupt() for an example.
> > >>>>> I saw that v9 does provide more information than v10 and v11,
> > >>>>> like node_id and which die the fault came from, but that is again
> > >>>>> very specific to IP_VERSION(9, 4, 3). I don't know why that
> > >>>>> information is not there in v10 and v11.
> > >>>>> I agree with your point, but as of now during a page fault we are
> > >>>>> dumping this information, which is useful: which client
> > >>>>> generated the interrupt, for which src, and other information
> > >>>>> like the address. So I think we should provide similar information
> > >>>>> in the devcoredump.
> > >>>>>
> > >>>>> Currently we do not have all this information from either the job or
> > >>>>> the vm derived from the job during a reset. We surely could add more
> > >>>>> relevant information later on request, but this information is
> > >>>>> useful, as eventually it is only developers who would use the dump
> > >>>>> file provided by a customer to debug.
> > >>>>>
> > >>>>> Below is the information that I dump in the devcoredump. I feel it
> > >>>>> is good information, and more could be added later.
> > >>>>>
> > >>>>>> Page fault information
> > >>>>>> [gfxhub] page fault (src_id:0 ring:24 vmid:3 pasid:32773)
> > >>>>>> in page starting at address 0x0000000000000000 from client 0x1b
> > >>>>>> (UTCL2)
> > >>>> This is a perfect example of what I mean. What you record in the patch
> > >>>> is the client_id, but this is basically meaningless unless you have
> > >>>> access to the AMD internal hw documentation.
> > >>>>
> > >>>> What you really need is the client in decoded form, in this case
> > >>>> UTCL2. You can keep the client_id additionally, but the decoded client
> > >>>> string is mandatory to have I think.
> > >>>>
> > >>> Sure, I am capturing that information. I am trying to keep memory
> > >>> writes to a minimum because we are still in interrupt context here;
> > >>> that is why I recorded the integer values rather than decoding and
> > >>> writing strings there, and postponed the decoding until we dump.
> > >>>
> > >>> That means decoding gfxhub/mmhub from vmhub/vmid_src and the client
> > >>> string from the client id. So are we good to go with sharing the
> > >>> details above in the devcoredump, using the additional information
> > >>> cached from the page fault?
> > >> I think amdgpu_vm_fault_info() has everything you need already (vmhub,
> > >> status, and addr).  client_id and src_id are just tokens in the
> > >> interrupt cookie so we know which IP to route the interrupt to. We
> > >> know what they will be because otherwise we'd be in the interrupt
> > >> handler for a different IP.  I don't think ring_id has any useful
> > >> information in this context and vmid and pasid are probably not too
> > >> useful because they are just tokens to associate the fault with a
> > >> process.  It would be better to have the process name.
> >
> > Just to share context here, Alex: I am preparing this for devcoredump. My
> > intention was to replicate the information that the KMD already prints
> > in dmesg for page faults. If we do not add the client id in particular,
> > we would not be able to share enough information in the devcoredump.
> > It would be just the address and hub (gfxhub/mmhub), and I think that is
> > only partial information, as src id, client id and IP block carry useful
> > information.
>
> We also need to include the status register value.  That contains the
> important information (type of access, fault type, client, etc.).
> Client_id and src_id are only used to route the interrupt to the right
> software code.  E.g., a different client_id and src_id would be a
> completely different interrupt (e.g., vblank or fence, etc.).  For GPU
> page faults the client_id and src_id will always be the same.
>
> The devcoredump should also include information about the GPU itself
> as well (e.g., PCI DID/VID, maybe some of the relevant IP versions).

Chip family would also be good, and also VRAM size.

If we have a way to identify the chip and we have the vm status
register and vm fault address, we can decode all of the fault
information.
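
As a rough illustration of that point, here is a minimal userspace sketch
of decoding the raw status value at dump time once the chip generation is
known. Everything in it is hypothetical: the chip_gen enum, the
fault_status_layout table, the bit positions and decode_fault() are made
up for the example and are not the real VM_L2_PROTECTION_FAULT_STATUS
layout or any existing amdgpu helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical chip generations; real code would key off IP versions. */
enum chip_gen { CHIP_GEN_A, CHIP_GEN_B, CHIP_GEN_COUNT };

/* Hypothetical per-generation layout of the fault status register. */
struct fault_status_layout {
	uint32_t rw_mask;
	unsigned int rw_shift;
	uint32_t cid_mask;
	unsigned int cid_shift;
};

/* Made-up bit positions, chosen only to show that layouts can differ. */
static const struct fault_status_layout layouts[CHIP_GEN_COUNT] = {
	[CHIP_GEN_A] = { 0x1u << 18, 18, 0x1ffu << 9, 9 },
	[CHIP_GEN_B] = { 0x1u << 24, 24, 0x1ffu << 0, 0 },
};

struct decoded_fault {
	uint64_t addr;       /* fault address, passed through unchanged */
	unsigned int rw;     /* 0 = read, 1 = write (assumed encoding) */
	unsigned int client; /* hw client id extracted from status */
};

/*
 * Decode the cached (status, addr) pair using the layout for the given
 * generation. Returns 0 on success, -1 on an unknown generation or a
 * NULL output pointer.
 */
static int decode_fault(enum chip_gen gen, uint32_t status, uint64_t addr,
			struct decoded_fault *out)
{
	const struct fault_status_layout *l;

	if ((unsigned int)gen >= CHIP_GEN_COUNT || out == NULL)
		return -1;

	l = &layouts[gen];
	out->addr = addr;
	out->rw = (status & l->rw_mask) >> l->rw_shift;
	out->client = (status & l->cid_mask) >> l->cid_shift;
	return 0;
}
```

The idea is only that the interrupt handler stores two integers, and the
generation-specific interpretation happens later, outside irq context.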

Alex

>
> Alex
>
> >
> > For process-related information, we already capture it as part of
> > the dump with existing functionality.
> > **** AMDGPU Device Coredump ****
> > version: 1
> > kernel: 6.7.0-amd-staging-drm-next
> > module: amdgpu
> > time: 45.084775181
> > process_name: soft_recovery_p PID: 1780
> >
> > Ring timed out details
> > IP Type: 0 Ring Name: gfx_0.0.0
> >
> > Page fault information
> > [gfxhub] page fault (src_id:0 ring:24 vmid:3 pasid:32773)
> > in page starting at address 0x0000000000000000 from client 0x1b (UTCL2)
> > VRAM is lost due to GPU reset!
> >
> > Regards
> > Sunil
> >
> > >
> > > The decoded client name would be really useful I think since the fault
> > > handler is a catch-all and handles a whole bunch of different clients.
> > >
> > > But that should be ideally passed in as const string instead of the hw
> > > generation specific client_id.
> > >
> > > As long as it's only a pointer we also don't run into the trouble that
> > > we need to allocate memory for it.
> >
> > I agree, but I prefer adding the client id and decoding it in the
> > devcoredump using soc15_ih_clientid_name[fault_info->client_id]; otherwise
> > we have to sprintf this string into fault_info in irq context, which
> > writes more bytes to memory than an integer, I guess :)
> >
> > We can discuss whether values like pasid, vmid and ring id should be
> > dropped if they are not useful at all.
> >
> > Regards
> > Sunil
> >
> > >
> > > Christian.
> > >
> > >>
> > >> Alex
> > >>
> > >>> regards
> > >>> sunil
> > >>>
> > >>>> Regards,
> > >>>> Christian.
> > >>>>
> > >>>>> Regards
> > >>>>> Sunil Khatri
> > >>>>>
> > >>>>>> Regards,
> > >>>>>> Christian.
> > >>>>>>
> > >>>>>>> Add all such information in the last cached
> > >>>>>>> pagefault from an interrupt handler.
> > >>>>>>>
> > >>>>>>> Signed-off-by: Sunil Khatri <sunil.khatri@xxxxxxx>
> > >>>>>>> ---
> > >>>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 9 +++++++--
> > >>>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 7 ++++++-
> > >>>>>>>    drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 2 +-
> > >>>>>>>    drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c | 2 +-
> > >>>>>>>    drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c  | 2 +-
> > >>>>>>>    drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c  | 2 +-
> > >>>>>>>    drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  | 2 +-
> > >>>>>>>    7 files changed, 18 insertions(+), 8 deletions(-)
> > >>>>>>>
> > >>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> > >>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> > >>>>>>> index 4299ce386322..b77e8e28769d 100644
> > >>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> > >>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> > >>>>>>> @@ -2905,7 +2905,7 @@ void amdgpu_debugfs_vm_bo_info(struct
> > >>>>>>> amdgpu_vm *vm, struct seq_file *m)
> > >>>>>>>     * Cache the fault info for later use by userspace in debugging.
> > >>>>>>>     */
> > >>>>>>>    void amdgpu_vm_update_fault_cache(struct amdgpu_device *adev,
> > >>>>>>> -                  unsigned int pasid,
> > >>>>>>> +                  struct amdgpu_iv_entry *entry,
> > >>>>>>>                      uint64_t addr,
> > >>>>>>>                      uint32_t status,
> > >>>>>>>                      unsigned int vmhub)
> > >>>>>>> @@ -2915,7 +2915,7 @@ void amdgpu_vm_update_fault_cache(struct
> > >>>>>>> amdgpu_device *adev,
> > >>>>>>> xa_lock_irqsave(&adev->vm_manager.pasids, flags);
> > >>>>>>>    -    vm = xa_load(&adev->vm_manager.pasids, pasid);
> > >>>>>>> +    vm = xa_load(&adev->vm_manager.pasids, entry->pasid);
> > >>>>>>>        /* Don't update the fault cache if status is 0.  In the
> > >>>>>>> multiple
> > >>>>>>>         * fault case, subsequent faults will return a 0 status
> > >>>>>>> which is
> > >>>>>>>         * useless for userspace and replaces the useful fault
> > >>>>>>> status, so
> > >>>>>>> @@ -2924,6 +2924,11 @@ void amdgpu_vm_update_fault_cache(struct
> > >>>>>>> amdgpu_device *adev,
> > >>>>>>>        if (vm && status) {
> > >>>>>>>            vm->fault_info.addr = addr;
> > >>>>>>>            vm->fault_info.status = status;
> > >>>>>>> +        vm->fault_info.client_id = entry->client_id;
> > >>>>>>> +        vm->fault_info.src_id = entry->src_id;
> > >>>>>>> +        vm->fault_info.vmid = entry->vmid;
> > >>>>>>> +        vm->fault_info.pasid = entry->pasid;
> > >>>>>>> +        vm->fault_info.ring_id = entry->ring_id;
> > >>>>>>>            if (AMDGPU_IS_GFXHUB(vmhub)) {
> > >>>>>>>                vm->fault_info.vmhub = AMDGPU_VMHUB_TYPE_GFX;
> > >>>>>>>                vm->fault_info.vmhub |=
> > >>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> > >>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> > >>>>>>> index 047ec1930d12..c7782a89bdb5 100644
> > >>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> > >>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> > >>>>>>> @@ -286,6 +286,11 @@ struct amdgpu_vm_fault_info {
> > >>>>>>>        uint32_t    status;
> > >>>>>>>        /* which vmhub? gfxhub, mmhub, etc. */
> > >>>>>>>        unsigned int    vmhub;
> > >>>>>>> +    unsigned int    client_id;
> > >>>>>>> +    unsigned int    src_id;
> > >>>>>>> +    unsigned int    ring_id;
> > >>>>>>> +    unsigned int    pasid;
> > >>>>>>> +    unsigned int    vmid;
> > >>>>>>>    };
> > >>>>>>>      struct amdgpu_vm {
> > >>>>>>> @@ -605,7 +610,7 @@ static inline void
> > >>>>>>> amdgpu_vm_eviction_unlock(struct amdgpu_vm *vm)
> > >>>>>>>    }
> > >>>>>>>      void amdgpu_vm_update_fault_cache(struct amdgpu_device *adev,
> > >>>>>>> -                  unsigned int pasid,
> > >>>>>>> +                  struct amdgpu_iv_entry *entry,
> > >>>>>>>                      uint64_t addr,
> > >>>>>>>                      uint32_t status,
> > >>>>>>>                      unsigned int vmhub);
> > >>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> > >>>>>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> > >>>>>>> index d933e19e0cf5..6b177ce8db0e 100644
> > >>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> > >>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> > >>>>>>> @@ -150,7 +150,7 @@ static int gmc_v10_0_process_interrupt(struct
> > >>>>>>> amdgpu_device *adev,
> > >>>>>>>            status = RREG32(hub->vm_l2_pro_fault_status);
> > >>>>>>>            WREG32_P(hub->vm_l2_pro_fault_cntl, 1, ~1);
> > >>>>>>>    -        amdgpu_vm_update_fault_cache(adev, entry->pasid, addr,
> > >>>>>>> status,
> > >>>>>>> +        amdgpu_vm_update_fault_cache(adev, entry, addr, status,
> > >>>>>>>                             entry->vmid_src ? AMDGPU_MMHUB0(0) :
> > >>>>>>> AMDGPU_GFXHUB(0));
> > >>>>>>>        }
> > >>>>>>>    diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
> > >>>>>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
> > >>>>>>> index 527dc917e049..bcf254856a3e 100644
> > >>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
> > >>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
> > >>>>>>> @@ -121,7 +121,7 @@ static int gmc_v11_0_process_interrupt(struct
> > >>>>>>> amdgpu_device *adev,
> > >>>>>>>            status = RREG32(hub->vm_l2_pro_fault_status);
> > >>>>>>>            WREG32_P(hub->vm_l2_pro_fault_cntl, 1, ~1);
> > >>>>>>>    -        amdgpu_vm_update_fault_cache(adev, entry->pasid, addr,
> > >>>>>>> status,
> > >>>>>>> +        amdgpu_vm_update_fault_cache(adev, entry, addr, status,
> > >>>>>>>                             entry->vmid_src ? AMDGPU_MMHUB0(0) :
> > >>>>>>> AMDGPU_GFXHUB(0));
> > >>>>>>>        }
> > >>>>>>>    diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
> > >>>>>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
> > >>>>>>> index 3da7b6a2b00d..e9517ebbe1fd 100644
> > >>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
> > >>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
> > >>>>>>> @@ -1270,7 +1270,7 @@ static int gmc_v7_0_process_interrupt(struct
> > >>>>>>> amdgpu_device *adev,
> > >>>>>>>        if (!addr && !status)
> > >>>>>>>            return 0;
> > >>>>>>>    -    amdgpu_vm_update_fault_cache(adev, entry->pasid,
> > >>>>>>> +    amdgpu_vm_update_fault_cache(adev, entry,
> > >>>>>>>                         ((u64)addr) << AMDGPU_GPU_PAGE_SHIFT,
> > >>>>>>> status, AMDGPU_GFXHUB(0));
> > >>>>>>>          if (amdgpu_vm_fault_stop == AMDGPU_VM_FAULT_STOP_FIRST)
> > >>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
> > >>>>>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
> > >>>>>>> index d20e5f20ee31..a271bf832312 100644
> > >>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
> > >>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
> > >>>>>>> @@ -1438,7 +1438,7 @@ static int gmc_v8_0_process_interrupt(struct
> > >>>>>>> amdgpu_device *adev,
> > >>>>>>>        if (!addr && !status)
> > >>>>>>>            return 0;
> > >>>>>>>    -    amdgpu_vm_update_fault_cache(adev, entry->pasid,
> > >>>>>>> +    amdgpu_vm_update_fault_cache(adev, entry,
> > >>>>>>>                         ((u64)addr) << AMDGPU_GPU_PAGE_SHIFT,
> > >>>>>>> status, AMDGPU_GFXHUB(0));
> > >>>>>>>          if (amdgpu_vm_fault_stop == AMDGPU_VM_FAULT_STOP_FIRST)
> > >>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> > >>>>>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> > >>>>>>> index 47b63a4ce68b..dc9fb1fb9540 100644
> > >>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> > >>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> > >>>>>>> @@ -666,7 +666,7 @@ static int gmc_v9_0_process_interrupt(struct
> > >>>>>>> amdgpu_device *adev,
> > >>>>>>>        rw = REG_GET_FIELD(status, VM_L2_PROTECTION_FAULT_STATUS,
> > >>>>>>> RW);
> > >>>>>>>        WREG32_P(hub->vm_l2_pro_fault_cntl, 1, ~1);
> > >>>>>>>    -    amdgpu_vm_update_fault_cache(adev, entry->pasid, addr,
> > >>>>>>> status, vmhub);
> > >>>>>>> +    amdgpu_vm_update_fault_cache(adev, entry, addr, status,
> > >>>>>>> vmhub);
> > >>>>>>>          dev_err(adev->dev,
> > >>>>>>>            "VM_L2_PROTECTION_FAULT_STATUS:0x%08X\n",
> > >