On Thu, Mar 7, 2024 at 12:00 PM Sunil Khatri <sunil.khatri@xxxxxxx> wrote:
>
> Add page fault information to the devcoredump.
>
> Output of devcoredump:
> **** AMDGPU Device Coredump ****
> version: 1
> kernel: 6.7.0-amd-staging-drm-next
> module: amdgpu
> time: 29.725011811
> process_name: soft_recovery_p PID: 1720
>
> Ring timed out details
> IP Type: 0 Ring Name: gfx_0.0.0
>
> [gfxhub] Page fault observed
> Faulty page starting at address 0x0000000000000000

Do you want a : before the address for consistency?

> Protection fault status register:0x301031

How about a space after the : for consistency?

For parsability, it may make more sense to just have a list of key value pairs:

[GPU page fault]
hub:
addr:
status:
[Ring timeout details]
IP:
ring:
name:

etc.

>
> VRAM is lost due to GPU reset!
>
> Signed-off-by: Sunil Khatri <sunil.khatri@xxxxxxx>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c | 14 +++++++++++++-
>  1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c
> index 147100c27c2d..dd39e614d907 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c
> @@ -203,8 +203,20 @@ amdgpu_devcoredump_read(char *buffer, loff_t offset, size_t count,
>                            coredump->ring->name);
>         }
>
> +       if (coredump->adev) {
> +               struct amdgpu_vm_fault_info *fault_info =
> +                       &coredump->adev->vm_manager.fault_info;
> +
> +               drm_printf(&p, "\n[%s] Page fault observed\n",
> +                          fault_info->vmhub ? "mmhub" : "gfxhub");
> +               drm_printf(&p, "Faulty page starting at address 0x%016llx\n",
> +                          fault_info->addr);
> +               drm_printf(&p, "Protection fault status register:0x%x\n",
> +                          fault_info->status);
> +       }
> +
>         if (coredump->reset_vram_lost)
> -               drm_printf(&p, "VRAM is lost due to GPU reset!\n");
> +               drm_printf(&p, "\nVRAM is lost due to GPU reset!\n");
>         if (coredump->adev->reset_info.num_regs) {
>                 drm_printf(&p, "AMDGPU register dumps:\nOffset: Value:\n");
>
> --
> 2.34.1
>