Re: After Vega 56/64 GPU hang I unable reboot system

No gfx ring? You can specify a ring name for --waves; that should be in the docs.

It's not in the web docs, but it is in the help text:

https://cgit.freedesktop.org/amd/umr/tree/src/app/main.c#n643

I'll fix the web docs when I'm in next.

Tom

On December 19, 2018 3:21:25 PM EST, "Grodzovsky, Andrey" <Andrey.Grodzovsky@xxxxxxx> wrote:
+Tom

Andrey


On 12/19/2018 01:35 PM, Mikhail Gavrilov wrote:
On Tue, 18 Dec 2018 at 00:08, Grodzovsky, Andrey
<Andrey.Grodzovsky@xxxxxxx> wrote:
Please install UMR and dump the gfx ring contents and the waves after the
hang happens.

UMR at - https://cgit.freedesktop.org/amd/umr/
Waves dump
sudo umr -O verbose,halt_waves -wa
GFX ring dump
sudo umr -O verbose,follow -R gfx[.]
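
If it helps to keep the output for attaching to a reply, the two dumps above could be captured to files with something like the sketch below. The umr arguments are copied from the commands above; the output file names, the `run_dumps` helper and the `sudo` prefix handling are only assumptions for illustration. Passing the arguments as a list also means `gfx[.]` reaches umr literally, with no shell globbing.

```python
import subprocess

# umr arguments copied from the commands above.
# The output file names are hypothetical, chosen only for this sketch.
DUMPS = {
    "waves.txt": ["umr", "-O", "verbose,halt_waves", "-wa"],
    "gfx_ring.txt": ["umr", "-O", "verbose,follow", "-R", "gfx[.]"],
}


def run_dumps(dumps=DUMPS, prefix=("sudo",)):
    """Run each command, write stdout and stderr to its file, return exit codes."""
    codes = {}
    for out_path, argv in dumps.items():
        with open(out_path, "w") as out:
            proc = subprocess.run(
                [*prefix, *argv], stdout=out, stderr=subprocess.STDOUT
            )
        # A negative code means the process died from a signal (e.g. a segfault).
        codes[out_path] = proc.returncode
    return codes


if __name__ == "__main__":
    for path, code in run_dumps().items():
        print(f"{path}: exit code {code}")
```

A non-zero or negative exit code is worth reporting together with the captured file.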

Andrey

Thanks for the response.

What options should I specify in kernel command line?

On my setup `umr` prints the message `Could not open ring debugfs
file` and then crashes, even though I am sure that debugfs is enabled.

$ sudo umr -O verbose,halt_waves -wa
Cannot seek to MMIO address: Bad file descriptor
[ERROR]: Could not open ring debugfs fileSegmentation fault


# ls /sys/kernel/debug/dri/0/
amdgpu_dm_dtn_log       amdgpu_ring_comp_1.1.0     amdgpu_vram_mm
amdgpu_evict_gtt        amdgpu_ring_comp_1.1.1     amdgpu_wave
amdgpu_evict_vram       amdgpu_ring_comp_1.2.0     clients
amdgpu_fence_info       amdgpu_ring_comp_1.2.1     crtc-0
amdgpu_firmware_info    amdgpu_ring_comp_1.3.0     crtc-1
amdgpu_gca_config       amdgpu_ring_comp_1.3.1     crtc-2
amdgpu_gds_mm           amdgpu_ring_gfx            crtc-3
amdgpu_gem_info         amdgpu_ring_kiq_2.1.0      crtc-4
amdgpu_gpr              amdgpu_ring_sdma0          crtc-5
amdgpu_gpu_recover      amdgpu_ring_sdma1          DP-1
amdgpu_gtt_mm           'amdgpu_ring_uvd<0>'       DP-2
amdgpu_gws_mm           'amdgpu_ring_uvd_enc0<0>'  DP-3
amdgpu_iomem            'amdgpu_ring_uvd_enc1<0>'  framebuffer
amdgpu_oa_mm            amdgpu_ring_vce0           gem_names
amdgpu_pm_info          amdgpu_ring_vce1           HDMI-A-1
amdgpu_regs             amdgpu_ring_vce2           HDMI-A-2
amdgpu_regs_didt        amdgpu_sa_info             HDMI-A-3
amdgpu_regs_pcie        amdgpu_sensors             internal_clients
amdgpu_regs_smc         amdgpu_test_ib             name
amdgpu_ring_comp_1.0.0  amdgpu_vbios               state
amdgpu_ring_comp_1.0.1  amdgpu_vram                ttm_page_pool
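
To narrow down which ring file fails to open, a small check like the sketch below might help. It only relies on the `/sys/kernel/debug/dri/0` path and the `amdgpu_ring_` prefix visible in the listing above; the helper names `ring_names` and `check_rings` are made up for this example, and whether umr accepts the stripped names exactly as ring names is an assumption. It needs to run as root, since debugfs is normally not readable by ordinary users.

```python
import os

DEBUGFS_DIR = "/sys/kernel/debug/dri/0"  # path taken from the listing above
RING_PREFIX = "amdgpu_ring_"             # prefix seen on the ring files above


def ring_names(entries):
    """Return the sorted ring names found among debugfs entry names."""
    return sorted(
        e[len(RING_PREFIX):] for e in entries if e.startswith(RING_PREFIX)
    )


def check_rings(debugfs_dir=DEBUGFS_DIR):
    """Map each ring name to True if its debugfs file can be opened for reading."""
    result = {}
    for name in ring_names(os.listdir(debugfs_dir)):
        path = os.path.join(debugfs_dir, RING_PREFIX + name)
        try:
            with open(path, "rb"):
                result[name] = True
        except OSError:
            result[name] = False
    return result


if __name__ == "__main__":
    for name, ok in check_rings().items():
        print(f"{name}: {'readable' if ok else 'cannot open'}")
```

Any ring reported as `cannot open` would be a candidate for the file umr fails on.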




--
Best Regards,
Mike Gavrilov.

