Hi,

I found a commit that causes problems with VAAPI hardware decoding on certain video files. Reproducing the issue requires mesa to be built with the h264 hardware encoder enabled and the attached file to be played in the vlc player.

Before kernel 5.16 this only led to an artifact in the form of a green bar at the top of the screen. Starting from 5.17, the GPU began to freeze. In 6.0 the GPU freeze is fixed, but the kernel itself freezes when certain actions are performed, and the vlc application cannot be terminated in any way.

The kernel trace looks like this:

[ 976.184187] amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:40 vmid:1 pasid:32785, for process vlc pid 9905 thread vlc:cs0 pid 9956)
[ 976.184205] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000800106b53000 from client 0x12 (VMC)
[ 976.184210] amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00141651
[ 976.184213] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: VCN0 (0xb)
[ 976.184216] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x1
[ 976.184219] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 976.184222] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x5
[ 976.184225] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 976.184228] amdgpu 0000:03:00.0: amdgpu: RW: 0x1
[ 976.184234] amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:40 vmid:1 pasid:32785, for process vlc pid 9905 thread vlc:cs0 pid 9956)
[ 976.184238] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000800106b52000 from client 0x12 (VMC)
[ 976.184242] amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 976.184245] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: unknown (0x0)
[ 976.184248] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
[ 976.184251] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 976.184253] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 976.184256] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 976.184259] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 976.184264] amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:40 vmid:1 pasid:32785, for process vlc pid 9905 thread vlc:cs0 pid 9956)
[ 976.184268] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000800106b53000 from client 0x12 (VMC)
[ 976.184271] amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 976.184273] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: unknown (0x0)
[ 976.184276] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
[ 976.184279] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 976.184281] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 976.184284] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 976.184286] amdgpu 0000:03:00.0: amdgpu: RW: 0x0

The problematic commit is:

commit 7cbe08a930a132d84b4cf79953b00b074ec7a2a7 (HEAD)
Author: Alex Deucher <alexander.deucher@xxxxxxx>
Date:   Mon Aug 9 11:22:20 2021 -0400

    drm/amdgpu: handle VCN instances when harvesting (v2)

    There may be multiple instances and only one is harvested.

    v2: fix typo in commit message

    Fixes: 83a0b8639185 ("drm/amdgpu: add judgement when add ip blocks (v2)")
    Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1673
    Reviewed-by: Guchun Chen <guchun.chen@xxxxxxx>
    Reviewed-by: James Zhu <James.Zhu@xxxxxxx>
    Signed-off-by: Alex Deucher <alexander.deucher@xxxxxxx>
    Cc: stable@xxxxxxxxxxxxxxx

Thanks!

-- 
Best Regards,
Mike Gavrilov.
Attachment:
test_sample_480_2.mp4
Description: video/mp4
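
For anyone reading the trace: the first fault reports MMVM_L2_PROTECTION_FAULT_STATUS:0x00141651, and the lines after it are just that word split into fields. Below is a minimal sketch of that split. The bit positions in FIELDS are an assumption on my side (they are not stated anywhere in this report); they were only checked against the values printed in the trace, so treat the table and the helper name decode_fault_status as hypothetical and compare with the register headers for your ASIC before relying on them.

```python
# Assumed field layout of MMVM_L2_PROTECTION_FAULT_STATUS as (shift, width).
# Hypothetical table: it reproduces the values in the trace above, but it was
# not taken from the driver source and may differ between hardware generations.
FIELDS = {
    "MORE_FAULTS": (0, 1),
    "WALKER_ERROR": (1, 3),
    "PERMISSION_FAULTS": (4, 4),
    "MAPPING_ERROR": (8, 1),
    "CID": (9, 9),  # printed in the trace as "Faulty UTCL2 client ID"
    "RW": (18, 1),
}


def decode_fault_status(status):
    """Split a raw status word into named fields using the assumed layout."""
    return {
        name: (status >> shift) & ((1 << width) - 1)
        for name, (shift, width) in FIELDS.items()
    }


if __name__ == "__main__":
    # Status word from the first page fault in the trace.
    for name, value in decode_fault_status(0x00141651).items():
        print(f"{name}: {value:#x}")
```

With the first status word this gives the same values the kernel printed (client ID 0xb, PERMISSION_FAULTS 0x5, RW 0x1). The two follow-up faults carry a status of 0x00000000, so every field decodes to zero there.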