On Fri, Dec 9, 2022 at 7:37 PM Leo Liu <leo.liu@xxxxxxx> wrote:
>
> Please try the latest AMDGPU driver:
>
> https://gitlab.freedesktop.org/agd5f/linux/-/commits/amd-staging-drm-next/
>

Sorry Leo, I missed your message.
This issue is still present in 6.2-rc8.

In my first message I was mistaken.

> Before kernel 5.16 this only led to an artifact in the form of
> a green bar at the top of the screen, then starting from 5.17
> the GPU began to freeze.

The real behaviour before 5.18:
- vlc could play video with small artifacts in the form of a green bar on top of the video
- after playing the video, the vlc process exited correctly

On 5.18 this behaviour changed:
- vlc shows a black screen instead of playing the video
- after playback the process does not exit
- if I try to kill the vlc process with 'kill -9', vlc becomes a zombie process and many other processes start to hang (the following lines appear in the kernel log after 2 minutes)

INFO: task vlc:sh8:5248 blocked for more than 122 seconds.
Tainted: G W L -------- --- 5.18.0-60.fc37.x86_64+debug #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:vlc:sh8 state:D stack:13616 pid: 5248 ppid: 1934 flags:0x00004006
Call Trace:
 <TASK>
 __schedule+0x492/0x1650
 ? _raw_spin_unlock_irqrestore+0x40/0x60
 ? debug_check_no_obj_freed+0x12d/0x250
 schedule+0x4e/0xb0
 schedule_timeout+0xe1/0x120
 ? lock_release+0x215/0x460
 ? trace_hardirqs_on+0x1a/0xf0
 ? _raw_spin_unlock_irqrestore+0x40/0x60
 dma_fence_default_wait+0x197/0x240
 ? __bpf_trace_dma_fence+0x10/0x10
 dma_fence_wait_timeout+0x229/0x260
 drm_sched_entity_fini+0x101/0x270 [gpu_sched]
 amdgpu_vm_fini+0x2b5/0x460 [amdgpu]
 ? idr_destroy+0x70/0xb0
 ? mutex_destroy+0x1e/0x50
 amdgpu_driver_postclose_kms+0x1ec/0x2c0 [amdgpu]
 drm_file_free.part.0+0x20d/0x260
 drm_release+0x6a/0x120
 __fput+0xab/0x270
 task_work_run+0x5c/0xa0
 do_exit+0x394/0xc40
 ? rcu_read_lock_sched_held+0x10/0x70
 do_group_exit+0x33/0xb0
 get_signal+0xbbc/0xbc0
 arch_do_signal_or_restart+0x30/0x770
 ? do_futex+0xfd/0x190
 ? __x64_sys_futex+0x63/0x190
 exit_to_user_mode_prepare+0x172/0x270
 syscall_exit_to_user_mode+0x16/0x50
 do_syscall_64+0x67/0x80
 ? do_syscall_64+0x67/0x80
 ? rcu_read_lock_sched_held+0x10/0x70
 ? trace_hardirqs_on_prepare+0x5e/0x110
 ? do_syscall_64+0x67/0x80
 ? rcu_read_lock_sched_held+0x10/0x70
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f82c2364529
RSP: 002b:00007f8210ff8c00 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: fffffffffffffe00 RBX: 0000000000000000 RCX: 00007f82c2364529
RDX: 0000000000000000 RSI: 0000000000000189 RDI: 00007f823022542c
RBP: 00007f8210ff8c30 R08: 0000000000000000 R09: 00000000ffffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000001 R15: 00007f823022542c
 </TASK>
INFO: lockdep is turned off.

I bisected this issue, and the problematic commit is:

❯ git bisect bad
5f3854f1f4e211f494018160b348a1c16e58013f is the first bad commit
commit 5f3854f1f4e211f494018160b348a1c16e58013f
Author: Alex Deucher <alexander.deucher@xxxxxxx>
Date: Thu Mar 24 18:04:00 2022 -0400

    drm/amdgpu: add more cases to noretry=1

    Port current list from amd-staging-drm-next.

    Signed-off-by: Alex Deucher <alexander.deucher@xxxxxxx>

 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 3 +++
 1 file changed, 3 insertions(+)

Unfortunately, I couldn't simply revert this commit on 6.2-rc8 to check, because the revert leads to conflicts.
Alex, as the author of this commit, could you help me with it?

-- 
Best Regards,
Mike Gavrilov.