Hi,

I have an intermittent deadlock/hang in the amdgpu driver. It seems to happen when I open a new tab in qutebrowser (v1.1.1) while I am doing other things, like watching YouTube through mpv or playing Dota 2. How often it happens seems pretty arbitrary: sometimes once a week, sometimes multiple times a day. I have a Vega 64.

What happens is that the screen freezes, but I still hear sound and can SSH into the box. If I reboot it remotely, I get dropped back to a TTY and it tries to reboot, but it gets stuck on blocking processes (mpv etc.), so I have to reset it manually.

Repro steps:
* Run qutebrowser
* Do a bunch of other stuff: videos, games etc.
* Switch back to qutebrowser, hit "Ctrl+t" and be "lucky"

This seems to happen on all release candidates for 4.15 and on 4.15 itself:

4.15:
[ 2211.463021] INFO: task amdgpu_cs:0:1053 blocked for more than 120 seconds.
[ 2211.463026] Not tainted 4.15.0-ARCH+ #1
[ 2211.463028] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2211.463030] amdgpu_cs:0 D 0 1053 1051 0x00000000
[ 2211.463032] Call Trace:
[ 2211.463040] ? __schedule+0x297/0x8b0
[ 2211.463043] schedule+0x2f/0x90
[ 2211.463045] schedule_timeout+0x1fd/0x3a0
[ 2211.463085] ? amdgpu_job_alloc+0x37/0xc0 [amdgpu]
[ 2211.463088] dma_fence_default_wait+0x1cc/0x270
[ 2211.463090] ? dma_fence_release+0xa0/0xa0
[ 2211.463092] dma_fence_wait_timeout+0x39/0x110
[ 2211.463119] amdgpu_ctx_wait_prev_fence+0x46/0x80 [amdgpu]
[ 2211.463145] amdgpu_cs_ioctl+0x98/0x1ac0 [amdgpu]
[ 2211.463149] ? dequeue_entity+0xdc/0x460
[ 2211.463174] ? amdgpu_cs_find_mapping+0xc0/0xc0 [amdgpu]
[ 2211.463185] drm_ioctl_kernel+0x5b/0xb0 [drm]
[ 2211.463194] drm_ioctl+0x2ae/0x350 [drm]
[ 2211.463218] ? amdgpu_cs_find_mapping+0xc0/0xc0 [amdgpu]
[ 2211.463239] amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
[ 2211.463243] do_vfs_ioctl+0xa4/0x630
[ 2211.463246] ? SyS_futex+0x12d/0x180
[ 2211.463248] SyS_ioctl+0x74/0x80
[ 2211.463251] entry_SYSCALL_64_fastpath+0x20/0x83
[ 2211.463254] RIP: 0033:0x7f21b27b6d87
[ 2211.463255] RSP: 002b:00007f21a83acab8 EFLAGS: 00000246
[ 2334.343027] INFO: task amdgpu_cs:0:1053 blocked for more than 120 seconds.
[ 2334.343032] Not tainted 4.15.0-ARCH+ #1
[ 2334.343034] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2334.343036] amdgpu_cs:0 D 0 1053 1051 0x00000000
[ 2334.343039] Call Trace:
[ 2334.343046] ? __schedule+0x297/0x8b0
[ 2334.343049] schedule+0x2f/0x90
[ 2334.343051] schedule_timeout+0x1fd/0x3a0
[ 2334.343091] ? amdgpu_job_alloc+0x37/0xc0 [amdgpu]
[ 2334.343095] dma_fence_default_wait+0x1cc/0x270
[ 2334.343097] ? dma_fence_release+0xa0/0xa0
[ 2334.343098] dma_fence_wait_timeout+0x39/0x110
[ 2334.343125] amdgpu_ctx_wait_prev_fence+0x46/0x80 [amdgpu]
[ 2334.343151] amdgpu_cs_ioctl+0x98/0x1ac0 [amdgpu]
[ 2334.343155] ? dequeue_entity+0xdc/0x460
[ 2334.343181] ? amdgpu_cs_find_mapping+0xc0/0xc0 [amdgpu]
[ 2334.343191] drm_ioctl_kernel+0x5b/0xb0 [drm]
[ 2334.343200] drm_ioctl+0x2ae/0x350 [drm]
[ 2334.343224] ? amdgpu_cs_find_mapping+0xc0/0xc0 [amdgpu]
[ 2334.343245] amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
[ 2334.343249] do_vfs_ioctl+0xa4/0x630
[ 2334.343252] ? SyS_futex+0x12d/0x180
[ 2334.343254] SyS_ioctl+0x74/0x80
[ 2334.343257] entry_SYSCALL_64_fastpath+0x20/0x83
[ 2334.343259] RIP: 0033:0x7f21b27b6d87
[ 2334.343260] RSP: 002b:00007f21a83acab8 EFLAGS: 00000246
[ 2457.222859] INFO: task amdgpu_cs:0:1053 blocked for more than 120 seconds.
[ 2457.222862] Not tainted 4.15.0-ARCH+ #1
[ 2457.222863] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2457.222864] amdgpu_cs:0 D 0 1053 1051 0x00000000
[ 2457.222866] Call Trace:
[ 2457.222872] ? __schedule+0x297/0x8b0
[ 2457.222873] schedule+0x2f/0x90
[ 2457.222875] schedule_timeout+0x1fd/0x3a0
[ 2457.222900] ? amdgpu_job_alloc+0x37/0xc0 [amdgpu]
[ 2457.222902] dma_fence_default_wait+0x1cc/0x270
[ 2457.222903] ? dma_fence_release+0xa0/0xa0
[ 2457.222904] dma_fence_wait_timeout+0x39/0x110
[ 2457.222918] amdgpu_ctx_wait_prev_fence+0x46/0x80 [amdgpu]
[ 2457.222932] amdgpu_cs_ioctl+0x98/0x1ac0 [amdgpu]
[ 2457.222935] ? dequeue_entity+0xdc/0x460
[ 2457.222948] ? amdgpu_cs_find_mapping+0xc0/0xc0 [amdgpu]
[ 2457.222955] drm_ioctl_kernel+0x5b/0xb0 [drm]
[ 2457.222960] drm_ioctl+0x2ae/0x350 [drm]
[ 2457.222972] ? amdgpu_cs_find_mapping+0xc0/0xc0 [amdgpu]
[ 2457.222983] amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
[ 2457.222986] do_vfs_ioctl+0xa4/0x630
[ 2457.222989] ? SyS_futex+0x12d/0x180
[ 2457.222989] SyS_ioctl+0x74/0x80
[ 2457.222991] entry_SYSCALL_64_fastpath+0x20/0x83
[ 2457.222993] RIP: 0033:0x7f21b27b6d87
[ 2457.222993] RSP: 002b:00007f21a83acab8 EFLAGS: 00000246
[ 2580.102828] INFO: task amdgpu_cs:0:1053 blocked for more than 120 seconds.
[ 2580.102831] Not tainted 4.15.0-ARCH+ #1
[ 2580.102832] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2580.102833] amdgpu_cs:0 D 0 1053 1051 0x00000000
[ 2580.102835] Call Trace:
[ 2580.102841] ? __schedule+0x297/0x8b0
[ 2580.102842] schedule+0x2f/0x90
[ 2580.102843] schedule_timeout+0x1fd/0x3a0
[ 2580.102868] ? amdgpu_job_alloc+0x37/0xc0 [amdgpu]
[ 2580.102871] dma_fence_default_wait+0x1cc/0x270
[ 2580.102872] ? dma_fence_release+0xa0/0xa0
[ 2580.102873] dma_fence_wait_timeout+0x39/0x110
[ 2580.102887] amdgpu_ctx_wait_prev_fence+0x46/0x80 [amdgpu]
[ 2580.102900] amdgpu_cs_ioctl+0x98/0x1ac0 [amdgpu]
[ 2580.102903] ? dequeue_entity+0xdc/0x460
[ 2580.102916] ? amdgpu_cs_find_mapping+0xc0/0xc0 [amdgpu]
[ 2580.102923] drm_ioctl_kernel+0x5b/0xb0 [drm]
[ 2580.102928] drm_ioctl+0x2ae/0x350 [drm]
[ 2580.102940] ? amdgpu_cs_find_mapping+0xc0/0xc0 [amdgpu]
[ 2580.102951] amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
[ 2580.102953] do_vfs_ioctl+0xa4/0x630
[ 2580.102956] ? SyS_futex+0x12d/0x180
[ 2580.102957] SyS_ioctl+0x74/0x80
[ 2580.102958] entry_SYSCALL_64_fastpath+0x20/0x83
[ 2580.102960] RIP: 0033:0x7f21b27b6d87

4.15rc9:
[11181.701121] INFO: task amdgpu_cs:0:828 blocked for more than 120 seconds.
[11181.701126] Not tainted 4.15.0-rc9-ga8750ddca918+ #3
[11181.701127] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[11181.701129] amdgpu_cs:0 D 0 828 826 0x00000000
[11181.701132] Call Trace:
[11181.701140] ? __schedule+0x293/0x8a0
[11181.701143] schedule+0x2f/0x90
[11181.701145] schedule_timeout+0x1fa/0x3a0
[11181.701147] ? _raw_spin_unlock+0xa/0x20
[11181.701180] ? amdgpu_vm_update_directories+0x460/0x5e0 [amdgpu]
[11181.701184] dma_fence_default_wait+0x1cc/0x270
[11181.701187] ? dma_fence_release+0xa0/0xa0
[11181.701189] dma_fence_wait_timeout+0x33/0x100
[11181.701220] amdgpu_ctx_wait_prev_fence+0x47/0x80 [amdgpu]
[11181.701249] amdgpu_cs_ioctl+0x98/0x1ac0 [amdgpu]
[11181.701252] ? dequeue_entity+0xd9/0x450
[11181.701282] ? amdgpu_cs_find_mapping+0xc0/0xc0 [amdgpu]
[11181.701293] drm_ioctl_kernel+0x59/0xb0 [drm]
[11181.701302] drm_ioctl+0x2d5/0x370 [drm]
[11181.701330] ? amdgpu_cs_find_mapping+0xc0/0xc0 [amdgpu]
[11181.701355] amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
[11181.701358] do_vfs_ioctl+0xa1/0x620
[11181.701361] ? SyS_futex+0x12d/0x180
[11181.701363] SyS_ioctl+0x74/0x80
[11181.701365] entry_SYSCALL_64_fastpath+0x20/0x83
[11181.701367] RIP: 0033:0x7feb78366d27

I also get this error when I boot:

amdgpu 0000:43:00.0: Invalid PCI ROM header signature: expecting 0xaa55, got 0xffff

Am I "supposed" to have that?

Regards,
Daniel
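
For anyone comparing these reports across kernel versions, here is a minimal sketch for pulling the hung-task reports out of saved kernel log text. It assumes the "INFO: task ... blocked for more than N seconds." format shown in the traces above and task names without spaces. The function name `find_hung_tasks` and the regex are illustrative only and not part of any kernel tool.

```python
import re

# Matches report headers such as:
#   [ 2211.463021] INFO: task amdgpu_cs:0:1053 blocked for more than 120 seconds.
# The task name may itself contain a colon ("amdgpu_cs:0"); the PID is the
# number after the last colon. Assumption: task names contain no whitespace.
HUNG_TASK = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+INFO: task (?P<name>\S+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds\."
)


def find_hung_tasks(log_text):
    """Return a list of (timestamp, task name, pid, seconds) tuples,
    one per hung-task report found in log_text."""
    reports = []
    for match in HUNG_TASK.finditer(log_text):
        reports.append((
            float(match.group("ts")),
            match.group("name"),
            int(match.group("pid")),
            int(match.group("secs")),
        ))
    return reports
```

It could be fed the contents of a saved log, for example `find_hung_tasks(open("dmesg.txt").read())`, where `dmesg.txt` is a hypothetical file holding the output of `dmesg`.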