Comment # 114
on bug 109955
from Rodney A Morris
To rule out possible hardware issues, I purchased another Vega 64 card, this time a factory-overclocked one. Since installing the card, I have experienced three lockups: two while playing Stellaris and one while playing a YouTube video. After playing Stellaris without issue two weeks ago, the computer locked up twice last night. While my previous problems seemed to be linked, in part, to a circular lock dependency, the latest logs indicate something different: I'm seeing a lot of powerplay errors after the fence timeout. I hope this new information provides some insight into the problem.

rmorris@ezra.blanchardmorris.net
OS: Fedora release 30 (Thirty) x86_64
Kernel: 5.3.6-200.fc30.x86_64
Uptime: 16 hours, 21 mins
Packages: 2214 (rpm), 36 (flatpak)
Shell: bash 5.0.7
Resolution: 2560x1440
DE: GNOME 3.32.2
WM: Mutter
WM Theme: Adwaita
Theme: Adapta-Nokto-Eta [GTK2/3]
Icons: Adwaita [GTK2/3]
Terminal: tilix
CPU: Intel i7-6850K (12) @ 4.000GHz
GPU: AMD ATI Radeon RX Vega 56/64
Memory: 2814MiB / 32036MiB

Card: MSI Vega 64 OC (card works fine under Windows 10)
Game being played: Stellaris (native game)
Description of event: The screen goes blank while music and sound continue to play, then the computer locks up or reboots.

Relevant dmesg from crash:
[ 4244.670269] perf: interrupt took too long (2502 > 2500), lowering kernel.perf_event_max_sample_rate to 79000
[ 4298.241156] [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out or interrupted!
[ 4304.385587] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring page1 timeout, signaled seq=60549844, emitted seq=60549846
[ 4304.385634] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process pid 0 thread pid 0
[ 4304.385637] amdgpu 0000:06:00.0: GPU reset begin!
[ 4304.402938] pcieport 0000:00:03.0: AER: Uncorrected (Non-Fatal) error received: 0000:00:03.0
[ 4304.402945] pcieport 0000:00:03.0: AER: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
[ 4304.402947] pcieport 0000:00:03.0: AER: device [8086:6f08] error status/mask=00004000/00000000
[ 4304.402948] pcieport 0000:00:03.0: AER: [14] CmpltTO (First)
[ 4304.404006] pcieport 0000:00:03.0: AER: Device recovery failed
[ 4308.481068] [drm:drm_atomic_helper_wait_for_flip_done [drm_kms_helper]] *ERROR* [CRTC:47:crtc-0] flip_done timed out
[ 4314.625180] [drm:amdgpu_dm_atomic_check [amdgpu]] *ERROR* [CRTC:47:crtc-0] hw_done or flip_done timed out
[ 4324.865057] [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:47:crtc-0] flip_done timed out
[ 4335.105035] [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [PLANE:45:plane-5] flip_done timed out
[ 4336.695112] amdgpu: [powerplay] No response from smu
[ 4336.695128] amdgpu: [powerplay] Failed message: 0xe, input parameter: 0x0, error code: 0x0
[ 4338.307125] amdgpu: [powerplay] No response from smu
[ 4339.922039] amdgpu: [powerplay] No response from smu
[ 4339.922043] amdgpu: [powerplay] Failed message: 0x42, input parameter: 0x1, error code: 0x0
[ 4341.541675] amdgpu: [powerplay] No response from smu
[ 4343.162102] amdgpu: [powerplay] No response from smu
[ 4343.162105] amdgpu: [powerplay] Failed message: 0x24, input parameter: 0x0, error code: 0x0
[ 4343.221953] [drm] REG_WAIT timeout 10us * 3500 tries - dce_mi_free_dmif line:634
[ 4343.221962] ------------[ cut here ]------------
[ 4343.222070] WARNING: CPU: 0 PID: 16500 at drivers/gpu/drm/amd/amdgpu/../display/dc/dc_helper.c:332 generic_reg_wait.cold+0x31/0x53 [amdgpu]
[ 4343.222072] Modules linked in: rfcomm xt_CHECKSUM xt_MASQUERADE tun bridge stp llc nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat iptable_mangle iptable_raw iptable_security nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c ip_set nfnetlink ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables cmac bnep nct6775 hwmon_vid intel_rapl_msr intel_rapl_common vfat fat fuse x86_pkg_temp_thermal intel_powerclamp coretemp iwlmvm kvm_intel iTCO_wdt iTCO_vendor_support mac80211 kvm snd_hda_codec_realtek irqbypass snd_hda_codec_generic snd_hda_codec_hdmi libarc4 ledtrig_audio crct10dif_pclmul snd_hda_intel crc32_pclmul iwlwifi snd_hda_codec snd_hda_core btusb ghash_clmulni_intel btrtl intel_cstate snd_hwdep btbcm btintel intel_uncore snd_seq snd_seq_device intel_rapl_perf bluetooth
[ 4343.222099] mxm_wmi cfg80211 snd_pcm joydev ecdh_generic ecc mei_me snd_timer rfkill snd mei i2c_i801 soundcore lpc_ich binfmt_misc auth_rpcgss sunrpc amdgpu amd_iommu_v2 gpu_sched ttm drm_kms_helper crc32c_intel uas mpt3sas igb drm e1000e nvme usb_storage dca i2c_algo_bit raid_class nvme_core scsi_transport_sas wmi
[ 4343.222114] CPU: 0 PID: 16500 Comm: kworker/0:1 Not tainted 5.3.6-200.fc30.x86_64+debug #1
[ 4343.222115] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X99 Taichi, BIOS P1.80 04/06/2018
[ 4343.222119] Workqueue: events drm_sched_job_timedout [gpu_sched]
[ 4343.222167] RIP: 0010:generic_reg_wait.cold+0x31/0x53 [amdgpu]
[ 4343.222169] Code: 4c 24 18 44 89 fa 89 ee 48 c7 c7 f8 9d 73 c0 e8 60 46 b0 fa 83 7b 20 01 0f 84 02 ee fd ff 48 c7 c7 f0 9c 73 c0 e8 4a 46 b0 fa <0f> 0b e9 ef ed fd ff 48 c7 c7 f0 9c 73 c0 89 54 24 04 e8 33 46 b0
[ 4343.222170] RSP: 0018:ffffabda8729b690 EFLAGS: 00010246
[ 4343.222172] RAX: 0000000000000024 RBX: ffff9ceeab58f700 RCX: 0000000000000006
[ 4343.222173] RDX: 0000000000000000 RSI: ffff9ceeb50c8e50 RDI: ffff9ceebe5d9e00
[ 4343.222174] RBP: 000000000000000a R08: 000003f33c33ca38 R09: 0000000000000000
[ 4343.222175] R10: 0000000000000000 R11: 0000000000000000 R12: 00000000000035af
[ 4343.222176] R13: 0000000000000dad R14: 0000000000000001 R15: 0000000000000dac
[ 4343.222178] FS: 0000000000000000(0000) GS:ffff9ceebe400000(0000) knlGS:0000000000000000
[ 4343.222179] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 4343.222180] CR2: 00007f1480ef70c0 CR3: 0000000703f30002 CR4: 00000000003606f0
[ 4343.222182] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 4343.222183] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 4343.222184] Call Trace:
[ 4343.222237] dce_mi_free_dmif+0xef/0x150 [amdgpu]
[ 4343.222285] dce110_reset_hw_ctx_wrap+0x15f/0x200 [amdgpu]
[ 4343.222333] dce110_apply_ctx_to_hw+0x4b/0x530 [amdgpu]
[ 4343.222365] ? amdgpu_pm_compute_clocks+0xc9/0x5f0 [amdgpu]
[ 4343.222414] ? dm_pp_apply_display_requirements+0x1a8/0x1c0 [amdgpu]
[ 4343.222461] dc_commit_state+0x26b/0x590 [amdgpu]
[ 4343.222514] amdgpu_dm_atomic_commit_tail+0xd18/0x1cf0 [amdgpu]
[ 4343.222521] ? __lock_acquire+0x247/0x1910
[ 4343.222525] ? find_held_lock+0x32/0x90
[ 4343.222529] ? find_held_lock+0x32/0x90
[ 4343.222533] ? sched_clock+0x5/0x10
[ 4343.222536] ? mark_held_locks+0x50/0x80
[ 4343.222540] ? __lock_acquire+0x247/0x1910
[ 4343.222545] ? wake_up_klogd+0x37/0x40
[ 4343.222549] ? find_held_lock+0x32/0x90
[ 4343.222552] ? mark_held_locks+0x50/0x80
[ 4343.222556] ? _raw_spin_unlock_irq+0x29/0x40
[ 4343.222559] ? lockdep_hardirqs_on+0xf0/0x180
[ 4343.222561] ? _raw_spin_unlock_irq+0x29/0x40
[ 4343.222564] ? wait_for_completion_timeout+0x75/0x190
[ 4343.222576] ? commit_tail+0x3c/0x70 [drm_kms_helper]
[ 4343.222622] ? amdgpu_dm_audio_eld_notify+0x60/0x60 [amdgpu]
[ 4343.222628] commit_tail+0x3c/0x70 [drm_kms_helper]
[ 4343.222634] drm_atomic_helper_commit+0xe3/0x150 [drm_kms_helper]
[ 4343.222640] drm_atomic_helper_disable_all+0x14c/0x160 [drm_kms_helper]
[ 4343.222647] drm_atomic_helper_suspend+0x66/0x100 [drm_kms_helper]
[ 4343.222698] dm_suspend+0x20/0x60 [amdgpu]
[ 4343.222726] amdgpu_device_ip_suspend_phase1+0x91/0xc0 [amdgpu]
[ 4343.222755] amdgpu_device_ip_suspend+0x1c/0x60 [amdgpu]
[ 4343.222801] amdgpu_device_pre_asic_reset+0x191/0x1a4 [amdgpu]
[ 4343.222849] amdgpu_device_gpu_recover+0x260/0x934 [amdgpu]
[ 4343.222893] amdgpu_job_timedout+0x115/0x140 [amdgpu]
[ 4343.222899] drm_sched_job_timedout+0x44/0xa0 [gpu_sched]
[ 4343.222903] process_one_work+0x272/0x5a0
[ 4343.222908] worker_thread+0x50/0x3b0
[ 4343.222915] kthread+0x108/0x140
[ 4343.222916] ? process_one_work+0x5a0/0x5a0
[ 4343.222918] ? kthread_park+0x80/0x80
[ 4343.222921] ret_from_fork+0x3a/0x50
[ 4343.222929] irq event stamp: 82808
[ 4343.222931] hardirqs last enabled at (82807): [<ffffffffbb1716eb>] console_unlock+0x46b/0x5d0
[ 4343.222935] hardirqs last disabled at (82808): [<ffffffffbb0038da>] trace_hardirqs_off_thunk+0x1a/0x20
[ 4343.222938] softirqs last enabled at (82794): [<ffffffffbbe0035d>] __do_softirq+0x35d/0x45d
[ 4343.222942] softirqs last disabled at (82787): [<ffffffffbb0f2077>] irq_exit+0xf7/0x100
[ 4343.222943] ---[ end trace 71731c9cc205c24d ]---
[ 4344.758203] amdgpu: [powerplay] No response from smu
[ 4346.363061] amdgpu: [powerplay] No response from smu
[ 4346.363065] amdgpu: [powerplay] Failed to send message: 0x26, ret value: 0x0
[ 4347.973948] amdgpu: [powerplay] No response from smu
[ 4349.588168] amdgpu: [powerplay] No response from smu
[ 4349.588173] amdgpu: [powerplay] Failed message: 0x4c, input parameter: 0x1, error code: 0x0
[ 4351.152764] amdgpu: [powerplay] No response from smu
[ 4352.722063] amdgpu: [powerplay] No response from smu
[ 4352.722068] amdgpu: [powerplay] Failed message: 0x4c, input parameter: 0x3, error code: 0x0
[ 4354.325541] amdgpu: [powerplay] No response from smu
[ 4355.924138] amdgpu: [powerplay] No response from smu
[ 4355.924141] amdgpu: [powerplay] Failed to send message: 0x63, ret value: 0x0
[ 4357.537736] amdgpu: [powerplay] No response from smu
[ 4359.154141] amdgpu: [powerplay] No response from smu
[ 4359.154146] amdgpu: [powerplay] Failed message: 0x9, input parameter: 0xf4, error code: 0x0
[ 4360.760856] amdgpu: [powerplay] No response from smu
[ 4362.372410] amdgpu: [powerplay] No response from smu
[ 4362.372414] amdgpu: [powerplay] Failed message: 0xa, input parameter: 0xa0b000, error code: 0x0
[ 4363.985961] amdgpu: [powerplay] No response from smu
[ 4365.599325] amdgpu: [powerplay] No response from smu
[ 4365.599331] amdgpu: [powerplay] Failed message: 0xe, input parameter: 0x0, error code: 0x0
[ 4367.214945] amdgpu: [powerplay] No response from smu
[ 4368.829650] amdgpu: [powerplay] No response from smu
[ 4368.829655] amdgpu: [powerplay] Failed message: 0x42, input parameter: 0x1, error code: 0x0
[ 4370.443783] amdgpu: [powerplay] No response from smu
[ 4372.057288] amdgpu: [powerplay] No response from smu
[ 4372.057293] amdgpu: [powerplay] Failed message: 0x24, input parameter: 0x0, error code: 0x0
[ 4372.074301] pcieport 0000:00:03.0: AER: Uncorrected (Non-Fatal) error received: 0000:00:03.0
[ 4372.074308] pcieport 0000:00:03.0: AER: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
[ 4372.074310] pcieport 0000:00:03.0: AER: device [8086:6f08] error status/mask=00004000/00000000
[ 4372.074312] pcieport 0000:00:03.0: AER: [14] CmpltTO (First)
[ 4372.074569] pcieport 0000:00:03.0: AER: Device recovery failed
[ 4372.091832] pcieport 0000:00:03.0: AER: Uncorrected (Non-Fatal) error received: 0000:00:03.0
[ 4372.091837] pcieport 0000:00:03.0: AER: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
[ 4372.091839] pcieport 0000:00:03.0: AER: device [8086:6f08] error status/mask=00004000/00000000
[ 4372.091840] pcieport 0000:00:03.0: AER: [14] CmpltTO (First)
[ 4372.091889] pcieport 0000:00:03.0: AER: Device recovery failed
[ 4372.109371] pcieport 0000:00:03.0: AER: Uncorrected (Non-Fatal) error received: 0000:00:03.0
[ 4372.109376] pcieport 0000:00:03.0: AER: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
[ 4372.109378] pcieport 0000:00:03.0: AER: device [8086:6f08] error status/mask=00004000/00000000
[ 4372.109380] pcieport 0000:00:03.0: AER: [14] CmpltTO (First)
[ 4372.126998] pcieport 0000:00:03.0: AER: Device recovery failed
[ 4372.127002] pcieport 0000:00:03.0: AER: Uncorrected (Non-Fatal) error received: 0000:00:03.0
[ 4372.127009] pcieport 0000:00:03.0: AER: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
[ 4372.127021] pcieport 0000:00:03.0: AER: device [8086:6f08] error status/mask=00004000/00000000
[ 4372.127024] pcieport 0000:00:03.0: AER: [14] CmpltTO (First)
[ 4372.127083] pcieport 0000:00:03.0: AER: Device recovery failed
[ 4372.144452] pcieport 0000:00:03.0: AER: Uncorrected (Non-Fatal) error received: 0000:00:03.0
[ 4372.144457] pcieport 0000:00:03.0: AER: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
[ 4372.144458] pcieport 0000:00:03.0: AER: device [8086:6f08] error status/mask=00004000/00000000
[ 4372.144460] pcieport 0000:00:03.0: AER: [14] CmpltTO (First)
[ 4372.144514] pcieport 0000:00:03.0: AER: Device recovery failed
[ 4372.161992] pcieport 0000:00:03.0: AER: Uncorrected (Non-Fatal) error received: 0000:00:03.0
[ 4372.161997] pcieport 0000:00:03.0: AER: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
[ 4372.161999] pcieport 0000:00:03.0: AER: device [8086:6f08] error status/mask=00004000/00000000
[ 4372.162001] pcieport 0000:00:03.0: AER: [14] CmpltTO (First)
[ 4372.162086] pcieport 0000:00:03.0: AER: Device recovery failed
[ 4372.179534] pcieport 0000:00:03.0: AER: Uncorrected (Non-Fatal) error received: 0000:00:03.0
[ 4372.179538] pcieport 0000:00:03.0: AER: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
[ 4372.179540] pcieport 0000:00:03.0: AER: device [8086:6f08] error status/mask=00004000/00000000
[ 4372.179542] pcieport 0000:00:03.0: AER: [14] CmpltTO (First)
[ 4372.179674] pcieport 0000:00:03.0: AER: Device recovery failed
[ 4372.197074] pcieport 0000:00:03.0: AER: Uncorrected (Non-Fatal) error received: 0000:00:03.0
[ 4372.197079] pcieport 0000:00:03.0: AER: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
[ 4372.197081] pcieport 0000:00:03.0: AER: device [8086:6f08] error status/mask=00004000/00000000
[ 4372.197082] pcieport 0000:00:03.0: AER: [14] CmpltTO (First)
[ 4372.197131] pcieport 0000:00:03.0: AER: Device recovery failed
[ 4372.214616] pcieport 0000:00:03.0: AER: Multiple Uncorrected (Non-Fatal) error received: 0000:00:03.0
[ 4372.267239] amdgpu: [powerplay] Failed to send message: 0x61, ret value: 0xffffffff

Relevant journalctl messages:
Oct 18 21:49:47 ezra.blanchardmorris.net kernel: perf: interrupt took too long (2502 > 2500), lowering kernel.perf_event_max_sample_rate to 79000
Oct 18 21:50:47 ezra.blanchardmorris.net kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out or interrupted!
Oct 18 21:50:47 ezra.blanchardmorris.net kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring page1 timeout, signaled seq=60549844, emitted seq=60549846
Oct 18 21:50:47 ezra.blanchardmorris.net kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process pid 0 thread pid 0
Oct 18 21:50:47 ezra.blanchardmorris.net kernel: amdgpu 0000:06:00.0: GPU reset begin!
Oct 18 21:50:47 ezra.blanchardmorris.net kernel: pcieport 0000:00:03.0: AER: Uncorrected (Non-Fatal) error received: 0000:00:03.0
Oct 18 21:50:47 ezra.blanchardmorris.net kernel: pcieport 0000:00:03.0: AER: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
Oct 18 21:50:47 ezra.blanchardmorris.net kernel: pcieport 0000:00:03.0: AER: device [8086:6f08] error status/mask=00004000/00000000
Oct 18 21:50:47 ezra.blanchardmorris.net kernel: pcieport 0000:00:03.0: AER: [14] CmpltTO (First)
Oct 18 21:50:47 ezra.blanchardmorris.net kernel: pcieport 0000:00:03.0: AER: Device recovery failed
Oct 18 21:50:51 ezra.blanchardmorris.net kernel: [drm:drm_atomic_helper_wait_for_flip_done [drm_kms_helper]] *ERROR* [CRTC:47:crtc-0] flip_done timed out
Oct 18 21:50:57 ezra.blanchardmorris.net kernel: [drm:amdgpu_dm_atomic_check [amdgpu]] *ERROR* [CRTC:47:crtc-0] hw_done or flip_done timed out
Oct 18 21:51:07 ezra.blanchardmorris.net kernel: [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:47:crtc-0] flip_done timed out
Oct 18 21:51:18 ezra.blanchardmorris.net kernel: [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [PLANE:45:plane-5] flip_done timed out
Oct 18 21:51:19 ezra.blanchardmorris.net kernel: amdgpu: [powerplay] No response from smu
Oct 18 21:51:19 ezra.blanchardmorris.net kernel: amdgpu: [powerplay] Failed message: 0xe, input parameter: 0x0, error code: 0x0
Oct 18 21:51:21 ezra.blanchardmorris.net kernel: amdgpu: [powerplay] No response from smu
Oct 18 21:51:22 ezra.blanchardmorris.net kernel: amdgpu: [powerplay] No response from smu
Oct 18 21:51:22 ezra.blanchardmorris.net kernel: amdgpu: [powerplay] Failed message: 0x42, input parameter: 0x1, error code: 0x0
Oct 18 21:51:24 ezra.blanchardmorris.net kernel: amdgpu: [powerplay] No response from smu
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: amdgpu: [powerplay] No response from smu
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: amdgpu: [powerplay] Failed message: 0x24, input parameter: 0x0, error code: 0x0
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: [drm] REG_WAIT timeout 10us * 3500 tries - dce_mi_free_dmif line:634
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: ------------[ cut here ]------------
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: WARNING: CPU: 0 PID: 16500 at drivers/gpu/drm/amd/amdgpu/../display/dc/dc_helper.c:332 generic_reg_wait.cold+0x31/0x53 [amdgpu]
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: Modules linked in: rfcomm xt_CHECKSUM xt_MASQUERADE tun bridge stp llc nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat iptable_mangle iptable_raw iptable_security nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c ip_set nfnetlink ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables cmac bnep nct6775 hwmon_vid intel_rapl_msr intel_rapl_common vfat fat fuse x86_pkg_temp_thermal intel_powerclamp coretemp iwlmvm kvm_intel iTCO_wdt iTCO_vendor_support mac80211 kvm snd_hda_codec_realtek irqbypass snd_hda_codec_generic snd_hda_codec_hdmi libarc4 ledtrig_audio crct10dif_pclmul snd_hda_intel crc32_pclmul iwlwifi snd_hda_codec snd_hda_core btusb ghash_clmulni_intel btrtl intel_cstate snd_hwdep btbcm btintel intel_uncore snd_seq snd_seq_device intel_rapl_perf bluetooth
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: mxm_wmi cfg80211 snd_pcm joydev ecdh_generic ecc mei_me snd_timer rfkill snd mei i2c_i801 soundcore lpc_ich binfmt_misc auth_rpcgss sunrpc amdgpu amd_iommu_v2 gpu_sched ttm drm_kms_helper crc32c_intel uas mpt3sas igb drm e1000e nvme usb_storage dca i2c_algo_bit raid_class nvme_core scsi_transport_sas wmi
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: CPU: 0 PID: 16500 Comm: kworker/0:1 Not tainted 5.3.6-200.fc30.x86_64+debug #1
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X99 Taichi, BIOS P1.80 04/06/2018
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: Workqueue: events drm_sched_job_timedout [gpu_sched]
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: RIP: 0010:generic_reg_wait.cold+0x31/0x53 [amdgpu]
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: Code: 4c 24 18 44 89 fa 89 ee 48 c7 c7 f8 9d 73 c0 e8 60 46 b0 fa 83 7b 20 01 0f 84 02 ee fd ff 48 c7 c7 f0 9c 73 c0 e8 4a 46 b0 fa <0f> 0b e9 ef ed fd ff 48 c7 c7 f0 9c 73 c0 89 54 24 04 e8 33 46 b0
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: RSP: 0018:ffffabda8729b690 EFLAGS: 00010246
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: RAX: 0000000000000024 RBX: ffff9ceeab58f700 RCX: 0000000000000006
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: RDX: 0000000000000000 RSI: ffff9ceeb50c8e50 RDI: ffff9ceebe5d9e00
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: RBP: 000000000000000a R08: 000003f33c33ca38 R09: 0000000000000000
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 00000000000035af
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: R13: 0000000000000dad R14: 0000000000000001 R15: 0000000000000dac
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: FS: 0000000000000000(0000) GS:ffff9ceebe400000(0000) knlGS:0000000000000000
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: CR2: 00007f1480ef70c0 CR3: 0000000703f30002 CR4: 00000000003606f0
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: Call Trace:
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: dce_mi_free_dmif+0xef/0x150 [amdgpu]
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: dce110_reset_hw_ctx_wrap+0x15f/0x200 [amdgpu]
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: dce110_apply_ctx_to_hw+0x4b/0x530 [amdgpu]
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: ? amdgpu_pm_compute_clocks+0xc9/0x5f0 [amdgpu]
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: ? dm_pp_apply_display_requirements+0x1a8/0x1c0 [amdgpu]
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: dc_commit_state+0x26b/0x590 [amdgpu]
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: amdgpu_dm_atomic_commit_tail+0xd18/0x1cf0 [amdgpu]
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: ? __lock_acquire+0x247/0x1910
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: ? find_held_lock+0x32/0x90
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: ? find_held_lock+0x32/0x90
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: ? sched_clock+0x5/0x10
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: ? mark_held_locks+0x50/0x80
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: ? __lock_acquire+0x247/0x1910
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: ? wake_up_klogd+0x37/0x40
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: ? find_held_lock+0x32/0x90
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: ? mark_held_locks+0x50/0x80
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: ? _raw_spin_unlock_irq+0x29/0x40
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: ? lockdep_hardirqs_on+0xf0/0x180
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: ? _raw_spin_unlock_irq+0x29/0x40
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: ? wait_for_completion_timeout+0x75/0x190
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: ? commit_tail+0x3c/0x70 [drm_kms_helper]
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: ? amdgpu_dm_audio_eld_notify+0x60/0x60 [amdgpu]
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: commit_tail+0x3c/0x70 [drm_kms_helper]
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: drm_atomic_helper_commit+0xe3/0x150 [drm_kms_helper]
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: drm_atomic_helper_disable_all+0x14c/0x160 [drm_kms_helper]
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: drm_atomic_helper_suspend+0x66/0x100 [drm_kms_helper]
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: dm_suspend+0x20/0x60 [amdgpu]
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: amdgpu_device_ip_suspend_phase1+0x91/0xc0 [amdgpu]
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: amdgpu_device_ip_suspend+0x1c/0x60 [amdgpu]
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: amdgpu_device_pre_asic_reset+0x191/0x1a4 [amdgpu]
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: amdgpu_device_gpu_recover+0x260/0x934 [amdgpu]
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: amdgpu_job_timedout+0x115/0x140 [amdgpu]
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: drm_sched_job_timedout+0x44/0xa0 [gpu_sched]
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: process_one_work+0x272/0x5a0
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: worker_thread+0x50/0x3b0
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: kthread+0x108/0x140
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: ? process_one_work+0x5a0/0x5a0
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: ? kthread_park+0x80/0x80
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: ret_from_fork+0x3a/0x50
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: irq event stamp: 82808
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: hardirqs last enabled at (82807): [<ffffffffbb1716eb>] console_unlock+0x46b/0x5d0
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: hardirqs last disabled at (82808): [<ffffffffbb0038da>] trace_hardirqs_off_thunk+0x1a/0x20
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: softirqs last enabled at (82794): [<ffffffffbbe0035d>] __do_softirq+0x35d/0x45d
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: softirqs last disabled at (82787): [<ffffffffbb0f2077>] irq_exit+0xf7/0x100
Oct 18 21:51:26 ezra.blanchardmorris.net kernel: ---[ end trace 71731c9cc205c24d ]---
Oct 18 21:51:27 ezra.blanchardmorris.net abrt-dump-journal-oops[1493]: abrt-dump-journal-oops: Found oopses: 1
Oct 18 21:51:27 ezra.blanchardmorris.net abrt-dump-journal-oops[1493]: abrt-dump-journal-oops: Creating problem directories
Oct 18 21:51:27 ezra.blanchardmorris.net kernel: amdgpu: [powerplay] No response from smu
Oct 18 21:51:28 ezra.blanchardmorris.net abrt-dump-journal-oops[1493]: Reported 1 kernel oopses to Abrt
Oct 18 21:51:29 ezra.blanchardmorris.net kernel: amdgpu: [powerplay] No response from smu
Oct 18 21:51:29 ezra.blanchardmorris.net kernel: amdgpu: [powerplay] Failed to send message: 0x26, ret value: 0x0
Oct 18 21:51:30 ezra.blanchardmorris.net kernel: amdgpu: [powerplay] No response from smu
Oct 18 21:51:32 ezra.blanchardmorris.net kernel: amdgpu: [powerplay] No response from smu
Oct 18 21:51:32 ezra.blanchardmorris.net kernel: amdgpu: [powerplay] Failed message: 0x4c, input parameter: 0x1, error code: 0x0
Oct 18 21:51:34 ezra.blanchardmorris.net kernel: amdgpu: [powerplay] No response from smu
Oct 18 21:51:35 ezra.blanchardmorris.net kernel: amdgpu: [powerplay] No response from smu
Oct 18 21:51:35 ezra.blanchardmorris.net kernel: amdgpu: [powerplay] Failed message: 0x4c, input parameter: 0x3, error code: 0x0
Oct 18 21:51:37 ezra.blanchardmorris.net kernel: amdgpu: [powerplay] No response from smu
Oct 18 21:51:38 ezra.blanchardmorris.net abrt-server[16691]: Can't find a meaningful backtrace for hashing in '.'
Oct 18 21:51:38 ezra.blanchardmorris.net abrt-server[16691]: Option 'DropNotReportableOopses' is not configured
Oct 18 21:51:38 ezra.blanchardmorris.net abrt-server[16691]: Preserving oops '.' because DropNotReportableOopses is 'no'
Oct 18 21:51:38 ezra.blanchardmorris.net kernel: amdgpu: [powerplay] No response from smu
Oct 18 21:51:38 ezra.blanchardmorris.net kernel: amdgpu: [powerplay] Failed to send message: 0x63, ret value: 0x0
Oct 18 21:51:40 ezra.blanchardmorris.net kernel: amdgpu: [powerplay] No response from smu
Oct 18 21:51:42 ezra.blanchardmorris.net kernel: amdgpu: [powerplay] No response from smu
Oct 18 21:51:42 ezra.blanchardmorris.net kernel: amdgpu: [powerplay] Failed message: 0x9, input parameter: 0xf4, error code: 0x0
Oct 18 21:51:42 ezra.blanchardmorris.net abrt-notification[16713]: System encountered a non-fatal error in ??()
Oct 18 21:51:43 ezra.blanchardmorris.net kernel: amdgpu: [powerplay] No response from smu
Oct 18 21:51:45 ezra.blanchardmorris.net kernel: amdgpu: [powerplay] No response from smu