https://bugzilla.kernel.org/show_bug.cgi?id=217141

            Bug ID: 217141
           Summary: [amdgpu] ring gfx_0.0.0 timeout steam deck AMD APU
           Product: Drivers
           Version: 2.5
    Kernel Version: 6.1.12
          Hardware: AMD
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: high
          Priority: P1
         Component: Video(DRI - non Intel)
          Assignee: drivers_video-dri@xxxxxxxxxxxxxxxxxxxx
          Reporter: serg@xxxxxxxxxxxxx
        Regression: No

[ 257.182206] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=26043, emitted[64/36172]
[ 257.182668] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process NMS.exe pid 2571 thread NMS.exe pid 2571
[ 257.183084] amdgpu 0000:04:00.0: amdgpu: GPU reset begin!
[ 257.183094] ------------[ cut here ]------------
[ 257.183095] Evicting all processes
[ 257.183151] WARNING: CPU: 6 PID: 745 at drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_process.c:1935 kfd_suspend_all_processes+0x100/0x110 [amdgpu]
[ 257.183562] Modules linked in: uinput snd_seq_dummy snd_hrtimer snd_seq snd_seq_device ccm algif_aead cbc des_generic libdes ecb md4 cmac algif_hash algif_skcipher af_alg bnep ramoops reed_solomon snd_acp5x_pcm_dma snd_soc_acp5x_mach snd_acp5x_i2s snd_sof_amd_rembrandt rtw88_8822ce snd_sof_amd_renoir rtw88_8822c snd_sof_amd_acp rtw88_pci intel_rapl_msr snd_sof_pci intel_rapl_common rtw88_core snd_sof edac_mce_amd snd_sof_utils btusb kvm_amd btrtl snd_pci_ps mac80211 snd_hda_codec_hdmi btbcm snd_soc_cs35l41_spi btintel kvm snd_soc_cs35l41 snd_rpl_pci_acp6x snd_hda_intel btmtk snd_soc_wm_adsp snd_intel_dspcfg cs_dsp snd_acp_pci libarc4 leds_steamdeck extcon_steamdeck snd_pci_acp6x snd_intel_sdw_acpi snd_soc_nau8821 snd_soc_cs35l41_lib steamdeck_hwmon irqbypass bluetooth snd_hda_codec snd_pci_acp5x snd_soc_core rapl snd_rn_pci_acp3x cfg80211 pcspkr snd_hda_core snd_compress i2c_piix4 mousedev cdc_acm ac97_bus snd_acp_config joydev ecdh_generic snd_pcm_dmaengine snd_hwdep snd_soc_acpi
[ 257.183627] snd_pci_acp3x snd_pcm dwc3_pci rfkill ina2xx_adc kfifo_buf snd_timer opt3001 ltrf216a steamdeck spi_amd ina2xx industrialio snd acpi_cpufreq mac_hid soundcore fuse ip_tables x_tables overlay ext4 crc16 mbcache jbd2 hid_steam usbhid amdgpu vfat fat gpu_sched drm_buddy serio_raw sdhci_pci nvme_tcp drm_display_helper atkbd cqhci libps2 nvme_fabrics crct10dif_pclmul vivaldi_fmap crc32_pclmul polyval_clmulni sdhci polyval_generic cec i8042 gf128mul nvme hid_multitouch drm_ttm_helper ghash_clmulni_intel xhci_pci sha512_ssse3 nvme_core aesni_intel crypto_simd sp5100_tco cryptd wdat_wdt ttm xhci_pci_renesas ccp mmc_core nvme_common serio video i2c_hid_acpi wmi 8250_dw i2c_hid btrfs blake2b_generic xor raid6_pq libcrc32c crc32c_generic crc32c_intel dm_mirror dm_region_hash dm_log dm_mod pkcs8_key_parser crypto_user
[ 257.183700] CPU: 6 PID: 745 Comm: kworker/u32:7 Not tainted 6.1.12-valve2-1-neptune-61 #1 4091faa51bd1be3bbac5fd4c3ce3432202f24d92
[ 257.183704] Hardware name: Valve Jupiter/Jupiter, BIOS F7A0113 11/04/2022
[ 257.183708] Workqueue: amdgpu-reset-dev drm_sched_job_timedout [gpu_sched]
[ 257.183718] RIP: 0010:kfd_suspend_all_processes+0x100/0x110 [amdgpu]
[ 257.184119] Code: c7 c7 00 b3 3f c1 41 5c 41 5d e9 cb 4f 5f f1 be 03 00 00 00 e8 d1 89 a3 f1 e9 59 ff ff ff 48 c7 c7 14 a2 24 c1 e8 12 d6 06 f2 <0f> 0b e9 24 ff ff ff 0f 0b eb c5 0f 1f 44 00 00 66 0f 1f 00 0f 1f
[ 257.184122] RSP: 0018:ffffad1140f67cf8 EFLAGS: 00010286
[ 257.184125] RAX: 0000000000000000 RBX: ffff993b46b68400 RCX: 0000000000000027
[ 257.184127] RDX: ffff993e6eda0728 RSI: 0000000000000001 RDI: ffff993e6eda0720
[ 257.184128] RBP: ffff993b44620000 R08: 0000000000000000 R09: ffffad1140f67b78
[ 257.184130] R10: 0000000000000003 R11: ffff993e7ef7ffe8 R12: ffffad1140f67dd0
[ 257.184131] R13: 0000000000000000 R14: ffff993b89dbe400 R15: 0000000000000000
[ 257.184133] FS: 0000000000000000(0000) GS:ffff993e6ed80000(0000) knlGS:0000000000000000
[ 257.184135] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 257.184137] CR2: 000055d62521f000 CR3: 0000000108b04000 CR4: 0000000000350ee0
[ 257.184139] Call Trace:
[ 257.184143] <TASK>
[ 257.184147] kgd2kfd_suspend.part.0+0x3d/0x40 [amdgpu ad613437896db6c29581f2be9152cc5a6dd35ad7]
[ 257.184571] kgd2kfd_pre_reset+0x47/0x60 [amdgpu ad613437896db6c29581f2be9152cc5a6dd35ad7]
[ 257.184965] amdgpu_device_gpu_recover.cold+0x119/0xb40 [amdgpu ad613437896db6c29581f2be9152cc5a6dd35ad7]
[ 257.185430] amdgpu_job_timedout+0x1dc/0x220 [amdgpu ad613437896db6c29581f2be9152cc5a6dd35ad7]
[ 257.185866] ? try_to_wake_up+0xd9/0x560
[ 257.185874] drm_sched_job_timedout+0x7a/0x110 [gpu_sched 32db77b2b4e1fdeaf45e32d64ce206e5c0ca90ae]
[ 257.185885] process_one_work+0x1c7/0x380
[ 257.185892] worker_thread+0x51/0x390
[ 257.185897] ? rescuer_thread+0x3b0/0x3b0
[ 257.185901] kthread+0xde/0x110
[ 257.185905] ? kthread_complete_and_exit+0x20/0x20
[ 257.185909] ret_from_fork+0x22/0x30
[ 257.185917] </TASK>
[ 257.185918] ---[ end trace 0000000000000000 ]---
[ 257.284610] amdgpu 0000:04:00.0: amdgpu: MODE2 reset
[ 257.294783] amdgpu 0000:04:00.0: amdgpu: GPU reset succeeded, trying to resume

cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-linux-neptune-61 console=tty1 rd.luks=0 rd.lvm=0 rd.md=0 rd.dm=0 rd.systemd.gpt_auto=no amdgpu.noretry=0 amdgpu.ppfeaturemask=0xffffbfff amdgpu.lockup_timeout=20000 amdgpu.job_hang_limit=2 drm.debug=0x1ff amdgpu.debug_evictions=true1 tsc=directsync module_blacklist=tpm log_buf_len=4M amd_iommu=off amdgpu.gttsize=8128 spi_amd.speed_dev=1 audit=0 fbcon=rotate:1 loglevel=3 splash quiet plymouth.ignore-serial-consoles fbcon=vc:4-6 steamos.efi=PARTUUID=8bdf3e52-bf2f-7c45-9f00-45e568aa5af0

Linux Thorax 6.1.12-valve2-1-neptune-61 #1 SMP PREEMPT_DYNAMIC Mon, 27 Feb 2023 21:06:42 +0000 x86_64 GNU/Linux

Devices:
========
GPU0:
        apiVersion         = 4206830 (1.3.238)
        driverVersion      = 96469091 (0x5c00063)
        vendorID           = 0x1002
        deviceID           = 0x163f
        deviceType         = PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU
        deviceName         = AMD Custom GPU 0405 (RADV VANGOGH)
        driverID           = DRIVER_ID_MESA_RADV
        driverName         = radv
        driverInfo         = Mesa 23.1.0-devel (git-16283f7b97)
        conformanceVersion = 1.3.0.0
        deviceUUID         = 00000000-0400-0000-0000-000000000000
        driverUUID         = 414d442d-4d45-5341-2d44-525600000000

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.