Comment # 12
on bug 107065
from dwagner
(In reply to Andrey Grodzovsky from comment #10)
> Created attachment 140418
> drm/amdgpu: Verify root PD is mapped into kernel address space.
>
> dwagner, please try this patch. Fixes the issue for me and I observed no
> suspend/resume issues.

While I can start X11 with this patch applied to current amd-staging-drm-next, attempts to resume from S3 fail consistently.

The following related output is emitted right before the suspend:

Jul 02 21:31:32 ryzen kernel: Freezing remaining freezable tasks ... (elapsed 0.000 seconds) done.
Jul 02 21:31:32 ryzen kernel: Suspending console(s) (use no_console_suspend to debug)
Jul 02 21:31:32 ryzen kernel: sd 9:0:0:0: [sda] Synchronizing SCSI cache
Jul 02 21:31:32 ryzen kernel: [TTM] Buffer eviction failed
Jul 02 21:31:32 ryzen kernel: ACPI: Preparing to enter system sleep state S3
Jul 02 21:31:32 ryzen kernel: PM: Saving platform NVS memory
Jul 02 21:31:32 ryzen kernel: Disabling non-boot CPUs ...

(I wonder if that "[TTM] Buffer eviction failed" is a bad sign, as I have seen it at other times in conjunction with heavy use of the amdgpu driver.)

Then, upon resume, the following messages are emitted:

Jul 02 21:31:33 ryzen kernel: ACPI: Low-level resume complete
Jul 02 21:31:33 ryzen kernel: [drm] PCIE GART of 256M enabled (table at 0x000000F400300000).
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] failed to send message 146 ret is 0
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] last message was failed ret is 0
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] failed to send message 148 ret is 0
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] last message was failed ret is 0
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] failed to send message 145 ret is 0
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] last message was failed ret is 0
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] failed to send message 146 ret is 0
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] last message was failed ret is 0
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] failed to send message 189 ret is 0
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] last message was failed ret is 0
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] failed to send message 306 ret is 0
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] last message was failed ret is 0
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] failed to send message 5e ret is 0
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] last message was failed ret is 0
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] failed to send message 18a ret is 0
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] last message was failed ret is 0
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] failed to send message 145 ret is 0
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] last message was failed ret is 0
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] failed to send message 146 ret is 0
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] last message was failed ret is 0
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] failed to send message 148 ret is 0
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] last message was failed ret is 0
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] failed to send message 145 ret is 0
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] last message was failed ret is 0
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] failed to send message 146 ret is 0
Jul 02 21:31:33 ryzen kernel: [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* amdgpu: ring 0 test failed (scratch(0xC040)=0xC>
Jul 02 21:31:33 ryzen kernel: [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <gfx_v8_0> failed -22
Jul 02 21:31:33 ryzen kernel: [drm:amdgpu_device_resume [amdgpu]] *ERROR* amdgpu_device_ip_resume failed (-22).
Jul 02 21:31:33 ryzen kernel: dpm_run_callback(): pci_pm_resume+0x0/0xa0 returns -22
Jul 02 21:31:33 ryzen kernel: PM: Device 0000:0a:00.0 failed to resume async: error -22
Jul 02 21:31:33 ryzen kernel: OOM killer enabled.
Jul 02 21:31:33 ryzen kernel: Restarting tasks ... done.
Jul 02 21:31:33 ryzen kernel: PM: suspend exit
Jul 02 21:31:33 ryzen kernel: BUG: unable to handle kernel paging request at 0000000000001000
Jul 02 21:31:33 ryzen kernel: PGD 0 P4D 0
Jul 02 21:31:33 ryzen kernel: Oops: 0002 [#1] SMP
Jul 02 21:31:33 ryzen kernel: CPU: 14 PID: 791 Comm: amdgpu_cs:0 Tainted: G W O 4.18.0-rc1-amd+ #45
Jul 02 21:31:33 ryzen kernel: Hardware name: System manufacturer System Product Name/PRIME X370-PRO, BIOS 4011 04/19/2018
Jul 02 21:31:33 ryzen kernel: RIP: 0010:gmc_v8_0_set_pte_pde+0x1b/0x30 [amdgpu]
Jul 02 21:31:33 ryzen kernel: Code: 80 d8 00 00 00 e9 25 78 60 e1 0f 1f 44 00 00 0f 1f 44 00 00 48 b8 00 f0 ff ff ff 00 00 0>
Jul 02 21:31:33 ryzen kernel: RSP: 0018:ffffc90003e73898 EFLAGS: 00010202
Jul 02 21:31:33 ryzen kernel: RAX: 000000fffffff000 RBX: 0000000000000001 RCX: 000000000fe004f1
Jul 02 21:31:33 ryzen kernel: RDX: 0000000000001000 RSI: 0000000000001000 RDI: ffff8807e2f70000
Jul 02 21:31:33 ryzen kernel: RBP: 0000000000001000 R08: 00000000000004f1 R09: 0000000000001000
Jul 02 21:31:33 ryzen kernel: R10: ffffffffa03ac7e0 R11: ffff8807daf78000 R12: 0000000000001000
Jul 02 21:31:33 ryzen kernel: R13: 0000000000000200 R14: ffffc90003e73a18 R15: 000000000fe01000
Jul 02 21:31:33 ryzen kernel: FS: 00007f8b57266700(0000) GS:ffff88081ef80000(0000) knlGS:0000000000000000
Jul 02 21:31:33 ryzen kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 02 21:31:33 ryzen kernel: CR2: 0000000000001000 CR3: 00000007dbbda000 CR4: 00000000003406e0
Jul 02 21:31:33 ryzen kernel: Call Trace:
Jul 02 21:31:33 ryzen kernel: amdgpu_vm_cpu_set_ptes+0x76/0xe0 [amdgpu]
Jul 02 21:31:33 ryzen kernel: amdgpu_vm_update_ptes+0x1d3/0x2e0 [amdgpu]
Jul 02 21:31:33 ryzen kernel: amdgpu_vm_frag_ptes+0xae/0x130 [amdgpu]
Jul 02 21:31:33 ryzen kernel: amdgpu_vm_bo_update_mapping+0xed/0x410 [amdgpu]
Jul 02 21:31:33 ryzen kernel: ? amdgpu_vm_do_copy_ptes+0xa0/0xa0 [amdgpu]
Jul 02 21:31:33 ryzen kernel: amdgpu_vm_bo_update+0x310/0x680 [amdgpu]
Jul 02 21:31:33 ryzen kernel: amdgpu_cs_ioctl+0x1092/0x1a50 [amdgpu]
Jul 02 21:31:33 ryzen kernel: ? amdgpu_cs_find_mapping+0x110/0x110 [amdgpu]
Jul 02 21:31:33 ryzen kernel: drm_ioctl_kernel+0xa7/0xf0 [drm]
Jul 02 21:31:33 ryzen kernel: drm_ioctl+0x2f1/0x3c0 [drm]
Jul 02 21:31:33 ryzen kernel: ? amdgpu_cs_find_mapping+0x110/0x110 [amdgpu]
Jul 02 21:31:33 ryzen kernel: amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
Jul 02 21:31:33 ryzen kernel: do_vfs_ioctl+0xa4/0x620
Jul 02 21:31:33 ryzen kernel: ? __se_sys_futex+0x138/0x180
Jul 02 21:31:33 ryzen kernel: ksys_ioctl+0x60/0x90
Jul 02 21:31:33 ryzen kernel: __x64_sys_ioctl+0x16/0x20
Jul 02 21:31:33 ryzen kernel: do_syscall_64+0x48/0xf0
Jul 02 21:31:33 ryzen kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
Jul 02 21:31:33 ryzen kernel: RIP: 0033:0x7f8b66c92667
Jul 02 21:31:33 ryzen kernel: Code: 00 00 90 48 8b 05 e9 67 2c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 8>
Jul 02 21:31:33 ryzen kernel: RSP: 002b:00007f8b57265a98 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Jul 02 21:31:33 ryzen kernel: RAX: ffffffffffffffda RBX: 00007f8b57265b88 RCX: 00007f8b66c92667
Jul 02 21:31:33 ryzen kernel: RDX: 00007f8b57265b00 RSI: 00000000c0186444 RDI: 000000000000000b
Jul 02 21:31:33 ryzen kernel: RBP: 00007f8b57265b00 R08: 00007f8b57265bb0 R09: 0000000000000010
Jul 02 21:31:33 ryzen kernel: R10: 00007f8b57265bb0 R11: 0000000000000246 R12: 00000000c0186444
Jul 02 21:31:33 ryzen kernel: R13: 000000000000000b R14: 0000000000000002 R15: 0000000000000000
Jul 02 21:31:33 ryzen kernel: Modules linked in: it87(O) joydev mousedev hid_generic hidp hid ipt_REJECT nf_reject_ipv4 nf_l>
Jul 02 21:31:33 ryzen kernel: serio_raw crc32_pclmul atkbd ghash_clmulni_intel libps2 pcbc ahci libahci xhci_pci libata aes>
Jul 02 21:31:33 ryzen kernel: CR2: 0000000000001000
Jul 02 21:31:33 ryzen kernel: ---[ end trace 517a8a72887251f0 ]---
Jul 02 21:31:33 ryzen kernel: RIP: 0010:gmc_v8_0_set_pte_pde+0x1b/0x30 [amdgpu]
Jul 02 21:31:33 ryzen kernel: Code: 80 d8 00 00 00 e9 25 78 60 e1 0f 1f 44 00 00 0f 1f 44 00 00 48 b8 00 f0 ff ff ff 00 00 0>
Jul 02 21:31:33 ryzen kernel: RSP: 0018:ffffc90003e73898 EFLAGS: 00010202
Jul 02 21:31:33 ryzen kernel: RAX: 000000fffffff000 RBX: 0000000000000001 RCX: 000000000fe004f1
Jul 02 21:31:33 ryzen kernel: RDX: 0000000000001000 RSI: 0000000000001000 RDI: ffff8807e2f70000
Jul 02 21:31:33 ryzen kernel: RBP: 0000000000001000 R08: 00000000000004f1 R09: 0000000000001000
Jul 02 21:31:33 ryzen kernel: R10: ffffffffa03ac7e0 R11: ffff8807daf78000 R12: 0000000000001000
Jul 02 21:31:33 ryzen kernel: R13: 0000000000000200 R14: ffffc90003e73a18 R15: 000000000fe01000
Jul 02 21:31:33 ryzen kernel: FS: 00007f8b57266700(0000) GS:ffff88081ef80000(0000) knlGS:0000000000000000
Jul 02 21:31:33 ryzen kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 02 21:31:33 ryzen kernel: CR2: 0000000000001000 CR3: 00000007dbbda000 CR4: 00000000003406e0

(At this point, the machine is just dead and does not react to anything.)

So something is still wrong at amdgpu_vm_cpu_set_ptes+0x76.
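
For anyone triaging a similar oops from a pasted journal, here is a minimal Python sketch that pulls the symbol+offset frames out of the "Call Trace:" section, so they can be handed to a resolver such as scripts/faddr2line in the kernel tree. It assumes journalctl-style lines with a "kernel: " prefix, as in this comment. The names FRAME_RE and parse_frames are hypothetical helpers for illustration, not part of any kernel tooling.

```python
import re

# One call-trace frame: optional "? " (unreliable frame), then
# symbol+0xOFFSET/0xSIZE, then an optional " [module]" suffix.
FRAME_RE = re.compile(
    r"(?P<unreliable>\? )?"
    r"(?P<symbol>[A-Za-z_][\w.]*)"
    r"\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?: \[(?P<module>\w+)\])?"
)


def parse_frames(lines):
    """Return (symbol, offset, module, reliable) for the first call trace.

    Assumes each line carries a journal prefix ending in "kernel: ".
    Parsing stops at the first line after "Call Trace:" that is not a frame.
    """
    frames = []
    in_trace = False
    for line in lines:
        # Keep only the kernel message, dropping timestamp and hostname.
        _, _, msg = line.partition("kernel: ")
        msg = msg.strip()
        if not in_trace:
            in_trace = msg.startswith("Call Trace:")
            continue
        m = FRAME_RE.fullmatch(msg)
        if m is None:
            break  # e.g. the user-space "RIP: 0033:..." line ends the trace
        frames.append(
            (m["symbol"], int(m["offset"], 16), m["module"], m["unreliable"] is None)
        )
    return frames
```

The first reliable frame returned for the trace in this comment is amdgpu_vm_cpu_set_ptes at offset 0x76 in the amdgpu module, which is the caller of the faulting gmc_v8_0_set_pte_pde shown in the RIP line.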