On 2017 Nov 22, Martin Babutzka wrote:
>Dear AMD Developers,
>At first congratulations for the DC code submission to the 4.15 kernel.
>Unfortunately the major regression which I reported on 29.09., 06.10.,
>02.11. and 05.11. still exists. But this time I got additional
>debugging information maybe this helps to fix it.
>
>Summary: I am running Xubuntu 17.10 with the amd-staging-drm-next
>kernel patched to 4.14.0. The latest build which I tested is from
>includes all commits up to now (including 2017-11-17 19:51:57 (GMT)
>commit 85d09ce5e5039644487e9508d6359f9f4cf64427).
>
>Some vblank operations make the kernel crash and hang up the whole
>system. The error is reproducible by enabling the screen lock or the
>suspend mode. The system can not return to proper state from either of
>these (after all I am not 100% sure it is the same error). Debugging is
>easier with screen lock. Attached you can find the kernel crash and
>the dce110_vblank_set function modified by some kernel prints. It looks
>like the function is called twice and does not work the second time.
>The whole code around dce110_vblank_set also looks interrupt-ish -
>could this be a race condition or timing problem? Objects being cleared
>from memory and then accessed by dce110_vblank_set?
>
>Bug reports on this issue:
>https://github.com/M-Bab/linux-kernel-amdgpu-binaries/issues/37
>https://github.com/M-Bab/linux-kernel-amdgpu-binaries/issues/29
>
>Many regards,
>Martin (M-bab)

I'm having the same problem on Carrizo. The system crashes when resuming
from S3 and dc is on. With dc off, everything works fine. I was able to
catch some debug info with kasan:

Nov 22 15:52:19 probook kernel: PM: suspend entry (deep)
Nov 22 15:52:19 probook kernel: PM: Syncing filesystems ... done.
Nov 22 15:52:28 probook kernel: Freezing user space processes ... (elapsed 0.002 seconds) done.
Nov 22 15:52:28 probook kernel: OOM killer disabled.
Nov 22 15:52:28 probook kernel: Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
Nov 22 15:52:28 probook kernel: Suspending console(s) (use no_console_suspend to debug)
Nov 22 15:52:28 probook kernel: sd 0:0:0:0: [sda] Synchronizing SCSI cache
Nov 22 15:52:28 probook kernel: sd 0:0:0:0: [sda] Stopping disk
Nov 22 15:52:28 probook kernel: amdgpu 0000:00:01.0: ffff8803e8075500 unpin not necessary
Nov 22 15:52:28 probook kernel: ACPI: Preparing to enter system sleep state S3
Nov 22 15:52:28 probook kernel: ACPI: EC: event blocked
Nov 22 15:52:28 probook kernel: ACPI: EC: EC stopped
Nov 22 15:52:28 probook kernel: PM: Saving platform NVS memory
Nov 22 15:52:28 probook kernel: Disabling non-boot CPUs ...
Nov 22 15:52:28 probook kernel: smpboot: CPU 1 is now offline
Nov 22 15:52:28 probook kernel: smpboot: CPU 2 is now offline
Nov 22 15:52:28 probook kernel: smpboot: CPU 3 is now offline
Nov 22 15:52:28 probook kernel: ACPI: Low-level resume complete
Nov 22 15:52:28 probook kernel: ACPI: EC: EC started
Nov 22 15:52:28 probook kernel: PM: Restoring platform NVS memory
Nov 22 15:52:28 probook kernel: LVT offset 0 assigned for vector 0x400
Nov 22 15:52:28 probook kernel: Enabling non-boot CPUs ...
Nov 22 15:52:28 probook kernel: x86: Booting SMP configuration:
Nov 22 15:52:28 probook kernel: smpboot: Booting Node 0 Processor 1 APIC 0x11
Nov 22 15:52:28 probook kernel: cache: parent cpu1 should not be sleeping
Nov 22 15:52:28 probook kernel: CPU1 is up
Nov 22 15:52:28 probook kernel: smpboot: Booting Node 0 Processor 2 APIC 0x12
Nov 22 15:52:28 probook kernel: cache: parent cpu2 should not be sleeping
Nov 22 15:52:28 probook kernel: CPU2 is up
Nov 22 15:52:28 probook kernel: smpboot: Booting Node 0 Processor 3 APIC 0x13
Nov 22 15:52:28 probook kernel: cache: parent cpu3 should not be sleeping
Nov 22 15:52:28 probook kernel: CPU3 is up
Nov 22 15:52:28 probook kernel: ACPI: Waking up from system sleep state S3
Nov 22 15:52:28 probook kernel: ACPI: EC: event unblocked
Nov 22 15:52:28 probook kernel: [drm] PCIE GART of 1024M enabled (table at 0x000000F400040000).
Nov 22 15:52:28 probook kernel: sd 0:0:0:0: [sda] Starting disk
Nov 22 15:52:28 probook kernel: r8169 0000:01:00.0 enp1s0: link down
Nov 22 15:52:28 probook kernel: ACPI: button: The lid device is not compliant to SW_LID.
Nov 22 15:52:28 probook kernel: usb 3-1.1: reset high-speed USB device number 3 using ehci-pci
Nov 22 15:52:28 probook kernel: [drm:hwss_wait_for_blank_complete] *ERROR* DC: failed to blank crtc!
Nov 22 15:52:28 probook kernel: [drm] ring test on 0 succeeded in 11 usecs
Nov 22 15:52:28 probook kernel: [drm] ring test on 9 succeeded in 8 usecs
Nov 22 15:52:28 probook kernel: [drm] ring test on 1 succeeded in 4 usecs
Nov 22 15:52:28 probook kernel: [drm] ring test on 2 succeeded in 2 usecs
Nov 22 15:52:28 probook kernel: [drm] ring test on 3 succeeded in 2 usecs
Nov 22 15:52:28 probook kernel: [drm] ring test on 4 succeeded in 2 usecs
Nov 22 15:52:28 probook kernel: [drm] ring test on 5 succeeded in 7 usecs
Nov 22 15:52:28 probook kernel: [drm] ring test on 6 succeeded in 2 usecs
Nov 22 15:52:28 probook kernel: [drm] ring test on 7 succeeded in 2 usecs
Nov 22 15:52:28 probook kernel: [drm] ring test on 8 succeeded in 2 usecs
Nov 22 15:52:28 probook kernel: [drm] ring test on 10 succeeded in 4 usecs
Nov 22 15:52:28 probook kernel: [drm] ring test on 11 succeeded in 3 usecs
Nov 22 15:52:28 probook kernel: usb 3-1.3: reset high-speed USB device number 4 using ehci-pci
Nov 22 15:52:28 probook kernel: ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Nov 22 15:52:28 probook kernel: ata1.00: ACPI cmd f5/00:00:00:00:00:e0 (SECURITY FREEZE LOCK) filtered out
Nov 22 15:52:28 probook kernel: ata1.00: ACPI cmd b1/c1:00:00:00:00:e0 (DEVICE CONFIGURATION OVERLAY) filtered out
Nov 22 15:52:28 probook kernel: ata1.00: supports DRM functions and may not be fully accessible
Nov 22 15:52:28 probook kernel: ata1.00: disabling queued TRIM support
Nov 22 15:52:28 probook kernel: ata1.00: ACPI cmd f5/00:00:00:00:00:e0 (SECURITY FREEZE LOCK) filtered out
Nov 22 15:52:28 probook kernel: ata1.00: ACPI cmd b1/c1:00:00:00:00:e0 (DEVICE CONFIGURATION OVERLAY) filtered out
Nov 22 15:52:28 probook kernel: ata1.00: supports DRM functions and may not be fully accessible
Nov 22 15:52:28 probook kernel: ata1.00: disabling queued TRIM support
Nov 22 15:52:28 probook kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Nov 22 15:52:28 probook kernel: ata1.00: configured for UDMA/133
Nov 22 15:52:28 probook kernel: ata2.00: configured for UDMA/100
Nov 22 15:52:28 probook kernel: usb 3-1.3.2: reset full-speed USB device number 5 using ehci-pci
Nov 22 15:52:28 probook kernel: [drm] ring test on 12 succeeded in 1 usecs
Nov 22 15:52:28 probook kernel: [drm] UVD initialized successfully.
Nov 22 15:52:28 probook kernel: [drm] ring test on 13 succeeded in 0 usecs
Nov 22 15:52:28 probook kernel: [drm] ring test on 14 succeeded in 8 usecs
Nov 22 15:52:28 probook kernel: [drm] VCE initialized successfully.
Nov 22 15:52:28 probook kernel: [drm] ib test on ring 0 succeeded
Nov 22 15:52:28 probook kernel: [drm] ib test on ring 1 succeeded
Nov 22 15:52:28 probook kernel: [drm] ib test on ring 2 succeeded
Nov 22 15:52:28 probook kernel: [drm] ib test on ring 3 succeeded
Nov 22 15:52:28 probook kernel: [drm] ib test on ring 4 succeeded
Nov 22 15:52:28 probook kernel: [drm] ib test on ring 5 succeeded
Nov 22 15:52:28 probook kernel: [drm] ib test on ring 6 succeeded
Nov 22 15:52:28 probook kernel: [drm] ib test on ring 7 succeeded
Nov 22 15:52:28 probook kernel: [drm] ib test on ring 8 succeeded
Nov 22 15:52:28 probook kernel: [drm] ib test on ring 9 succeeded
Nov 22 15:52:28 probook kernel: [drm] ib test on ring 10 succeeded
Nov 22 15:52:28 probook kernel: [drm] ib test on ring 11 succeeded
Nov 22 15:52:28 probook kernel: [drm] ib test on ring 12 succeeded
Nov 22 15:52:28 probook kernel: [drm] ib test on ring 13 succeeded
Nov 22 15:52:28 probook kernel: [drm] {1920x1080, 2250x1132 at 152840Khz}
Nov 22 15:52:28 probook kernel: [drm] HBRx2 pass VS=1, PE=0
Nov 22 15:52:28 probook kernel: ------------[ cut here ]------------
Nov 22 15:52:28 probook kernel: Kernel BUG at ffffffffb522a5c9 [verbose debug info unavailable]
Nov 22 15:52:28 probook kernel: ==================================================================
Nov 22 15:52:28 probook kernel: BUG: KASAN: use-after-free in ex_handler_refcount+0x135/0x170
Nov 22 15:52:28 probook kernel: Write of size 4 at addr ffff8803e1be7840 by task kworker/u8:22/2619
Nov 22 15:52:28 probook kernel:
Nov 22 15:52:28 probook kernel: CPU: 3 PID: 2619 Comm: kworker/u8:22 Not tainted 4.14.0-11095-g0c86a6bd85ff #404
Nov 22 15:52:28 probook kernel: Hardware name: HP HP ProBook 645 G2/80FE, BIOS N77 Ver. 01.09 06/09/2017
Nov 22 15:52:28 probook kernel: Workqueue: events_unbound async_run_entry_fn
Nov 22 15:52:28 probook kernel: Call Trace:
Nov 22 15:52:28 probook kernel: dump_stack+0x99/0x11e
Nov 22 15:52:28 probook kernel: ? _atomic_dec_and_lock+0x152/0x152
Nov 22 15:52:28 probook kernel: print_address_description+0x65/0x270
Nov 22 15:52:28 probook kernel: kasan_report+0x272/0x360
Nov 22 15:52:28 probook kernel: ? ex_handler_refcount+0x135/0x170
Nov 22 15:52:28 probook kernel: ex_handler_refcount+0x135/0x170
Nov 22 15:52:28 probook kernel: ? ex_handler_clear_fs+0xa0/0xa0
Nov 22 15:52:28 probook kernel: fixup_exception+0x78/0xb0
Nov 22 15:52:28 probook kernel: do_trap+0x11c/0x380
Nov 22 15:52:28 probook kernel: do_error_trap+0x11c/0x350
Nov 22 15:52:28 probook kernel: ? fixup_bug.part.10+0x80/0x80
Nov 22 15:52:28 probook kernel: ? csum_partial_copy_generic+0x1309/0x2880
Nov 22 15:52:28 probook kernel: ? kasan_slab_free+0x87/0xc0
Nov 22 15:52:28 probook kernel: ? drm_atomic_helper_resume+0xbf/0x120
Nov 22 15:52:28 probook kernel: invalid_op+0x18/0x20
Nov 22 15:52:28 probook kernel: RIP: 0010:csum_partial_copy_generic+0x1309/0x2880
Nov 22 15:52:28 probook kernel: RSP: 0018:ffff8803d0c2f150 EFLAGS: 00010296
Nov 22 15:52:28 probook kernel: RAX: dffffc0000000000 RBX: ffff8803eecb0000 RCX: ffff8803e1be7840
Nov 22 15:52:28 probook kernel: RDX: 1ffff1007dd973b3 RSI: ffff8803d0c2f0f0 RDI: ffff8803e1be7840
Nov 22 15:52:28 probook kernel: RBP: ffff8803eecb9d98 R08: ffff8803f2db9348 R09: ffffffffb62c0aba
Nov 22 15:52:28 probook kernel: R10: 1ffff1007a185da1 R11: 1ffff1007a185e1f R12: dffffc0000000000
Nov 22 15:52:28 probook kernel: R13: 0000000000000000 R14: ffffed007a185e3b R15: ffff8803f2db9100
Nov 22 15:52:28 probook kernel: ? amdgpu_dm_update_connector_after_detect+0x650/0x650
Nov 22 15:52:28 probook kernel: amdgpu_device_resume+0x7d3/0x910
Nov 22 15:52:28 probook kernel: ? amdgpu_device_suspend+0xa20/0xa20
Nov 22 15:52:28 probook kernel: ? preempt_count_add+0xb9/0x140
Nov 22 15:52:28 probook kernel: ? pci_pm_freeze+0x310/0x310
Nov 22 15:52:28 probook kernel: dpm_run_callback+0xcb/0x460
Nov 22 15:52:28 probook kernel: ? initcall_debug_report.isra.8+0xe0/0xe0
Nov 22 15:52:28 probook kernel: ? __wake_up_common+0x650/0x650
Nov 22 15:52:28 probook kernel: ? _raw_spin_unlock_irqrestore+0xc2/0x130
Nov 22 15:52:28 probook kernel: device_resume+0x165/0x470
Nov 22 15:52:28 probook kernel: ? async_run_entry_fn+0x41a/0x690
Nov 22 15:52:28 probook kernel: ? device_resume+0x470/0x470
Nov 22 15:52:28 probook kernel: async_resume+0x14/0x40
Nov 22 15:52:28 probook kernel: async_run_entry_fn+0x16b/0x690
Nov 22 15:52:28 probook kernel: ? sched_clock_cpu+0x18/0x1e0
Nov 22 15:52:28 probook kernel: ? sched_clock_cpu+0x18/0x1e0
Nov 22 15:52:28 probook kernel: ? lowest_in_progress+0x190/0x190
Nov 22 15:52:28 probook kernel: ? pick_next_entity+0x194/0x400
Nov 22 15:52:28 probook kernel: ? pwq_dec_nr_in_flight+0x1ab/0x3c0
Nov 22 15:52:28 probook kernel: ? kthread_create_on_node+0x8b/0xc0
Nov 22 15:52:28 probook kernel: ? _raw_spin_unlock_irq+0xbe/0x120
Nov 22 15:52:28 probook kernel: ? _raw_spin_unlock+0x120/0x120
Nov 22 15:52:28 probook kernel: process_one_work+0x84b/0x1600
Nov 22 15:52:28 probook kernel: ? tick_nohz_dep_clear_signal+0x20/0x20
Nov 22 15:52:28 probook kernel: ? _raw_spin_unlock_irq+0xbe/0x120
Nov 22 15:52:28 probook kernel: ? _raw_spin_unlock+0x120/0x120
Nov 22 15:52:28 probook kernel: ? pwq_dec_nr_in_flight+0x3c0/0x3c0
Nov 22 15:52:28 probook kernel: ? arch_vtime_task_switch+0xee/0x190
Nov 22 15:52:28 probook kernel: ? finish_task_switch+0x27d/0x7f0
Nov 22 15:52:28 probook kernel: ? wq_worker_waking_up+0xc0/0xc0
Nov 22 15:52:28 probook kernel: ? copy_overflow+0x20/0x20
Nov 22 15:52:28 probook kernel: ? pci_mmcfg_check_reserved+0x100/0x100
Nov 22 15:52:28 probook kernel: ? pointer+0x8d0/0x8d0
Nov 22 15:52:28 probook kernel: ? remove_wait_queue+0x2b0/0x2b0
Nov 22 15:52:28 probook kernel: ? _raw_spin_unlock_irqrestore+0xc2/0x130
Nov 22 15:52:28 probook kernel: ? preempt_count_add+0xb9/0x140
Nov 22 15:52:28 probook kernel: ? trace_raw_output_tick_stop+0x110/0x110
Nov 22 15:52:28 probook kernel: ? schedule+0xfb/0x3b0
Nov 22 15:52:28 probook kernel: ? __schedule+0x19b0/0x19b0
Nov 22 15:52:28 probook kernel: ? _raw_spin_unlock_irq+0xbe/0x120
Nov 22 15:52:28 probook kernel: ? _raw_spin_unlock+0x120/0x120
Nov 22 15:52:28 probook kernel: ? task_change_group_fair+0x7e0/0x7e0
Nov 22 15:52:28 probook kernel: worker_thread+0x211/0x1790
Nov 22 15:52:28 probook kernel: ? unwind_next_frame+0x939/0x1e50
Nov 22 15:52:28 probook kernel: ? trace_event_raw_event_workqueue_work+0x170/0x170
Nov 22 15:52:28 probook kernel: ? __read_once_size_nocheck.constprop.6+0x10/0x10
Nov 22 15:52:28 probook kernel: ? tick_nohz_dep_clear_signal+0x20/0x20
Nov 22 15:52:28 probook kernel: ? _raw_spin_unlock_irq+0xbe/0x120
Nov 22 15:52:28 probook kernel: ? _raw_spin_unlock+0x120/0x120
Nov 22 15:52:28 probook kernel: ? compat_start_thread+0x70/0x70
Nov 22 15:52:28 probook kernel: ? finish_task_switch+0x27d/0x7f0
Nov 22 15:52:28 probook kernel: ? sched_clock_cpu+0x18/0x1e0
Nov 22 15:52:28 probook kernel: ? ret_from_fork+0x1f/0x30
Nov 22 15:52:28 probook kernel: ? pci_mmcfg_check_reserved+0x100/0x100
Nov 22 15:52:28 probook kernel: ? schedule+0xfb/0x3b0
Nov 22 15:52:28 probook kernel: ? __schedule+0x19b0/0x19b0
Nov 22 15:52:28 probook kernel: ? remove_wait_queue+0x2b0/0x2b0
Nov 22 15:52:28 probook kernel: ? memcg_kmem_get_cache+0x890/0x890
Nov 22 15:52:28 probook kernel: ? _raw_spin_unlock_irqrestore+0xc2/0x130
Nov 22 15:52:28 probook kernel: ? _raw_spin_unlock_irq+0x120/0x120
Nov 22 15:52:28 probook kernel: ? trace_event_raw_event_workqueue_work+0x170/0x170
Nov 22 15:52:28 probook kernel: kthread+0x2d4/0x390
Nov 22 15:52:28 probook kernel: ? kthread_create_worker+0xd0/0xd0
Nov 22 15:52:28 probook kernel: ret_from_fork+0x1f/0x30
Nov 22 15:52:28 probook kernel:
Nov 22 15:52:28 probook kernel: Allocated by task 2607:
Nov 22 15:52:28 probook kernel: kasan_kmalloc+0xa0/0xd0
Nov 22 15:52:28 probook kernel: kmem_cache_alloc_trace+0xd1/0x1e0
Nov 22 15:52:28 probook kernel: dm_atomic_state_alloc+0x39/0x70
Nov 22 15:52:28 probook kernel: drm_atomic_helper_duplicate_state+0x6f/0x2a0
Nov 22 15:52:28 probook kernel: drm_atomic_helper_suspend+0x9e/0x130
Nov 22 15:52:28 probook kernel: dm_suspend+0x8c/0x130
Nov 22 15:52:28 probook kernel: amdgpu_suspend+0xf0/0x440
Nov 22 15:52:28 probook kernel: amdgpu_device_suspend+0x51f/0xa20
Nov 22 15:52:28 probook kernel: pci_pm_suspend+0x220/0x450
Nov 22 15:52:28 probook kernel: dpm_run_callback+0xcb/0x460
Nov 22 15:52:28 probook kernel: __device_suspend+0x2e4/0xd40
Nov 22 15:52:28 probook kernel: async_suspend+0x15/0xd0
Nov 22 15:52:28 probook kernel: async_run_entry_fn+0x16b/0x690
Nov 22 15:52:28 probook kernel: process_one_work+0x84b/0x1600
Nov 22 15:52:28 probook kernel: worker_thread+0x211/0x1790
Nov 22 15:52:28 probook kernel: kthread+0x2d4/0x390
Nov 22 15:52:28 probook kernel: ret_from_fork+0x1f/0x30
Nov 22 15:52:28 probook kernel:
Nov 22 15:52:28 probook kernel: Freed by task 2619:
Nov 22 15:52:28 probook kernel: kasan_slab_free+0x71/0xc0
Nov 22 15:52:28 probook kernel: kfree+0x88/0x1b0
Nov 22 15:52:28 probook kernel: drm_atomic_helper_resume+0xbf/0x120
Nov 22 15:52:28 probook kernel: amdgpu_dm_display_resume+0x6e9/0xa40
Nov 22 15:52:28 probook kernel: amdgpu_device_resume+0x7d3/0x910
Nov 22 15:52:28 probook kernel: dpm_run_callback+0xcb/0x460
Nov 22 15:52:28 probook kernel: device_resume+0x165/0x470
Nov 22 15:52:28 probook kernel: async_resume+0x14/0x40
Nov 22 15:52:28 probook kernel: async_run_entry_fn+0x16b/0x690
Nov 22 15:52:28 probook kernel: process_one_work+0x84b/0x1600
Nov 22 15:52:28 probook kernel: worker_thread+0x211/0x1790
Nov 22 15:52:28 probook kernel: kthread+0x2d4/0x390
Nov 22 15:52:28 probook kernel: ret_from_fork+0x1f/0x30
Nov 22 15:52:28 probook kernel:
Nov 22 15:52:28 probook kernel: The buggy address belongs to the object at ffff8803e1be7840 which belongs to the cache kmalloc-128 of size 128
Nov 22 15:52:28 probook kernel: The buggy address is located 0 bytes inside of 128-byte region [ffff8803e1be7840, ffff8803e1be78c0)
Nov 22 15:52:28 probook kernel: The buggy address belongs to the page:
Nov 22 15:52:28 probook kernel: page:ffffea000f86f9c0 count:1 mapcount:0 mapping: (null) index:0xffff8803e1be7000
Nov 22 15:52:28 probook kernel: flags: 0x2000000000000100(slab)
Nov 22 15:52:28 probook kernel: raw: 2000000000000100 0000000000000000 ffff8803e1be7000 0000000180150010
Nov 22 15:52:28 probook kernel: raw: 0000000000000000 0000000500000001 ffff8803f3403340 0000000000000000
Nov 22 15:52:28 probook kernel: page dumped because: kasan: bad access detected
Nov 22 15:52:28 probook kernel:
Nov 22 15:52:28 probook kernel: Memory state around the buggy address:
Nov 22 15:52:28 probook kernel:  ffff8803e1be7700: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
Nov 22 15:52:28 probook kernel:  ffff8803e1be7780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Nov 22 15:52:28 probook kernel: >ffff8803e1be7800: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
Nov 22 15:52:28 probook kernel:                                            ^
Nov 22 15:52:28 probook kernel:  ffff8803e1be7880: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
Nov 22 15:52:28 probook kernel:  ffff8803e1be7900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
Nov 22 15:52:28 probook kernel: ==================================================================
Nov 22 15:52:28 probook kernel: Disabling lock debugging due to kernel taint

--
Regards,
Johannes