To narrow things down: it's likely something in the CZ code paths, since it
still crashes with the Polaris10 removed.

Tom

On 25/09/17 01:55 PM, Tom St Denis wrote:
> This change
>
> commit f96306921d5e346ebc82c7c51ae6e0b736e5b425
> Author: Rex Zhu <Rex.Zhu at amd.com>
> Date:  Wed Sep 20 14:44:55 2017 +0800
>
>    drm/amd/powerplay: refine powerplay code.
>
>    delete struct smumgr, put smu backend function table
>    in struct hwmgr
>
>    Change-Id: I7b73ef062b147b4e7199105a3c101f6c8038cc57
>    Reviewed-by: Alex Deucher <alexander.deucher at amd.com>
>    Signed-off-by: Rex Zhu <Rex.Zhu at amd.com>
>
>
> Results in this dmesg log error messages on my Carrizo + Polaris10 setup:
>
> [  24.237785] [drm] amdgpu kernel modesetting enabled.
> [  24.237814] checking generic (c0000000 7e9000) vs hw (e0000000 10000000)
> [  24.237864] amdgpu 0000:00:01.0: enabling device (0006 -> 0007)
> [  24.238366] [drm] initializing kernel modesetting (CARRIZO
> 0x1002:0x9874 0x1002:0x1E10 0xE1).
> [  24.238394] [drm] register mmio base: 0xD1300000
> [  24.238394] [drm] register mmio size: 262144
> [  24.238463] ACPI Error: [\_SB_.ALIB] Namespace lookup failure,
> AE_NOT_FOUND (20170531/psargs-364)
> [  24.238497] ACPI Error: Method parse/execution failed
> \_SB.PCI0.VGA.ATC0, AE_NOT_FOUND (20170531/psparse-550)
> [  24.238528] ACPI Error: Method parse/execution failed
> \_SB.PCI0.VGA.ATCS, AE_NOT_FOUND (20170531/psparse-550)
> [  24.238558] [drm] UVD is enabled in physical mode
> [  24.238561] [drm] VCE enabled in physical mode
> [  24.250365] ATOM BIOS: 109-C95010-001
> [  24.250381] [drm] GPU post is not needed
> [  24.250407] [drm] vm size is 64 GB, block size is 13-bit, fragment
> size is 9-bit
> [  24.250412] amdgpu 0000:00:01.0: VRAM: 512M 0x000000F400000000 -
> 0x000000F41FFFFFFF (512M used)
> [  24.250413] amdgpu 0000:00:01.0: GTT: 1024M 0x0000000000000000 -
> 0x000000003FFFFFFF
> [  24.250420] [drm] Detected VRAM RAM=512M, BAR=512M
> [  24.250421] [drm] RAM width 64bits UNKNOWN
> [  24.250795] [TTM] Zone kernel: Available graphics memory: 3846244 kiB
> [  24.250797] [TTM] Zone  dma32: Available graphics memory: 2097152 kiB
> [  24.250797] [TTM] Initializing pool allocator
> [  24.250801] [TTM] Initializing DMA pool allocator
> [  24.250844] [drm] amdgpu: 512M of VRAM memory ready
> [  24.250845] [drm] amdgpu: 3072M of GTT memory ready.
> [  24.250860] [drm] GART: num cpu pages 262144, num gpu pages 262144
> [  24.250970] [drm] PCIE GART of 1024M enabled (table at
> 0x000000F400040000).
> [  24.251017] amdgpu 0000:00:01.0: amdgpu: using MSI.
> [  24.251034] [drm] amdgpu: irq initialized.
> [  24.251037] amdgpu: [powerplay] amdgpu: powerplay sw initialized
> [  24.254140] [drm] Chained IB support enabled!
> [  24.257056] amdgpu 0000:00:01.0: fence driver on ring 0 use gpu addr
> 0x0000000000400080, cpu addr 0xffffc9000105d080
> [  24.257196] amdgpu 0000:00:01.0: fence driver on ring 1 use gpu addr
> 0x0000000000400100, cpu addr 0xffffc9000105d100
> [  24.257922] amdgpu 0000:00:01.0: fence driver on ring 2 use gpu addr
> 0x0000000000400180, cpu addr 0xffffc9000105d180
> [  24.258053] amdgpu 0000:00:01.0: fence driver on ring 3 use gpu addr
> 0x0000000000400200, cpu addr 0xffffc9000105d200
> [  24.258115] amdgpu 0000:00:01.0: fence driver on ring 4 use gpu addr
> 0x0000000000400280, cpu addr 0xffffc9000105d280
> [  24.258146] amdgpu 0000:00:01.0: fence driver on ring 5 use gpu addr
> 0x0000000000400300, cpu addr 0xffffc9000105d300
> [  24.258353] amdgpu 0000:00:01.0: fence driver on ring 6 use gpu addr
> 0x0000000000400380, cpu addr 0xffffc9000105d380
> [  24.258426] amdgpu 0000:00:01.0: fence driver on ring 7 use gpu addr
> 0x0000000000400400, cpu addr 0xffffc9000105d400
> [  24.258484] amdgpu 0000:00:01.0: fence driver on ring 8 use gpu addr
> 0x0000000000400480, cpu addr 0xffffc9000105d480
> [  24.258528] amdgpu 0000:00:01.0: fence driver on ring 9 use gpu addr
> 0x0000000000400520, cpu addr 0xffffc9000105d520
> [  24.260159] amdgpu 0000:00:01.0: fence driver on ring 10 use gpu addr
> 0x00000000004005a0, cpu addr 0xffffc9000105d5a0
> [  24.260508] amdgpu 0000:00:01.0: fence driver on ring 11 use gpu addr
> 0x0000000000400620, cpu addr 0xffffc9000105d620
> [  24.261591] [drm] Found UVD firmware Version: 1.91 Family ID: 11
> [  24.262451] amdgpu 0000:00:01.0: fence driver on ring 12 use gpu addr
> 0x000000f400296560, cpu addr 0xffffc90003442560
> [  24.263350] [drm] Found VCE firmware Version: 52.4 Binary ID: 3
> [  24.263819] amdgpu 0000:00:01.0: fence driver on ring 13 use gpu addr
> 0x0000000000400720, cpu addr 0xffffc9000105d720
> [  24.263921] amdgpu 0000:00:01.0: fence driver on ring 14 use gpu addr
> 0x00000000004007a0, cpu addr 0xffffc9000105d7a0
> [  24.264438] amdgpu: [powerplay] Fail to get clock table from SMU!
> [  24.264440] amdgpu: [powerplay] amdgpu: powerplay initialization failed
> [  24.264467] [drm] DAL is enabled
> [  24.264835] [drm] DC: create_links: connectors_num: physical:3,
> virtual:0
> [  24.264839] [drm] Connector[0] description:signal 32
> [  24.264842] [drm] Using channel: CHANNEL_ID_DDC1 [1]
> [  24.264851] [drm] Connector[1] description:signal 4
> [  24.264853] [drm] Using channel: CHANNEL_ID_DDC2 [2]
> [  24.264860] [drm] Connector[2] description:signal 4
> [  24.264862] [drm] Using channel: CHANNEL_ID_DDC3 [3]
> [  24.564284] [drm:hwss_wait_for_blank_complete [amdgpu]] *ERROR* DC:
> failed to blank crtc!
> [  24.564329] [drm] Display Core initialized
> [  24.564332] [drm] amdgpu: freesync_module init done ffff88021048afe0.
> [  24.564564] [drm] link=0, dc_sink_in=         (null) is now
> Disconnected
> [  24.564565] [drm] DCHPD: connector_id=0: dc_sink didn't change.
> [  24.564624] [drm] link=1, dc_sink_in=         (null) is now
> Disconnected
> [  24.564624] [drm] DCHPD: connector_id=1: dc_sink didn't change.
> [  24.564738] [drm] link=2, dc_sink_in=         (null) is now
> Disconnected
> [  24.564739] [drm] DCHPD: connector_id=2: dc_sink didn't change.
> [  24.564751] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
> [  24.564752] [drm] Driver supports precise vblank timestamp query.
> [  24.564752] [drm] KMS initialized.
> [  24.566110] [drm] ring test on 0 succeeded in 13 usecs
> [  24.755765] [drm:gfx_v8_0_kiq_resume [amdgpu]] *ERROR* KCQ enable
> failed (scratch(0xC040)=0xCAFEDEAD)
> [  24.755819] [drm:amdgpu_device_init [amdgpu]] *ERROR* hw_init of IP
> block <gfx_v8_0> failed -22
> [  24.755839] amdgpu 0000:00:01.0: amdgpu_init failed
> [  24.756271] BUG: unable to handle kernel NULL pointer dereference at
>         (null)
> [  24.756302] IP:          (null)
> [  24.756312] PGD 2134b3067
> [  24.756312] P4D 2134b3067
> [  24.756320] PUD 0
>
> [  24.756340] Oops: 0010 [#1] SMP
> [  24.756349] Modules linked in: amdgpu(+) chash ttm ax88179_178a
> usbnet xhci_pci xhci_hcd efivarfs
> [  24.756380] CPU: 3 PID: 3021 Comm: modprobe Not tainted 4.13.0-rc5+ #33
> [  24.756396] Hardware name: AMD Myrtle/Myrtle, BIOS TMY1100A 03/23/2016
> [  24.756413] task: ffff8802132744c0 task.stack: ffffc90000fd0000
> [  24.756427] RIP: 0010:         (null)
> [  24.756437] RSP: 0018:ffffc90000fd3908 EFLAGS: 00010202
> [  24.756450] RAX: ffff88021048a460 RBX: ffff8802100258a0 RCX:
> 000000018020000d
> [  24.756466] RDX: 000000018020000e RSI: 0000000000005c02 RDI:
> ffff88021048a5a0
> [  24.756482] RBP: ffffc90000fd3928 R08: ffff880210f9e580 R09:
> 000000018020000d
> [  24.756499] R10: ffffc90000fd3948 R11: ffffea0008525e00 R12:
> 0000000000005c02
> [  24.756516] R13: ffff88021365b690 R14: ffff880211db0040 R15:
> ffff880211db2f30
> [  24.756534] FS: 00007ffa8be38700(0000) GS:ffff88021ed80000(0000)
> knlGS:0000000000000000
> [  24.756554] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  24.756569] CR2: 0000000000000000 CR3: 0000000210030000 CR4:
> 00000000001406e0
> [  24.756586] Call Trace:
> [  24.756745] ? destroy+0x31/0x100 [amdgpu]
> [  24.756822] dal_i2caux_destruct+0x5d/0x90 [amdgpu]
> [  24.756875] destroy+0x15/0x30 [amdgpu]
> [  24.756925] dal_i2caux_destroy+0x1b/0x30 [amdgpu]
> [  24.756977] destruct+0x90/0x140 [amdgpu]
> [  24.757028] dc_destroy+0x10/0x30 [amdgpu]
> [  24.757083] amdgpu_dm_fini+0x62/0x70 [amdgpu]
> [  24.757137] dm_hw_fini+0x1d/0x30 [amdgpu]
> [  24.757183] amdgpu_fini+0xe8/0x330 [amdgpu]
> [  24.757229] amdgpu_device_init+0xe5a/0x1560 [amdgpu]
> [  24.757245] ? kmalloc_order_trace+0x29/0xd0
> [  24.757290] ? amdgpu_driver_load_kms+0x53/0x200 [amdgpu]
> [  24.757338] amdgpu_driver_load_kms+0x78/0x200 [amdgpu]
> [  24.757353] drm_dev_register+0x141/0x1d0
> [  24.757393] amdgpu_pci_probe+0x113/0x140 [amdgpu]
> [  24.757406] local_pci_probe+0x40/0xa0
> [  24.757416] pci_device_probe+0xaa/0x130
> [  24.757426] driver_probe_device+0x23e/0x2d0
> [  24.757437] __driver_attach+0x96/0xa0
> [  24.757446] ? driver_probe_device+0x2d0/0x2d0
> [  24.757457] bus_for_each_dev+0x5b/0x90
> [  24.757467] driver_attach+0x19/0x20
> [  24.757476] bus_add_driver+0x11c/0x220
> [  24.757485] driver_register+0x5b/0xd0
> [  24.757495] __pci_register_driver+0x47/0x50
> [  24.757532] amdgpu_init+0x88/0x9b [amdgpu]
> [  24.757544] ? 0xffffffffa030a000
> [  24.757554] do_one_initcall+0x3e/0x160
> [  24.757566] ? __vunmap+0x7c/0xb0
> [  24.757577] ? kfree+0x147/0x160
> [  24.757587] ? kmem_cache_alloc_trace+0x33/0x150
> [  24.757602] do_init_module+0x5a/0x1f1
> [  24.757614] load_module+0x2329/0x28d0
> [  24.758259] ? kernel_read_file+0x19e/0x1c0
> [  24.758898] SYSC_finit_module+0xba/0xc0
> [  24.759524] ? SYSC_finit_module+0xba/0xc0
> [  24.760206] SyS_finit_module+0x9/0x10
> [  24.760835] entry_SYSCALL_64_fastpath+0x13/0x94
> [  24.761450] RIP: 0033:0x7ffa8b310219
> [  24.762137] RSP: 002b:00007ffe64b86b18 EFLAGS: 00000246 ORIG_RAX:
> 0000000000000139
> [  24.762851] RAX: ffffffffffffffda RBX: 00000055ee325090 RCX:
> 00007ffa8b310219
> [  24.763487] RDX: 0000000000000000 RSI: 00000055edf2d2a6 RDI:
> 0000000000000005
> [  24.764116] RBP: 00000055ee326f50 R08: 0000000000000000 R09:
> 0000000000000000
> [  24.764716] R10: 0000000000000005 R11: 0000000000000246 R12:
> 00000055ee3252f0
> [  24.765298] R13: 00007ffe64b86ad8 R14: 00007ffe64b86ae0 R15:
> 0000000000000000
> [  24.765878] Code: Bad RIP value.
> [  24.766464] RIP:          (null) RSP: ffffc90000fd3908
> [  24.767036] CR2: 0000000000000000
> [  24.767717] ---[ end trace 636f871b29b747e7 ]---
> _______________________________________________
> amd-gfx mailing list
> amd-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx