Does this patch help by any chance?
https://cgit.freedesktop.org/~agd5f/linux/commit/?h=amd-staging-drm-next&id=5e01c09ce3b7263d88873105f21a82eda904664b

Alex

On Thu, Jan 3, 2019 at 7:14 AM Luís Mendes <luis.p.mendes@xxxxxxxxx> wrote:
>
> Hi Christian, Alex,
>
> I've set the kernel command line with drm.debug=0xf, and I see what
> could be a race condition that triggers the failure. From what I
> see, the critical path is well after the ring tests. This happens on
> ARM, but it may also be what is affecting my TYAN S7002 and S7025, as
> the failure symptom seems similar, except that it fails every time on
> the TYANs. On an AsRock Rack EP2C602 with Xeon E5 v2 it works
> fine.
>
> Below are the two log excerpts, the first from a working
> initialization attempt and the second from a failed initialization
> attempt. Both attempts were made with vanilla kernel 4.20.0 on the
> same armhf system. Full dmesg logs attached. Please ignore the EDID
> errors, as I'm having a problem with this particular CROWN TV. The
> EDID gets overwritten at every boot when connected to any Radeon RX
> card that I have tried, while with a Radeon R7 240 the EDID is not
> corrupted on boot, but that's another story.
>
> Meanwhile I will try to find the concrete race condition. It is
> noticeable that for some reason the kernel thread
> [drm:amdgpu_ih_process [amdgpu]] doesn't receive updates due to the
> gpu hang, and only one EOP irq is received on the bad boot attempt,
> while on the good attempt 3 EOP irqs are triggered.
>
> Good attempt (critical log excerpt from kern_good.log):
> Jan 3 11:28:03 picolo kernel: [ 39.845747] [drm:amdgpu_ih_process [amdgpu]] amdgpu_ih_process: rptr 16032, wptr 16048
> Jan 3 11:28:03 picolo kernel: [ 39.845987] [drm:drm_calc_vbltimestamp_from_scanoutpos [drm]] crtc 0: Noisy timestamp 26 us > 20 us [3 reps].
> Jan 3 11:28:03 picolo kernel: [ 39.850430] [drm:drm_ioctl [drm]] > pid=627, dev=0xe200, auth=1, AMDGPU_BO_LIST > Jan 3 11:28:03 picolo kernel: [ 39.850489] [drm:drm_ioctl [drm]] > pid=627, dev=0xe200, auth=1, AMDGPU_CS > Jan 3 11:28:03 picolo kernel: [ 39.850697] [drm:drm_ioctl [drm]] > pid=627, dev=0xe200, auth=1, AMDGPU_BO_LIST > Jan 3 11:28:03 picolo kernel: [ 39.850943] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 16048, wptr 16080 > Jan 3 11:28:03 picolo kernel: [ 39.850973] [drm:drm_ioctl [drm]] > pid=627, dev=0xe200, auth=1, AMDGPU_BO_LIST > Jan 3 11:28:03 picolo kernel: [ 39.851133] > [drm:sdma_v3_0_process_trap_irq [amdgpu]] IH: SDMA trap > Jan 3 11:28:03 picolo kernel: [ 39.851159] [drm:drm_ioctl [drm]] > pid=627, dev=0xe200, auth=1, AMDGPU_CS > Jan 3 11:28:03 picolo kernel: [ 39.851333] > [drm:sdma_v3_0_process_trap_irq [amdgpu]] IH: SDMA trap > Jan 3 11:28:03 picolo kernel: [ 39.851360] [drm:drm_ioctl [drm]] > pid=627, dev=0xe200, auth=1, AMDGPU_BO_LIST > Jan 3 11:28:03 picolo kernel: [ 39.851513] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 16080, wptr 16096 > Jan 3 11:28:03 picolo kernel: [ 39.851657] > [drm:sdma_v3_0_process_trap_irq [amdgpu]] IH: SDMA trap > Jan 3 11:28:03 picolo kernel: [ 39.851810] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 16096, wptr 16096 > Jan 3 11:28:03 picolo kernel: [ 39.851950] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 16096, wptr 16128 > Jan 3 11:28:03 picolo kernel: [ 39.852091] > [drm:sdma_v3_0_process_trap_irq [amdgpu]] IH: SDMA trap > Jan 3 11:28:03 picolo kernel: [ 39.852239] > [drm:sdma_v3_0_process_trap_irq [amdgpu]] IH: SDMA trap > Jan 3 11:28:03 picolo kernel: [ 39.852265] [drm:drm_ioctl [drm]] > pid=605, dev=0xe200, auth=1, AMDGPU_WAIT_CS > Jan 3 11:28:03 picolo kernel: [ 39.852411] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 16128, wptr 16128 > Jan 3 11:28:03 picolo kernel: [ 39.852605] [drm:amdgpu_ih_process > [amdgpu]] 
amdgpu_ih_process: rptr 16128, wptr 16144 > Jan 3 11:28:03 picolo kernel: [ 39.852754] > [drm:sdma_v3_0_process_trap_irq [amdgpu]] IH: SDMA trap > Jan 3 11:28:03 picolo kernel: [ 39.852905] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 16144, wptr 16160 > Jan 3 11:28:03 picolo kernel: [ 39.853049] > [drm:sdma_v3_0_process_trap_irq [amdgpu]] IH: SDMA trap > Jan 3 11:28:03 picolo kernel: [ 39.853210] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 16160, wptr 16160 > Jan 3 11:28:03 picolo kernel: [ 39.853418] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 16160, wptr 16176 > Jan 3 11:28:03 picolo kernel: [ 39.853582] [drm:gfx_v8_0_eop_irq > [amdgpu]] IH: CP EOP > Jan 3 11:28:03 picolo kernel: [ 39.853752] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 16176, wptr 16208 > Jan 3 11:28:03 picolo kernel: [ 39.853901] [drm:gfx_v8_0_eop_irq > [amdgpu]] IH: CP EOP > Jan 3 11:28:03 picolo kernel: [ 39.854044] [drm:gfx_v8_0_eop_irq > [amdgpu]] IH: CP EOP > Jan 3 11:28:03 picolo kernel: [ 39.854205] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 16208, wptr 16208 > Jan 3 11:28:03 picolo kernel: [ 39.857057] [drm:drm_ioctl [drm]] > pid=605, dev=0xe200, auth=1, DRM_IOCTL_MODE_SETCRTC > Jan 3 11:28:03 picolo kernel: [ 39.857089] [drm:drm_mode_setcrtc > [drm]] [CRTC:45:crtc-1] > Jan 3 11:28:03 picolo kernel: [ 39.857341] > [drm:dm_plane_helper_prepare_fb [amdgpu]] No FB bound > Jan 3 11:28:03 picolo kernel: [ 39.857508] > [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] amdgpu_crtc id:1 > crtc_state_flags: enable:0, active:0, planes_changed:0, > mode_changed:0,active_changed:0,connectors_changed:0 > Jan 3 11:28:03 picolo kernel: [ 39.857559] [drm:drm_ioctl [drm]] > pid=605, dev=0xe200, auth=1, DRM_IOCTL_MODE_SETCRTC > Jan 3 11:28:03 picolo kernel: [ 39.857587] [drm:drm_mode_setcrtc > [drm]] [CRTC:47:crtc-2] > Jan 3 11:28:03 picolo kernel: [ 39.857769] > [drm:dm_plane_helper_prepare_fb [amdgpu]] No FB bound > Jan 3 11:28:03 
picolo kernel: [ 39.857944] > [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] amdgpu_crtc id:2 > crtc_state_flags: enable:0, active:0, planes_changed:0, > mode_changed:0,active_changed:0,connectors_changed:0 > Jan 3 11:28:03 picolo kernel: [ 39.857992] [drm:drm_ioctl [drm]] > pid=605, dev=0xe200, auth=1, DRM_IOCTL_MODE_SETCRTC > Jan 3 11:28:03 picolo kernel: [ 39.858020] [drm:drm_mode_setcrtc > [drm]] [CRTC:49:crtc-3] > > BAD attempt (critical log excerpt from kern_bad.log): > Jan 3 11:39:23 picolo kernel: [ 39.599313] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 14240, wptr 14256 > Jan 3 11:39:23 picolo kernel: [ 39.599496] > [drm:drm_calc_vbltimestamp_from_scanoutpos [drm]] crtc 0: Noisy > timestamp 26 us > 20 us [3 reps]. > Jan 3 11:39:23 picolo kernel: [ 39.599599] [drm:drm_ioctl [drm]] > pid=663, dev=0xe200, auth=1, AMDGPU_BO_LIST > Jan 3 11:39:23 picolo kernel: [ 39.599640] [drm:drm_ioctl [drm]] > pid=663, dev=0xe200, auth=1, AMDGPU_CS > Jan 3 11:39:23 picolo kernel: [ 39.599992] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 14256, wptr 14272 > Jan 3 11:39:23 picolo kernel: [ 39.600142] > [drm:sdma_v3_0_process_trap_irq [amdgpu]] IH: SDMA trap > Jan 3 11:39:23 picolo kernel: [ 39.600297] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 14272, wptr 14304 > Jan 3 11:39:23 picolo kernel: [ 39.600439] > [drm:sdma_v3_0_process_trap_irq [amdgpu]] IH: SDMA trap > Jan 3 11:39:23 picolo kernel: [ 39.600580] > [drm:sdma_v3_0_process_trap_irq [amdgpu]] IH: SDMA trap > Jan 3 11:39:23 picolo kernel: [ 39.600725] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 14304, wptr 14304 > Jan 3 11:39:23 picolo kernel: [ 39.600795] [drm:drm_ioctl [drm]] > pid=663, dev=0xe200, auth=1, AMDGPU_BO_LIST > Jan 3 11:39:23 picolo kernel: [ 39.600846] [drm:drm_ioctl [drm]] > pid=663, dev=0xe200, auth=1, AMDGPU_BO_LIST > Jan 3 11:39:23 picolo kernel: [ 39.600881] [drm:drm_ioctl [drm]] > pid=663, dev=0xe200, auth=1, AMDGPU_CS > Jan 3 11:39:23 picolo 
kernel: [ 39.601019] [drm:drm_ioctl [drm]] > pid=663, dev=0xe200, auth=1, AMDGPU_BO_LIST > Jan 3 11:39:23 picolo kernel: [ 39.601074] [drm:drm_ioctl [drm]] > pid=630, dev=0xe200, auth=1, AMDGPU_WAIT_CS > Jan 3 11:39:23 picolo kernel: [ 39.601269] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 14304, wptr 14320 > Jan 3 11:39:23 picolo kernel: [ 39.601416] > [drm:sdma_v3_0_process_trap_irq [amdgpu]] IH: SDMA trap > Jan 3 11:39:23 picolo kernel: [ 39.601569] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 14320, wptr 14384 > Jan 3 11:39:23 picolo kernel: [ 39.601595] [drm:drm_ioctl [drm]] > pid=630, dev=0xe200, auth=1, AMDGPU_WAIT_CS > Jan 3 11:39:23 picolo kernel: [ 39.601738] > [drm:sdma_v3_0_process_trap_irq [amdgpu]] IH: SDMA trap > Jan 3 11:39:23 picolo kernel: [ 39.601880] > [drm:sdma_v3_0_process_trap_irq [amdgpu]] IH: SDMA trap > Jan 3 11:39:23 picolo kernel: [ 39.602029] > [drm:sdma_v3_0_process_trap_irq [amdgpu]] IH: SDMA trap > Jan 3 11:39:23 picolo kernel: [ 39.602171] > [drm:sdma_v3_0_process_trap_irq [amdgpu]] IH: SDMA trap > Jan 3 11:39:23 picolo kernel: [ 39.602313] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 14384, wptr 14384 > Jan 3 11:39:23 picolo kernel: [ 39.602500] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 14384, wptr 14400 > Jan 3 11:39:23 picolo kernel: [ 39.602649] > [drm:sdma_v3_0_process_trap_irq [amdgpu]] IH: SDMA trap > Jan 3 11:39:23 picolo kernel: [ 39.602887] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 14400, wptr 14416 > Jan 3 11:39:23 picolo kernel: [ 39.603054] [drm:gfx_v8_0_eop_irq > [amdgpu]] IH: CP EOP > Jan 3 11:39:23 picolo kernel: [ 39.615864] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 14416, wptr 14432 > Jan 3 11:39:23 picolo kernel: [ 39.632542] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 14432, wptr 14448 > Jan 3 11:39:23 picolo kernel: [ 39.649264] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 14448, wptr 14464 
> Jan 3 11:39:23 picolo kernel: [ 39.665943] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 14464, wptr 14480 > Jan 3 11:39:23 picolo kernel: [ 39.682610] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 14480, wptr 14496 > Jan 3 11:39:23 picolo kernel: [ 39.699285] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 14496, wptr 14512 > Jan 3 11:39:23 picolo kernel: [ 39.715955] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 14512, wptr 14528 > Jan 3 11:39:23 picolo kernel: [ 39.732629] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 14528, wptr 14544 > Jan 3 11:39:23 picolo kernel: [ 39.749313] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 14544, wptr 14560 > Jan 3 11:39:23 picolo kernel: [ 39.765995] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 14560, wptr 14576 > Jan 3 11:39:23 picolo kernel: [ 39.782667] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 14576, wptr 14592 > Jan 3 11:39:23 picolo kernel: [ 39.799363] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 14592, wptr 14608 > Jan 3 11:39:23 picolo kernel: [ 39.816043] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 14608, wptr 14624 > Jan 3 11:39:23 picolo kernel: [ 39.832734] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 14624, wptr 14640 > Jan 3 11:39:23 picolo kernel: [ 39.849426] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 14640, wptr 14656 > Jan 3 11:39:23 picolo kernel: [ 39.866081] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 14656, wptr 14672 > Jan 3 11:39:23 picolo kernel: [ 39.882822] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 14672, wptr 14688 > Jan 3 11:39:23 picolo kernel: [ 39.899455] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 14688, wptr 14704 > Jan 3 11:39:23 picolo kernel: [ 39.916190] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 14704, wptr 14720 > Jan 3 11:39:23 picolo kernel: [ 39.932885] 
[drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 14720, wptr 14736 > Jan 3 11:39:23 picolo kernel: [ 39.949589] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 14736, wptr 14752 > Jan 3 11:39:23 picolo kernel: [ 39.966238] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 14752, wptr 14768 > Jan 3 11:39:23 picolo kernel: [ 39.982869] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 14768, wptr 14784 > Jan 3 11:39:23 picolo kernel: [ 39.999609] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 14784, wptr 14800 > Jan 3 11:39:23 picolo kernel: [ 40.016286] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 14800, wptr 14816 > Jan 3 11:39:23 picolo kernel: [ 40.033045] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 14816, wptr 14832 > Jan 3 11:39:23 picolo kernel: [ 40.049716] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 14832, wptr 14848 > Jan 3 11:39:23 picolo kernel: [ 40.066446] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 14848, wptr 14864 > Jan 3 11:39:23 picolo kernel: [ 40.083031] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 14864, wptr 14880 > Jan 3 11:39:23 picolo kernel: [ 40.099765] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 14880, wptr 14896 > Jan 3 11:39:23 picolo kernel: [ 40.116394] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 14896, wptr 14912 > Jan 3 11:39:23 picolo kernel: [ 40.133133] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 14912, wptr 14928 > Jan 3 11:39:23 picolo kernel: [ 40.149743] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 14928, wptr 14944 > Jan 3 11:39:23 picolo kernel: [ 40.166426] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 14944, wptr 14960 > Jan 3 11:39:23 picolo kernel: [ 40.183178] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 14960, wptr 14976 > Jan 3 11:39:23 picolo kernel: [ 40.199788] [drm:amdgpu_ih_process > [amdgpu]] 
amdgpu_ih_process: rptr 14976, wptr 14992 > Jan 3 11:39:23 picolo kernel: [ 40.216507] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 14992, wptr 15008 > Jan 3 11:39:23 picolo kernel: [ 40.233150] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15008, wptr 15024 > Jan 3 11:39:23 picolo kernel: [ 40.249815] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15024, wptr 15040 > Jan 3 11:39:23 picolo kernel: [ 40.266454] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15040, wptr 15056 > Jan 3 11:39:23 picolo kernel: [ 40.283123] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15056, wptr 15072 > Jan 3 11:39:23 picolo kernel: [ 40.299804] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15072, wptr 15088 > Jan 3 11:39:23 picolo kernel: [ 40.316483] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15088, wptr 15104 > Jan 3 11:39:23 picolo kernel: [ 40.333164] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15104, wptr 15120 > Jan 3 11:39:23 picolo kernel: [ 40.349843] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15120, wptr 15136 > Jan 3 11:39:23 picolo kernel: [ 40.366523] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15136, wptr 15152 > Jan 3 11:39:23 picolo kernel: [ 40.383200] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15152, wptr 15168 > Jan 3 11:39:23 picolo kernel: [ 40.399878] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15168, wptr 15184 > Jan 3 11:39:23 picolo kernel: [ 40.416561] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15184, wptr 15200 > Jan 3 11:39:23 picolo kernel: [ 40.433245] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15200, wptr 15216 > Jan 3 11:39:23 picolo kernel: [ 40.449925] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15216, wptr 15232 > Jan 3 11:39:23 picolo kernel: [ 40.466613] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15232, wptr 15248 > Jan 
3 11:39:24 picolo kernel: [ 40.483291] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15248, wptr 15264 > Jan 3 11:39:24 picolo kernel: [ 40.499971] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15264, wptr 15280 > Jan 3 11:39:24 picolo kernel: [ 40.516652] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15280, wptr 15296 > Jan 3 11:39:24 picolo kernel: [ 40.533336] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15296, wptr 15312 > Jan 3 11:39:24 picolo kernel: [ 40.550016] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15312, wptr 15328 > Jan 3 11:39:24 picolo kernel: [ 40.566715] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15328, wptr 15344 > Jan 3 11:39:24 picolo kernel: [ 40.583390] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15344, wptr 15360 > Jan 3 11:39:24 picolo kernel: [ 40.600065] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15360, wptr 15376 > Jan 3 11:39:24 picolo kernel: [ 40.616745] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15376, wptr 15392 > Jan 3 11:39:24 picolo kernel: [ 40.633432] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15392, wptr 15408 > Jan 3 11:39:24 picolo kernel: [ 40.650113] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15408, wptr 15424 > Jan 3 11:39:24 picolo kernel: [ 40.666790] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15424, wptr 15440 > Jan 3 11:39:24 picolo kernel: [ 40.683477] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15440, wptr 15456 > Jan 3 11:39:24 picolo kernel: [ 40.700157] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15456, wptr 15472 > Jan 3 11:39:24 picolo kernel: [ 40.716836] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15472, wptr 15488 > Jan 3 11:39:24 picolo kernel: [ 40.733522] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15488, wptr 15504 > Jan 3 11:39:24 picolo kernel: [ 40.750203] 
[drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15504, wptr 15520 > Jan 3 11:39:24 picolo kernel: [ 40.766882] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15520, wptr 15536 > Jan 3 11:39:24 picolo kernel: [ 40.783563] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15536, wptr 15552 > Jan 3 11:39:24 picolo kernel: [ 40.800247] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15552, wptr 15568 > Jan 3 11:39:24 picolo kernel: [ 40.816929] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15568, wptr 15584 > Jan 3 11:39:24 picolo kernel: [ 40.833633] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15584, wptr 15600 > Jan 3 11:39:24 picolo kernel: [ 40.850305] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15600, wptr 15616 > Jan 3 11:39:24 picolo kernel: [ 40.867011] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15616, wptr 15632 > Jan 3 11:39:24 picolo kernel: [ 40.883676] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15632, wptr 15648 > Jan 3 11:39:24 picolo kernel: [ 40.900346] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15648, wptr 15664 > Jan 3 11:39:24 picolo kernel: [ 40.917026] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15664, wptr 15680 > Jan 3 11:39:24 picolo kernel: [ 40.933716] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15680, wptr 15696 > Jan 3 11:39:24 picolo kernel: [ 40.950390] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15696, wptr 15712 > Jan 3 11:39:24 picolo kernel: [ 40.967070] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15712, wptr 15728 > Jan 3 11:39:24 picolo kernel: [ 40.983757] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15728, wptr 15744 > Jan 3 11:39:24 picolo kernel: [ 41.000438] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15744, wptr 15760 > Jan 3 11:39:24 picolo kernel: [ 41.017115] [drm:amdgpu_ih_process > [amdgpu]] 
amdgpu_ih_process: rptr 15760, wptr 15776 > Jan 3 11:39:24 picolo kernel: [ 41.033812] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15776, wptr 15792 > Jan 3 11:39:24 picolo kernel: [ 41.050485] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15792, wptr 15808 > Jan 3 11:39:24 picolo kernel: [ 41.067162] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15808, wptr 15824 > Jan 3 11:39:24 picolo kernel: [ 41.083845] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15824, wptr 15840 > Jan 3 11:39:24 picolo kernel: [ 41.100523] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15840, wptr 15856 > Jan 3 11:39:24 picolo kernel: [ 41.117205] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15856, wptr 15872 > Jan 3 11:39:24 picolo kernel: [ 41.133904] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15872, wptr 15888 > Jan 3 11:39:24 picolo kernel: [ 41.150579] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15888, wptr 15904 > Jan 3 11:39:24 picolo kernel: [ 41.167255] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15904, wptr 15920 > Jan 3 11:39:24 picolo kernel: [ 41.183933] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15920, wptr 15936 > Jan 3 11:39:24 picolo kernel: [ 41.200614] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15936, wptr 15952 > Jan 3 11:39:24 picolo kernel: [ 41.217295] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15952, wptr 15968 > Jan 3 11:39:24 picolo kernel: [ 41.233984] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15968, wptr 15984 > Jan 3 11:39:24 picolo kernel: [ 41.250663] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 15984, wptr 16000 > Jan 3 11:39:24 picolo kernel: [ 41.267347] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 16000, wptr 16016 > Jan 3 11:39:24 picolo kernel: [ 41.284027] [drm:amdgpu_ih_process > [amdgpu]] amdgpu_ih_process: rptr 16016, wptr 16032 > Jan 
3 11:39:24 picolo kernel: [ 41.300706] [drm:amdgpu_ih_process [amdgpu]] amdgpu_ih_process: rptr 16032, wptr 16048
> Jan 3 11:39:24 picolo kernel: [ 41.317388] [drm:amdgpu_ih_process [amdgpu]] amdgpu_ih_process: rptr 16048, wptr 16064
> Jan 3 11:39:24 picolo kernel: [ 41.334071] [drm:amdgpu_ih_process [amdgpu]] amdgpu_ih_process: rptr 16064, wptr 16080
> Jan 3 11:39:24 picolo kernel: [ 41.350752] [drm:amdgpu_ih_process [amdgpu]] amdgpu_ih_process: rptr 16080, wptr 16096
> Jan 3 11:39:24 picolo kernel: [ 41.367442] [drm:amdgpu_ih_process [amdgpu]] amdgpu_ih_process: rptr 16096, wptr 16112
> Jan 3 11:39:24 picolo kernel: [ 41.384122] [drm:amdgpu_ih_process [amdgpu]] amdgpu_ih_process: rptr 16112, wptr 16128
> Jan 3 11:39:24 picolo kernel: [ 41.400801] [drm:amdgpu_ih_process [amdgpu]] amdgpu_ih_process: rptr 16128, wptr 16144
> Jan 3 11:39:24 picolo kernel: [ 41.417480] [drm:amdgpu_ih_process [amdgpu]] amdgpu_ih_process: rptr 16144, wptr 16160
> Jan 3 11:39:24 picolo kernel: [ 41.432501] [drm:vblank_disable_fn [drm]] disabling vblank on crtc 0
> Jan 3 11:41:22 picolo kernel: [ 49.762715] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=2, emitted seq=3
> Jan 3 11:41:22 picolo kernel: [ 49.772047] [drm] GPU recovery disabled.
>
> Regards,
> Luís
>
> On Wed, Jan 2, 2019 at 12:05 PM Christian König
> <ckoenig.leichtzumerken@xxxxxxxxx> wrote:
> >
> > Hi Luis,
> >
> > mhm, sounds like a timing issue. We have probably made something faster
> > during bootup in 4.20 and because of this you now see this issue more often.
> >
> > If the bisection doesn't show any result, can you try adding some
> > msleep(10) calls at critical places in the driver code to narrow this down?
> >
> > Officially we don't test/support ARM with the driver code, but in this
> > particular case we should probably investigate since it sounds like it
> > just doesn't happen on x86 because of different timing.
> >
> > Thanks,
> > Christian.
> >
> > On 28.12.18 at 15:05, Luís Mendes wrote:
> > > Hi Alex,
> > >
> > > First of all... Happy holidays! Happy new year!!
> > >
> > > - Okay, so it looks like sometimes the driver is able to enter
> > > graphical mode with the Polaris card, but most of the time it fails
> > > before that with:
> > > [ 49.762704] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=2, emitted seq=3
> > >
> > > - This also happens sporadically, though less often, in the
> > > 4.17, 4.18 and 4.19 kernels, so it is actually not
> > > a regression but rather an existing issue, which maybe the patch
> > > "drm/amdgpu/gfx_v8_0: Reorder the gfx, kiq and kcq ring tests
> > > sequence" solves. I tried to backport it to 4.20, but saw no
> > > improvement. I need to try with the git version, or rc1.
> > >
> > > - This hang happens after the console is displayed on the screen, but
> > > before switching to graphical mode with X.
> > >
> > > - However, if X is entered then the driver is stable and can be used
> > > for long periods.
> > >
> > > Regards,
> > > Luís Mendes
> > >
> > > On Tue, Dec 18, 2018 at 11:16 PM Luís Mendes <luis.p.mendes@xxxxxxxxx> wrote:
> > >> Hi Alex,
> > >>
> > >> I already have drm_arch_can_wc_memory() returning false.
> > >> I will try to bisect...
> > >>
> > >> Regards,
> > >> Luís
> > >>
> > >> On Tue, Dec 18, 2018 at 7:03 PM Alex Deucher <alexdeucher@xxxxxxxxx> wrote:
> > >>> On Tue, Dec 18, 2018 at 8:58 AM Luís Mendes <luis.p.mendes@xxxxxxxxx> wrote:
> > >>>> Hi Christian,
> > >>>>
> > >>>> I've been using a Sapphire RX 550 and a Sapphire RX 460 on a custom
> > >>>> armhf board that runs well with Linux 4.19.9 at least, but now,
> > >>>> starting with Linux kernel 4.20, I'm getting a gpu hang right after
> > >>>> the console is displayed, but before entering graphical mode
> > >>>> when starting the X session.
> > >>>> I'm only reporting this now because there was a PCI commit for mvebu
> > >>>> that also went into linux-4.20 and caused a kernel oops during the
> > >>>> pci_map_rom call in the amdgpu initialization code. I've reverted that
> > >>>> patch, but now amdgpu is hanging.
> > >>> It would be useful if you could bisect. This is the first I've heard
> > >>> of amdgpu working on an ARM board without write combining (WC)
> > >>> disabled. You might check to see if disabling WC helps. Return false
> > >>> in drm_arch_can_wc_memory().
> > >>>
> > >>> Alex
> > >>>
> > >>>>
> > >>>> [ 24.801861] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=2, emitted seq=3
> > >>>>
> > >>>> 02:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Baffin [Polaris11] (rev ff) (prog-if 00 [VGA controller])
> > >>>>     Subsystem: Sapphire Technology Limited Baffin [Radeon RX 560]
> > >>>>     Flags: bus master, fast devsel, latency 0, IRQ 51
> > >>>>     Memory at d0000000 (64-bit, prefetchable) [size=256M]
> > >>>>     Memory at e0000000 (64-bit, prefetchable) [size=2M]
> > >>>>     I/O ports at 10000 [size=256]
> > >>>>     Memory at e0200000 (32-bit, non-prefetchable) [size=256K]
> > >>>>     Expansion ROM at e0240000 [disabled] [size=128K]
> > >>>>     Capabilities: <access denied>
> > >>>>     Kernel driver in use: amdgpu
> > >>>>     Kernel modules: amdgpu
> > >>>>
> > >>>> dmesg follows in attachment.
> > >>>>
> > >>>> Regards,
> > >>>> Luís
> > >>>> _______________________________________________
> > >>>> amd-gfx mailing list
> > >>>> amd-gfx@xxxxxxxxxxxxxxxxxxxxx
> > >>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
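
For anyone who wants to compare the good and bad excerpts quoted above without counting by hand, here is a minimal Python sketch. It is not from the thread: the names `ENTRY`, `SAMPLE` and `summarize` are made up for illustration, and the sample lines are copied from the bad excerpt with the mail client's line wraps rejoined. It counts drm debug entries per handler (for example gfx_v8_0_eop_irq for "IH: CP EOP") and lists the time gaps between consecutive amdgpu_ih_process entries.

```python
import re
from collections import Counter

# Kernel timestamp followed by the drm debug handler name, e.g.
# "[ 39.603054] [drm:gfx_v8_0_eop_irq [amdgpu]] IH: CP EOP".
# The optional "> " tolerates a quote marker left over from mail wrapping.
ENTRY = re.compile(r"\[\s*(\d+\.\d+)\]\s+(?:>\s+)?\[drm:(\w+)")

# A few lines from the bad boot attempt (line wraps rejoined by hand).
SAMPLE = """\
Jan 3 11:39:23 picolo kernel: [ 39.602887] [drm:amdgpu_ih_process [amdgpu]] amdgpu_ih_process: rptr 14400, wptr 14416
Jan 3 11:39:23 picolo kernel: [ 39.603054] [drm:gfx_v8_0_eop_irq [amdgpu]] IH: CP EOP
Jan 3 11:39:23 picolo kernel: [ 39.615864] [drm:amdgpu_ih_process [amdgpu]] amdgpu_ih_process: rptr 14416, wptr 14432
Jan 3 11:39:23 picolo kernel: [ 39.632542] [drm:amdgpu_ih_process [amdgpu]] amdgpu_ih_process: rptr 14432, wptr 14448
Jan 3 11:39:23 picolo kernel: [ 39.649264] [drm:amdgpu_ih_process [amdgpu]] amdgpu_ih_process: rptr 14448, wptr 14464
"""


def summarize(log_text):
    """Return (entries per handler, gaps in ms between amdgpu_ih_process entries)."""
    counts = Counter()
    ih_times = []
    for match in ENTRY.finditer(log_text):
        timestamp = float(match.group(1))
        handler = match.group(2)
        counts[handler] += 1
        if handler == "amdgpu_ih_process":
            ih_times.append(timestamp)
    gaps_ms = [round((later - earlier) * 1000.0, 3)
               for earlier, later in zip(ih_times, ih_times[1:])]
    return counts, gaps_ms


if __name__ == "__main__":
    counts, gaps_ms = summarize(SAMPLE)
    for handler, n in sorted(counts.items()):
        print(handler, n)
    print("gaps between amdgpu_ih_process entries (ms):", gaps_ms)
```

To run it on the attached logs, read kern_good.log and kern_bad.log into strings and pass each to `summarize`. In the bad excerpt, the amdgpu_ih_process entries after the single CP EOP are spaced roughly 16.7 ms apart and stop once vblank is disabled on crtc 0; that spacing would be consistent with a periodic source of about 60 Hz such as vblank interrupts, but that reading is only an assumption and has not been confirmed in the thread.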