Hi Guys,

Thanks for the help in advance!

I'm encountering a GPU hang while running multi-channel H264 video decoding + VPP composition and display, plus one channel of H264 encoding, on BSW. The render ring gets stuck, as shown below:

[58503.223700] [drm] stuck on render ring
[58503.246340] [drm] GPU HANG: ecode 8:0:0x7f1d7e3d, in Challenge [3259], reason: Ring hung, action: reset

Part of /sys/class/drm/card0/error is included below. I suspect the hang is caused by incorrect render ring buffer content. In the lines marked 'where I suspect', the first DWORD of the ring buffer is 18800001 (MI_BATCH_BUFFER_START_GEN8), but the next DWORD is 00100002. Since MI_BATCH_BUFFER_START_GEN8 should be followed by the batch buffer address, I think the content of the ring buffer is not correct.

==========part of the /sys/class/drm/card0/error=========
render ring --- 3 requests
  seqno 0x020dc83a, emitted 4353167966, tail 0x00000070
  seqno 0x020dc83b, emitted 4353167969, tail 0x000000f0
  seqno 0x020dc83e, emitted 4353167982, tail 0x00000170
render ring --- ringbuffer = 0x00015000
00000000 :  18800001   // where I suspect
00000004 :  00100002   // where I suspect
00000008 :  00000000
0000000c :  00000000
00000010 :  00000000
00000014 :  00000000
00000018 :  7a000004
0000001c :  01144c1c
00000020 :  00036080
00000024 :  00000000
00000028 :  00000000
0000002c :  00000000
00000030 :  04000000
00000034 :  00000000
00000038 :  0c000000
0000003c :  1382c10c
==========part of the /sys/class/drm/card0/error=========

To identify when the ring buffer is programmed incorrectly, I added some code to gen8_emit_pipe_control that reads the first DWORD of the ring buffer back after intel_ring_emit whenever the ring buffer tail is zero. The result: when tail is 0, the first DWORD read back is sometimes different from the data intel_ring_emit has just written, and a GPU hang may follow right after that.
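For reference, the two header values that matter here (18800001 and 7a000004) can be decoded with a small user-space helper. This is only a sketch and not i915 code: the function names are hypothetical, and the bit layout is assumed from the i915 macros MI_INSTR(opcode, flags) and GFX_OP_PIPE_CONTROL(len).

```c
#include <assert.h>   /* only needed by callers that assert on the results */
#include <stdbool.h>
#include <stdint.h>

/* Bits 31:29 of a command header select the command type (0 = MI). */
static inline uint32_t cmd_type(uint32_t header)
{
    return header >> 29;
}

/* For MI commands, bits 28:23 hold the opcode (assumed from MI_INSTR). */
static inline uint32_t mi_opcode(uint32_t header)
{
    return (header >> 23) & 0x3fu;
}

/* MI_BATCH_BUFFER_START is MI opcode 0x31. */
static inline bool is_mi_batch_buffer_start(uint32_t header)
{
    return cmd_type(header) == 0 && mi_opcode(header) == 0x31u;
}

/*
 * PIPE_CONTROL header without its length field, i.e.
 * (0x3 << 29) | (0x3 << 27) | (0x2 << 24), as in GFX_OP_PIPE_CONTROL(len).
 */
static inline bool is_pipe_control(uint32_t header)
{
    return (header & 0xffff0000u) == 0x7a000000u;
}

/*
 * Total command size in DWORDs: length field (bits 7:0) + 2.
 * Only meaningful for the two commands discussed here; some MI
 * commands (MI_NOOP, for example) have no length field.
 */
static inline uint32_t cmd_total_dwords(uint32_t header)
{
    return (header & 0xffu) + 2u;
}
```

With these helpers, is_mi_batch_buffer_start(0x18800001) is true and cmd_total_dwords(0x18800001) is 3, so the two DWORDs after the header should hold the 64-bit batch address. For 0x7a000004, is_pipe_control() is true and the command is 6 DWORDs long.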
Here is the output of my print:

[ 3409.067402] rcs b:0x18800001 d:0x7a000004 t:0

  'b' - ioread32 (ringbuf->virtual_start)
  'd' - the value intel_ring_emit wants to write
  't' - the value of tail

I'm aware that ringbuf->virtual_start is mapped write-combining (WC), so the read may force a write-combining buffer flush and be slow. But I don't know why the value read back is different from the value intel_ring_emit has just written.

I also have another question: after the CPU writes to the WC ring buffer, how is the WC buffer flushed before the GPU starts to read the ring buffer?

Thanks a lot!
-James
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx