During execbuffer emission we assert that we do not wrap around the
seqno used for semaphore breadcrumbs. However:

:kernel BUG at drivers/gpu/drm/i915/i915_gem_execbuffer.c:1239!
:invalid opcode: 0000 [#1] SMP
:CPU 0
:Modules linked in: usb_storage usblp bnep lockd sunrpc bluetooth rfkill vboxpci(O) vboxnetadp(O) vboxnetflt(O) vboxdrv(O) snd_hda_codec_hdmi snd_hda_codec_realtek coretemp mei lpc_ich snd_hda_intel mfd_core i2c_i801 kvm snd_hda_codec snd_hwdep e1000e microcode snd_pcm snd_page_alloc snd_timer snd soundcore serio_raw uinput binfmt_misc crc32c_intel ghash_clmulni_intel wmi i915 video i2c_algo_bit drm_kms_helper drm i2c_core [last unloaded: scsi_wait_scan]
:Pid: 962, comm: X Tainted: G C O 3.5.4-1.fc17.x86_64 #1 LENOVO 5032AJ3/
:RIP: 0010:[<ffffffffa008842c>] [<ffffffffa008842c>] i915_gem_do_execbuffer.isra.10+0xcbc/0x1390 [i915]
:RSP: 0018:ffff8801214a3c08 EFLAGS: 00010286
:RAX: 0000000000000000 RBX: 0000000000000000 RCX: dead000000200200
:RDX: ffff880133561988 RSI: 00000000fffffe0c RDI: ffff88000248f8b0
:RBP: ffff8801214a3d28 R08: ffff88000248f8b0 R09: 0000000180400032
:R10: 000000003288eb01 R11: ffff8801214a3fd8 R12: ffff8801335618b0
:R13: ffff880133ebd800 R14: 0000000000000001 R15: ffff8801214a3bf8
:FS: 00007f5f0647a8c0(0000) GS:ffff88013e200000(0000) knlGS:0000000000000000
:CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
:CR2: 0000000003481000 CR3: 0000000134170000 CR4: 00000000000407f0
:DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
:DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
:Process X (pid: 962, threadinfo ffff8801214a2000, task ffff880123afae20)
:Stack:
: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
: ffff880123851cc0 0000000000000001 ffff8801214a3c88 ffffffffa0082eb4
: ffff880100000002 000000c033560000 ffff8801214a3c88 ffff880123851ce0
:Call Trace:
: [<ffffffffa0082eb4>] ? i915_gem_object_set_to_gtt_domain+0x104/0x1a0 [i915]
: [<ffffffffa0089031>] i915_gem_execbuffer2+0xb1/0x290 [i915]
: [<ffffffffa00154f3>] drm_ioctl+0x4d3/0x580 [drm]
: [<ffffffffa0088f80>] ? i915_gem_execbuffer+0x480/0x480 [i915]
: [<ffffffff81188d5e>] ? do_readv_writev+0x18e/0x1e0
: [<ffffffff81199919>] do_vfs_ioctl+0x99/0x580
: [<ffffffff8127973a>] ? inode_has_perm.isra.31.constprop.61+0x2a/0x30
: [<ffffffff8127ad17>] ? file_has_perm+0x97/0xb0
: [<ffffffff81199e99>] sys_ioctl+0x99/0xa0
: [<ffffffff81614e29>] system_call_fastpath+0x16/0x1b
:Code: 59 fd ff ff 4c 89 ef e8 23 a4 ff ff 85 c0 90 0f 85 ad fc ff ff 4c 89 ef e8 82 9d ff ff 45 8b 4c 24 6c 45 85 c9 0f 84 02 fd ff ff <0f> 0b 4c 89 ef e8 fa a3 ff ff 85 c0 0f 85 85 fc ff ff 4c 89 ef
:RIP [<ffffffffa008842c>] i915_gem_do_execbuffer.isra.10+0xcbc/0x1390 [i915]
: RSP <ffff8801214a3c08>

clearly shows us hitting this supposedly impossible wraparound. The
cause here is that after idling, retire-requests only resets the
breadcrumbs if there was a request on the ring. To avoid this after
idling, we can simply clear the breadcrumbs.

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=863861
Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
Cc: stable at vger.kernel.org
---
 drivers/gpu/drm/i915/i915_gem.c |   11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 288d7b8..95c0cd0 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2163,6 +2163,15 @@ static int i915_ring_idle(struct intel_ring_buffer *ring)
 	return i915_wait_request(ring, i915_gem_next_request_seqno(ring));
 }
 
+static void i915_ring_reset_seqno(struct intel_ring_buffer *ring)
+{
+	int i;
+
+	/* The ring is idle, so clear every breadcrumb unconditionally. */
+	for (i = 0; i < ARRAY_SIZE(ring->sync_seqno); i++)
+		ring->sync_seqno[i] = 0;
+}
+
 int i915_gpu_idle(struct drm_device *dev)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
@@ -2178,6 +2187,8 @@ int i915_gpu_idle(struct drm_device *dev)
 		/* Is the device fubar? */
 		if (WARN_ON(!list_empty(&ring->gpu_write_list)))
 			return -EBUSY;
+
+		i915_ring_reset_seqno(ring);
 	}
 
 	return 0;
-- 
1.7.10.4
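
For illustration only, the failure mode described in the commit message can be sketched as a small user-space model. Everything here is hypothetical and is not driver code: `struct model_ring`, `MODEL_NUM_SYNC`, the `has_requests` flag and all `model_*` helpers are invented for the sketch. It assumes that the retire path returns early when the ring has no requests, and that the execbuffer assertion amounts to "all breadcrumbs are zero after idling".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_NUM_SYNC 2 /* hypothetical array size, chosen for the sketch */
#define MODEL_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical stand-in for the per-ring state relevant to this bug. */
struct model_ring {
	uint32_t sync_seqno[MODEL_NUM_SYNC]; /* semaphore breadcrumbs */
	bool has_requests;                   /* models a non-empty request list */
};

/*
 * Models the behaviour described in the commit message: breadcrumbs are
 * only reset when there was a request on the ring to retire.
 */
static void model_retire_requests(struct model_ring *ring, uint32_t seqno)
{
	size_t i;

	if (!ring->has_requests)
		return; /* nothing to retire, so stale breadcrumbs survive */

	for (i = 0; i < MODEL_ARRAY_SIZE(ring->sync_seqno); i++)
		if (seqno >= ring->sync_seqno[i])
			ring->sync_seqno[i] = 0;

	ring->has_requests = false;
}

/* Models the fix: once the ring is idle, clear every breadcrumb. */
static void model_reset_sync_seqno(struct model_ring *ring)
{
	memset(ring->sync_seqno, 0, sizeof(ring->sync_seqno));
}

/* Assumed shape of the assertion: no breadcrumb may remain set. */
static bool model_breadcrumbs_clear(const struct model_ring *ring)
{
	size_t i;

	for (i = 0; i < MODEL_ARRAY_SIZE(ring->sync_seqno); i++)
		if (ring->sync_seqno[i] != 0)
			return false;
	return true;
}

/* Walks through the scenario; the value 100 is arbitrary. */
static void model_demo(void)
{
	struct model_ring ring = { { 100, 0 }, false };

	/* Idle ring without requests: retiring leaves the stale value. */
	model_retire_requests(&ring, 1);
	assert(!model_breadcrumbs_clear(&ring));

	/* Clearing after idle restores the state the assertion expects. */
	model_reset_sync_seqno(&ring);
	assert(model_breadcrumbs_clear(&ring));
}
```

Calling `model_demo()` from a `main()` exercises both halves: the stale breadcrumb survives a retire on an empty ring, and the unconditional reset removes it.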