Regression in i915 for 4.11-rc1 - bisected to commit 69df05e11ab8

Since kernel 4.11-rc1, my desktop (Plasma5/KDE) has encountered intermittent hangs with the following information in the logs:

linux-4v1g.suse kernel: [drm] GPU HANG: ecode 7:0:0xf3cffffe, in plasmashell [1283], reason: Hang on render ring, action: reset
linux-4v1g.suse kernel: [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
linux-4v1g.suse kernel: [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
linux-4v1g.suse kernel: [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
linux-4v1g.suse kernel: [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
linux-4v1g.suse kernel: [drm] GPU crash dump saved to /sys/class/drm/card0/error
linux-4v1g.suse kernel: drm/i915: Resetting chip after gpu hang
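
For anyone collecting several of these hangs to compare, the "GPU HANG" line above can be split into its fields with a short script. This is only a sketch: the names HANG_RE and parse_gpu_hang are made up for illustration, and the pattern assumes the line has exactly the layout shown in the log above, which other kernel versions may not follow.

```python
import re

# Pattern for the "[drm] GPU HANG" line as it appears in the log above.
# Assumption: the field order and separators match that line exactly.
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<ecode>\S+), in (?P<comm>\S+) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>.+?), action: (?P<action>\w+)"
)


def parse_gpu_hang(line):
    """Return the fields of a GPU HANG log line as a dict, or None if no match."""
    match = HANG_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["pid"] = int(fields["pid"])  # the number in brackets after the process name
    return fields


# The hang line quoted in this report.
sample = (
    "linux-4v1g.suse kernel: [drm] GPU HANG: ecode 7:0:0xf3cffffe, "
    "in plasmashell [1283], reason: Hang on render ring, action: reset"
)

if __name__ == "__main__":
    print(parse_gpu_hang(sample))
```

Lines that are not hang reports, such as the "Resetting chip" message, return None, so the function can be run over a whole log file.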

I added this problem to https://bugs.freedesktop.org/show_bug.cgi?id=99380, but it is probably a different bug: the OP in that report has problems with kernel 4.10.x, whereas my problem did not appear until 4.11.

I bisected the problem to commit 69df05e11ab8 ("drm/i915: Simplify releasing context reference"). To check the bisection, I reverted that patch in kernel 4.11-rc3. With the revert applied, my kernel has now run for over 17 hours with no problem. Before the revert, no affected kernel ran longer than ~3 hours before a GPU hang was detected.

I admit that I do not understand this driver, but my guess is that this commit introduced a race condition in the context put.

Thanks,
Larry

_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx



