On Mon 2017-01-23 10:39:27, Juergen Gross wrote:
> On 13/01/17 15:41, Juergen Gross wrote:
> > On 12/01/17 10:21, Chris Wilson wrote:
> >> On Thu, Jan 12, 2017 at 07:03:25AM +0100, Juergen Gross wrote:
> >>> On 11/01/17 18:08, Chris Wilson wrote:
> >>>> On Wed, Jan 11, 2017 at 05:33:34PM +0100, Juergen Gross wrote:
> >>>>> With kernel 4.10rc3 running as Xen dom0 I get at each boot:
> >>>>>
> >>>>> [ 49.213697] [drm] GPU HANG: ecode 7:0:0x3d1d3d3d, in gnome-shell
> >>>>> [1431], reason: Hang on render ring, action: reset
> >>>>> [ 49.213699] [drm] GPU hangs can indicate a bug anywhere in the entire
> >>>>> gfx stack, including userspace.
> >>>>> [ 49.213700] [drm] Please file a _new_ bug report on
> >>>>> bugs.freedesktop.org against DRI -> DRM/Intel
> >>>>> [ 49.213700] [drm] drm/i915 developers can then reassign to the right
> >>>>> component if it's not a kernel issue.
> >>>>> [ 49.213700] [drm] The gpu crash dump is required to analyze gpu
> >>>>> hangs, so please always attach it.
> >>>>> [ 49.213701] [drm] GPU crash dump saved to /sys/class/drm/card0/error
> >>>>> [ 49.213755] drm/i915: Resetting chip after gpu hang
> >>>>> [ 60.213769] drm/i915: Resetting chip after gpu hang
> >>>>> [ 71.189737] drm/i915: Resetting chip after gpu hang
> >>>>> [ 82.165747] drm/i915: Resetting chip after gpu hang
> >>>>> [ 93.205727] drm/i915: Resetting chip after gpu hang
> >>>>>
> >>>>> The dump is attached.
> >>>>
> >>>> That's a nasty one. The first couple of pages of the batchbuffer appear
> >>>> to be overwritten. (Full of 0xc2c2c2c2, i.e. probably pixel data.) That
> >>>> may be a concurrent write by either the GPU or CPU, or we may have
> >>>> incorrectly mapped a set of pages. That it doesn't recover suggests
> >>>> that the corruption occurs frequently, probably on every request/batch.
> >>>
> >>> I hoped someone would have an idea already.
> >>
> >> Sorry, first report of something like this in a long time (that I can
> >> remember at least). And the problem is that it can be anything from a
> >> coherency to a concurrency issue, so no one patch springs to mind.
> >> Thankfully it appears to be kernel related.
> >> -Chris
> >>
> >
> > Bisecting took longer than I thought, but I had to cherry pick some
> > patches and rebase one of them multiple times...
> >
> > Finally I found the commit to blame: 920cf4194954ec ("drm/i915:
> > Introduce an internal allocator for disposable private objects")
> >
> > In case you need me to produce some more data or test a patch
> > feel free to reach out.
>
> Anything new for this severe regression?
>
> Without a fix 4.10 will be unusable with Xen on a machine with i915
> graphics!

Did this get solved?

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html