On Fri, Apr 29, 2016 at 01:25:30PM -0400, John S Gruber wrote:
> Starting with linux 4.6.0-rc3 my Ubuntu Wily system no longer allows logons from
> due to an immediate abort in xserver after just after entering my
> userid and password. (lightdm drew the sign on screen OK).
>
> The xserver problem seems to result from a null reference from
> __kgem_retire_rq from package xserver-xorg-video-intel version
> 2:2.99.917+git20150808-0ubuntu4.
>
> Bisecting the kernel I found that this was triggered by commit
> 426960bed3217f72a1b7bb94f084d79cc616ec0f. Reverting this commit based on
> 4.6-rc5 eliminated my crash.
>
> The problem was specific to my HP Pavilion laptop with Intel HD 5500
> integrated graphics . A desktop Acer, also using Intel graphics, was
> fine. On the laptop it was completely consistent.
>
> The laptop has:
>
> 00:02.0 VGA compatible controller: Intel Corporation Broadwell-U
> Integrated Graphics (rev 09) (prog-if 00 [VGA controller])
> DeviceName: Intel(R) Graphics GT2
>
> Testing the laptop with Ubuntu xenial (with xserver-xorg-video-intel
> version 2:2.99.917+git20160325-1ubuntu1) was fine, however.
>
> Please let me know if this is problematic, and if so, if I should provide
> additional information. I don't follow the list.
>
> ----------------------
>
> The triggering commit:
>
> drm/i915: Seal busy-ioctl uABI and prevent leaking of internal ids

The seeds of that crash were already sown. The error is that on a batch
buffer allocation failure, the preallocated failsafe ended up on the
request list (which is not supposed to happen and so it runs off the end
of the list).

commit 69d8edc11173df021aa2e158b2530257113141fd
Author: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Date:   Fri Aug 7 10:08:17 2015 +0100

    sna: Handle batch allocation failure

    Whilst we currently do not try and submit a failed batch buffer
    allocation, we still treat it as a valid request. This explodes much
    later when we inspect the NULL rq->bo.

    References: https://bugs.freedesktop.org/show_bug.cgi?id=91577
    Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>

is the cause of the crash, but

commit 2d26643cab33a32847afaf13b50d326d09d58bf7
Author: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Date:   Fri Nov 13 19:03:36 2015 +0000

    sna/dri2: Drop the reference on the fence when complete

    Fixes regression from

    commit 8d9e496670f48b4eec64dfe1bcedb49793cf3073
    Author: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
    Date:   Wed Jul 22 11:14:01 2015 +0100

        sna/dri2: Take over the placeholder vblank

    After noting the fence was complete, we would clear it. But I forgot
    that we actually held a reference on to it, and so we would leak the
    64k batch, and starve the system of available memory in about 18
    minutes of SwapBuffers.

    Reported-by: Arkadiusz Miskiewicz <arekm@xxxxxxxx>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92911
    Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>

is where the bug began. The kernel just made it easier to hit the
pre-existing bugs in userspace.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx