On Tue, May 21, 2013 at 1:28 PM, Ben Guthro <ben at guthro.net> wrote:
> On Tue, May 21, 2013 at 10:02 AM, Daniel Vetter <daniel at ffwll.ch> wrote:
>> On Tue, May 21, 2013 at 3:44 PM, Ben Guthro <ben at guthro.net> wrote:
>>>> This will break kms since now you have the vbios and the linux kms driver
>>>> fighting over the same piece of hw. Does
>>>>
>>>> xset dpms force off
>>>> xset dpms force on
>>>>
>>>> cause similar issues?
>>>
>>> No, these work as expected (on 3.8)
>>> I didn't realize that these broke with KMS. I'll stick with the S3 reproduction.
>>
>> Ok, so things are at least not terribly broken.
>>
>>>> If not please make sure that vbetool isn't badly interfering with the
>>>> kernel modeset driver on suspend/resume. At least looking at your dmesg
>>>> and reg dumps vbe wreaking havoc with the kms driver seems like a rather
>>>> likely scenario. Also, can you please test latest 3.10-rc kernels?
>>>
>>> 3.10-rc2 doesn't seem to work at all - it boots to a black screen every time.
>>
>> That otoh is ugly. Could be that though that this is the same (or a
>> similar bug) to your resume issue - in the last few kernel releases
>> we've tried very hard to unify the code between initial driver load at
>> boot-up and resume.
>
> Perhaps I should qualify "at all"
>
> It seems that it fails somewhat late in the boot process. If I remove
> the "boot splash" cli params, I can see it transition into the high
> res mode, and seemingly get into init.
> However, even if I boot to single user mode, the screen goes black.
>
> Unfortunately, both times I tried to test this, and then reboot, I
> ended up at a "grub rescue" prompt, with an unusable system.
>
>>
>> So can you please try to bisect where the boot-up regression has been
>> introduced between 3.8 and 3.10-rc2?
>
> I'm not sure I'll be able to do this.
> With the failure condition I describe above, I am unable to even ssh
> into this machine to debug, nevermind install a new kernel.
> This means I need to generate a new kernel, and install kit with that
> kernel for every bisection test.
>
> This may be more time than I am able to dedicate to this problem - but I'll try.
>
> Ben

It appears I did not CC the list on my last 2 replies.
My apologies - I'll re-paste them below.

I tried to bisect this, but was unsuccessful, in that I didn't seem to
have a reproducible test case to get back into this failure condition.
It seemed that it always would succeed for me...which of course makes
bisecting near impossible.

I tried updating to 3.10-rc3...well, actually to this changeset at the
tip of Linus' tree:
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=58f8bbd2e39c3732c55698494338ee19a92c53a0

I can get X to come up now on this machine - albeit very slowly.
Once it comes up, it seems to hang, and respawn.

I get a lot of these in the log now, as well:
[ 392.195734] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung

Things in the log that look suspicious to me are:
[ 34.293452] [drm:intel_pipe_set_base] *ERROR* pin & fence failed
[ 34.293486] [drm:intel_crtc_set_config] *ERROR* failed to set mode on [CRTC:3], err = -28

I get the following errors in the X log, that prevent it from coming up:
[ 76.142] (EE) intel(0): failed to set mode: No space left on device
[ 76.142] Fatal server error:
[ 76.142] AddScreen/ScreenInit failed for driver 0
[ 76.142]
[ 76.142] (EE)

Xorg also crashes in the following manner:
[ 218.876] (EE) Backtrace:
[ 218.880] (EE) 0: X (xorg_backtrace+0x34) [0x7fe44fff9754]
[ 218.880] (EE) 1: X (0x7fe44fe44000+0x1b96a9) [0x7fe44fffd6a9]
[ 218.880] (EE) 2: /lib/x86_64-linux-gnu/libpthread.so.0 (0x7fe44f16a000+0xfcb0) [0x7fe44f179cb0]
[ 218.880] (EE) 3: /lib/x86_64-linux-gnu/libc.so.6 (0x7fe44ddcf000+0x148c6b) [0x7fe44df17c6b]
[ 218.880] (EE) 4: /usr/lib/xorg/modules/drivers/intel_drv.so (0x7fe44cb5a000+0x17c36) [0x7fe44cb71c36]
[ 218.880] (EE) 5: /usr/lib/xorg/modules/drivers/intel_drv.so
(0x7fe44cb5a000+0x19857) [0x7fe44cb73857]
[ 218.880] (EE) 6: /usr/lib/xorg/modules/drivers/intel_drv.so (0x7fe44cb5a000+0xed429) [0x7fe44cc47429]
[ 218.880] (EE) 7: X (0x7fe44fe44000+0x13e8ac) [0x7fe44ff828ac]
[ 218.880] (EE) 8: X (0x7fe44fe44000+0x5239e) [0x7fe44fe9639e]
[ 218.880] (EE) 9: X (0x7fe44fe44000+0x557a1) [0x7fe44fe997a1]
[ 218.880] (EE) 10: X (0x7fe44fe44000+0x4415a) [0x7fe44fe8815a]
[ 218.880] (EE) 11: /lib/x86_64-linux-gnu/libc.so.6 (__libc_start_main+0xed) [0x7fe44ddf076d]
[ 218.880] (EE) 12: X (0x7fe44fe44000+0x444b1) [0x7fe44fe884b1]
[ 218.880] (EE)
[ 218.880] (EE) Bus error at address 0x7fe44a6c9080
[ 218.880] Fatal server error:
[ 218.881] Caught signal 7 (Bus error). Server aborting
[ 218.881]
[ 218.881] (EE)

I recognize that this isn't terribly helpful without the symbol resolution.
I tried installing debug symbols, but they didn't seem to help.

Ben
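
Regarding the unresolved frames in the backtrace: each one is printed as
"module (base+offset) [address]", so the module-relative offset is already
in the log and can be handed to addr2line by hand. The following is only a
minimal sketch of that idea, not something taken from the report. The helper
names parse_frames and addr2line_commands are hypothetical, and it assumes
the frame layout shown in the X log above. Frames that name the binary only
as "X" would need the real server path substituted, and whether addr2line
finds the installed debug symbols depends on the distribution's packaging.

```python
import re

# Matches frames of the form seen in the X log, e.g.
#   "4: /usr/lib/xorg/modules/drivers/intel_drv.so (0x7fe44cb5a000+0x17c36) [0x7fe44cb71c36]"
# Frames already resolved to a symbol, such as "(xorg_backtrace+0x34)", do not match.
FRAME_RE = re.compile(
    r"(?P<index>\d+):\s+(?P<module>\S+)\s+"
    r"\((?P<base>0x[0-9a-fA-F]+)\+(?P<offset>0x[0-9a-fA-F]+)\)\s+"
    r"\[(?P<addr>0x[0-9a-fA-F]+)\]"
)


def parse_frames(log_text):
    """Return (frame index, module, offset) for every unresolved frame."""
    frames = []
    for match in FRAME_RE.finditer(log_text):
        base = int(match.group("base"), 16)
        offset = int(match.group("offset"), 16)
        addr = int(match.group("addr"), 16)
        # Sanity check: skip anything where base + offset != printed address.
        if base + offset != addr:
            continue
        frames.append((int(match.group("index")), match.group("module"), offset))
    return frames


def addr2line_commands(frames):
    """Build one addr2line invocation per frame (hypothetical helper).

    -f prints the function name, -C demangles, -e selects the object file.
    """
    return [
        "addr2line -f -C -e {} {:#x}".format(module, offset)
        for _, module, offset in frames
    ]


if __name__ == "__main__":
    sample = (
        "[ 218.880] (EE) 4: /usr/lib/xorg/modules/drivers/intel_drv.so "
        "(0x7fe44cb5a000+0x17c36) [0x7fe44cb71c36]"
    )
    for command in addr2line_commands(parse_frames(sample)):
        print(command)
```

Feeding the whole Xorg.0.log to parse_frames should also cope with frame 5,
since the pattern tolerates a line break between the module path and the
address pair.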