On Tue, Oct 28, 2014 at 05:06:01PM +0200, Jani Nikula wrote:
> On Tue, 28 Oct 2014, Johan Hovold <johan@xxxxxxxxxx> wrote:
> > Hi,
> >
> > I have had some problems with crashes involving suspend-to-disk after
> > updating to v3.16.
> >
> > Below is a log with 3.16.6 from a failed suspend attempt after which I
> > get a NULL deref in ext4 code.
> >
> > A couple of weeks ago I got something similar, with backtraces from
> > ext4 (ext4_alloc_inode) and NULL-derefs in vfs (vfs_get_attr_nosec) when
> > trying to do IO after resuming from suspend. That was with 3.16.3 and I
> > was hoping that whatever it was would have been fixed in 3.16.6 (there
> > were some ext4 error handling patches in there). I only got photos of
> > those oopses but it involved kmem_cache_alloc (slub) and a NULL-deref in
> > vfs_get_attr_nosec. I can put the photos up somewhere. That time I also
> > got back to X and could issue a dmesg in an xterm, but any process trying
> > to do IO died.
> >
> > Something similar happened with 3.16.1 but unfortunately I do not have
> > any logs from that.
> >
> > I also have experienced occasional hangs during suspend, but I believe I
> > have seen this with older kernels as well so not sure if related. Seems
> > to be more frequent with 3.16.
> >
> > This is my main machine so not keen on trying to bisect this on it.
> >
> > It's an i7-4770 on an Intel DH87MC using the integrated HD Graphics 4600.
> >
> > I'm CCing the Intel graphics guys due to some drm errors in the
> > logs, and reports of other people having problems involving suspend and
> > this driver.
>
> My first suggestion would be to try to reproduce the NULL deref without
> i915 loaded, and track the issues you have independently.

I actually don't think this is i915 related; the new drm errors after
failed suspend could possibly just be a side effect of whatever is
causing the apparent memory corruption.
As I mentioned, the first log I have of this does not seem to point at
i915 (even if backlight-restore happens when tasks are restarted).

> Please file any i915 issues against DRM/Intel at [1].

I'll see if I can get around to that. There are bug reports in various
distro trackers about the intel_ddi_pll_enable warning dating back to
April. It's there on every resume. For instance this morning:

[108109.324398] WARNING: CPU: 1 PID: 7298 at /home/johan/src/linux/linux-xi/drivers/gpu/drm/i915/intel_ddi.c:911 intel_ddi_pll_enable+0x233/0x240()
[108109.324398] WRPLL1 already enabled
[108109.324399] Modules linked in:
[108109.324400] CPU: 1 PID: 7298 Comm: kworker/u16:8 Tainted: G W 3.16.6 #1
[108109.324401] Hardware name: /DH87MC, BIOS MCH8710H.86A.0154.2014.0123.1542 01/23/2014
[108109.324403] Workqueue: events_unbound async_run_entry_fn
[108109.324405] 0000000000000000 0000000000000009 ffffffff81739c03 ffff88053e89baf8
[108109.324405] ffffffff810850f6 ffff8807fadf0000 00000000b035061f 0000000000000001
[108109.324406] 0000000000046040 ffffffff81a10a41 ffffffff810851d5 ffffffff81a10a83
[108109.324407] Call Trace:
[108109.324410] [<ffffffff81739c03>] ? dump_stack+0x49/0x6a
[108109.324412] [<ffffffff810850f6>] ? warn_slowpath_common+0x86/0xb0
[108109.324414] [<ffffffff810851d5>] ? warn_slowpath_fmt+0x45/0x50
[108109.324415] [<ffffffff814445c3>] ? intel_ddi_pll_enable+0x233/0x240
[108109.324417] [<ffffffff814208ea>] ? haswell_crtc_mode_set+0x1a/0x30
[108109.324419] [<ffffffff8142e168>] ? __intel_set_mode+0x6a8/0x1590
[108109.324420] [<ffffffff814335f7>] ? intel_modeset_setup_hw_state+0x817/0xd10
[108109.324422] [<ffffffff813d4ae9>] ? drm_modeset_lock_all_crtcs+0x39/0x50
[108109.324424] [<ffffffff81328570>] ? pci_pm_suspend_noirq+0x1b0/0x1b0
[108109.324426] [<ffffffff813d719e>] ? __i915_drm_thaw+0x11e/0x1a0
[108109.324426] [<ffffffff813d786f>] ? i915_resume+0x1f/0x40
[108109.324428] [<ffffffff814749ef>] ? dpm_run_callback+0x4f/0x150
[108109.324428] [<ffffffff814756b3>] ? device_resume+0x93/0x1d0
[108109.324429] [<ffffffff81475804>] ? async_resume+0x14/0x40
[108109.324430] [<ffffffff810aaabd>] ? async_run_entry_fn+0x2d/0x120
[108109.324433] [<ffffffff8109eb58>] ? process_one_work+0x158/0x410
[108109.324434] [<ffffffff8109f376>] ? worker_thread+0x116/0x510
[108109.324435] [<ffffffff810c11ec>] ? __wake_up_common+0x4c/0x80
[108109.324436] [<ffffffff8109f260>] ? init_pwq+0x160/0x160
[108109.324437] [<ffffffff810a538c>] ? kthread+0xbc/0xe0
[108109.324439] [<ffffffff810a0000>] ? workqueue_sysfs_register+0x110/0x150
[108109.324440] [<ffffffff810a52d0>] ? kthread_freezable_should_stop+0x60/0x60
[108109.324442] [<ffffffff81741aac>] ? ret_from_fork+0x7c/0xb0
[108109.324443] [<ffffffff810a52d0>] ? kthread_freezable_should_stop+0x60/0x60

Thanks,
Johan