On Fri 2009-12-11 10:53:52, Mel Gorman wrote:
> On Tue, Dec 08, 2009 at 12:37:36AM +0000, Alan Jenkins wrote:
> > >> <SNIP>
> > >> Here's a new datum:
> > >>
> > >> Applying this patch has left a less frequent hang. So far it has
> > >> happened twice. (Once playing last night, and once today testing
> > >> hibernation with KMS enabled).
> > >>
> > >> This hang happens at a different point. It happens _before_ writing out
> > >> the hibernation image. That is, I don't see the textual progress bar,
> > >> and if I force a power-cycle then it doesn't resume (and complains about
> > >> uncleanly unmounted filesystems).
> > >>
> > >> Here is the backtrace:
> > >>
> > >> [top of screen]
> > >> s2disk D c1c05580 0 5988 5809 0x00000000
> > >> ...
> > >> Call Trace:
> > >> ...
> > >> ? wait_for_common
> > >> ? default_wake_function
> > >> ? kthread_create
> > >> ? worker_thread
> > >> ? create_workqueue_thread
> > >> ? worker_thread
> > >> ? __create_workqueue_thread
> > >> ? stop_machine_create
> > >> ? disable_nonboot_cpus
> > >> ? hibernation_snapshot
> > >> ? snapshot_ioctl
> > >> ...
> > >> ? sys_ioctl
> > >>
> > >
> > > Can you reconfirm that backing out both of those patches makes this 100%
> > > reliable or is it just a lot harder to trigger. It does not even appear
> > > that it's locked up within the page allocator at this trace message.
> > > Assuming c1c05580 is where it's stuck at, where does addr2line say that
> > > is (requires CONFIG_DEBUG_INFO) ?
> >
> > The new hang happened with only one patch applied (my "uswsusp:
> > automatically free the in-memory image once s2disk has finished with
> > it").
> >
>
> Ok. I'm leaning towards believing that the system is extremely
> borderline and what c1c05580 is doing is changing very slightly how many
> pages are available. Why it makes a difference on uni-core, I have no
> idea but it could be very small differences in available memory as it
> does increase the size of some in-kernel structures.

It should be very easy to test that theory, right? Just reduce
PAGES_FOR_IO to 3.9MB, and if it breaks, you know the system was
borderline.
                                                        Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
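
For scale, here is a rough sketch of the arithmetic behind the suggested test. It is not kernel code. It assumes 4 KiB pages (PAGE_SHIFT of 12, as on 32-bit x86) and that PAGES_FOR_IO is currently a 4 MB reserve expressed in pages, which is how I recall it being defined in kernel/power/power.h; check your tree before relying on that. The helper names are made up for illustration.

```c
#include <stdio.h>

/* Assumption: 4 KiB pages, i.e. PAGE_SHIFT == 12 (typical for 32-bit x86). */
#define SKETCH_PAGE_SHIFT 12

/* Hypothetical helper: convert a byte count to whole pages, rounding down,
 * in the same style as a "(bytes) >> PAGE_SHIFT" macro. */
static unsigned long bytes_to_pages(unsigned long bytes)
{
    return bytes >> SKETCH_PAGE_SHIFT;
}

/* Hypothetical helper: print how many pages the reserve shrinks by when it
 * goes from 4 MB to roughly 3.9 MB (3993 KiB is used as the approximation). */
static void print_reserve_comparison(void)
{
    unsigned long current_pages = bytes_to_pages(4096UL * 1024);
    unsigned long reduced_pages = bytes_to_pages(3993UL * 1024);

    printf("4 MB reserve:    %lu pages\n", current_pages);
    printf("~3.9 MB reserve: %lu pages\n", reduced_pages);
    printf("difference:      %lu pages\n", current_pages - reduced_pages);
}
```

Calling print_reserve_comparison() from a small main() shows that the change amounts to a couple of dozen pages under these assumptions. That is the point of the experiment: if a difference that small decides whether hibernation hangs, the system was already borderline.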