On Wednesday 11 November 2009, Ferenc Wagner wrote:
> "Rafael J. Wysocki" <rjw@xxxxxxx> writes:
>
> > On Wednesday 11 November 2009, Ferenc Wagner wrote:
> >
> >> "Rafael J. Wysocki" <rjw@xxxxxxx> writes:
> >>
> >>> On Thursday 29 October 2009, Ferenc Wagner wrote:
> >>>
> >>>> "Rafael J. Wysocki" <rjw@xxxxxxx> writes:
> >>>>
> >>>>> On Wednesday 28 October 2009, Ferenc Wagner wrote:
> >>>>>
> >>>>>> 2.6.32-rc5 feels particularly bad, with frequent failures to switch
> >>>>>> off the machine after "S|" or freezes after "Snapshotting system".
> >>>>>> The former does not cause much trouble in itself, as the machine can
> >>>>>> be switched off and resumed all right, but the latter is nasty.
> >>>>>> Suspend to RAM works all the time. The issue is not reproducible,
> >>>>>> unfortunately, and the kernel change happened almost together with a
> >>>>>> BIOS upgrade. Yesterday I switched back to 2.6.31 to see whether it
> >>>>>> still works stably with the new BIOS. I'll report back my findings in
> >>>>>> a couple of days.
> >>>>>
> >>>>> OK, thanks.
> >>>>>
> >>>>> Still, I'm really afraid we won't be able to debug it any further without a
> >>>>> reproducible test case.
> >>>>
> >>>> Can't you perhaps suggest a way forward there? Or some tricks to create a
> >>>> reproducible test case here?
> >>>
> >>> Well, you can test if the problem is reproducible in the "shutdown" mode of
> >>> hibernation.
> >>
> >> Well, both failure modes happen with "shutdown" mode as well (the S|
> >> freeze with yesterday's git, too), but still not reproducibly. When
> >> s2disk is stuck in "Snapshotting system", the system is not completely
> >> dead, it echoes line feeds and Ctrl-C at least (as added to #14504).
> >>
> >> I wonder what you did if the issue was reproducible... Is that totally
> >> unapplicable if the problem happens with 10% probability only? Slow,
> >> sure, but until I manage to set up an automated testing bench...
> >
> > I would try to identify the commit that made the problem appear using git
> > bisection. However, this is really difficult with problems that are not
> > reliably reproducible.
>
> Indeed. I'm thinking about setting up a script, which does nothing but
> hibernates the laptop in a loop, and get my router provide a constant
> stream of WOL packets to restart it. If it always freezes in bounded
> time that will make bisecting possible, if slow.

Alternatively, you can use the RTC alarm to wake up the machine.

> > Failing that, I would add some instrumentation to the code to identify the
> > exact place where it hangs.
>
> I managed to achieve this with my STR problem, see
> http://bugs.freedesktop.org/show_bug.cgi?id=22126#c17, but maybe that
>   status = acpi_evaluate_object(NULL, METHOD_NAME__PTS, &arg_list, NULL);
> wasn't deep enough, as it got no followup. How deep should one go to be
> useful?

No, this is deep enough and indicates a BIOS issue.

> I can probably do so again, if slower; but this case may also be easier
> if I can depend on working console output. Which are the interesting
> parts for instrumentation? Can those parts produce console output to
> VGA or netconsole? Wouldn't switching on ACPI debugging before invoking
> s2disk be useful? Which parts of it (to avoid it spitting out MBs of
> useless characters)?

I usually don't do that and if the issue is reproducible in the "shutdown"
mode, ACPI is most probably not involved.

> > BTW, did you carry out the /sys/power/pm_test "core" test on the box?
>
> I'm not clear on how to do that with user space suspend. Simply set it
> to "cores" before invoking s2disk?

Yes, echo "core" to /sys/power/pm_test before executing s2disk.

> I already did the test for STR (see
> http://bugs.freedesktop.org/show_bug.cgi?id=22126#c3), but will redo
> with the current kernel tonight.

OK, thanks.
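
For illustration only, the hibernate-in-a-loop idea with an RTC wakeup
could look roughly like the sketch below. Nothing here comes from an
existing tool: the function names, the rtc0 path, the 120 second wake
delay and the 30 second settle time are all assumptions and would need
adjusting for the box in question. It reads the RTC's own idea of the
time from since_epoch, so it should not matter whether the hardware
clock runs in UTC or local time. The pm_test helper is just the "echo"
mentioned above wrapped in a function.

```python
import subprocess
import time
from pathlib import Path

# Assumed locations: rtc0 is usually the first RTC, but check the machine.
RTC_DIR = Path("/sys/class/rtc/rtc0")
PM_TEST = Path("/sys/power/pm_test")


def set_pm_test(level="core", pm_test=PM_TEST):
    """Write a pm_test level such as "core"; "none" switches test mode off."""
    pm_test.write_text(level + "\n")


def arm_rtc_alarm(seconds, rtc_dir=RTC_DIR):
    """Program the RTC to fire `seconds` from now, measured on the RTC's own clock."""
    rtc_now = int((rtc_dir / "since_epoch").read_text())
    wake_at = rtc_now + seconds
    alarm = rtc_dir / "wakealarm"
    alarm.write_text("0\n")           # clear any alarm that is still pending
    alarm.write_text(f"{wake_at}\n")
    return wake_at


def hibernate_loop(cycles, wake_after=120, settle=30, command=("s2disk",),
                   rtc_dir=RTC_DIR, run=subprocess.run, sleep=time.sleep):
    """Hibernate up to `cycles` times; return how many cycles completed."""
    completed = 0
    for cycle in range(1, cycles + 1):
        arm_rtc_alarm(wake_after, rtc_dir)
        print(f"cycle {cycle}: hibernating", flush=True)
        result = run(list(command))   # returns only after the machine resumes
        if result.returncode != 0:
            print(f"cycle {cycle}: {command[0]} exited with {result.returncode}")
            break
        completed += 1
        sleep(settle)                 # let the system settle before the next cycle
    return completed
```

Run as root, something like hibernate_loop(cycles=50) would do, with the
output redirected to a file that gets synced to disk. The script cannot
detect a freeze by itself; a hang simply shows up as a last "hibernating"
line with no successor, and the cycle number tells how long it took.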

Best,
Rafael