Re: reporting SMP/ACPI boot lockup

On Tue, Jul 10, 2007 at 01:45:54PM -0400, timotheus wrote:
> I am looking to gather enough information about my kernel to report a rather
> troublesome kernel bug; but am stuck on how to debug the issue.
> 
> My machine is a Core 2 Duo T7200 laptop:
>     http://tstotts.net/linux/gentooinsp640m.html
> 
> The issue is this:
> 
>    At early boot time the computer occasionally halts (the CPU freezes),
>    requiring a physical reset and a fresh attempt to power on.
>    This occurs in perhaps 1 of 8 boots.
> 
>    With kernel 2.6.21, the last messages displayed on the console were those
>    of SMP hotplugging -- something along the lines of hotplugging CPU #1 and
>    enabling NOHZ operation. The messages show no errors; it just deadlocks.

CPU hotplug with a laptop? I don't think you can hotplug an extra core
into a Core2 Duo CPU.

>    With kernel 2.6.22, the last messages displayed on the console are mostly
>    about ACPI flags/registers/values being incorrect/unexpected, with an
>    explicit error message.

The explicit error message would be very interesting to the linux-acpi
list. It might also be a BIOS problem; try updating the BIOS and see if
that solves the problem.

>    The deadlock always occurs somewhat before the initialization of the VESA
>    driver, but appears completely unrelated to the video.
> 
>    The deadlock appears to be directly related to one or more of: CPU
>    hotplugging, NOHZ timer, generic ACPI support, generic PNP support.
> 
> Any recommendations on how to do the following would be greatly appreciated:
> 
>    Determine the driver that deadlocks.

Enable Soft Lockup detection, RT Mutex debugging, Lock debugging, and
Spinlock debugging (all under Kernel hacking --> Kernel debugging). Be
sure to compile the kernel with frame pointers (also under Kernel
hacking) so the backtraces will be reliable.
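
To double-check that the options actually ended up in the kernel config
before rebuilding, something like the sketch below could help. The symbol
names are my assumption of how those menu entries map to CONFIG_ symbols
in 2.6.2x kernels (verify them against lib/Kconfig.debug in your tree),
and missing_options is just a hypothetical helper name.

```python
import re

# Assumed mapping of the menu entries above to 2.6.2x config symbols;
# check lib/Kconfig.debug in your own tree before relying on this list.
WANTED = [
    "CONFIG_DEBUG_KERNEL",       # Kernel debugging (parent option)
    "CONFIG_DETECT_SOFTLOCKUP",  # Soft Lockup detection
    "CONFIG_DEBUG_RT_MUTEXES",   # RT Mutex debugging
    "CONFIG_DEBUG_SPINLOCK",     # Spinlock debugging
    "CONFIG_DEBUG_MUTEXES",      # Mutex debugging
    "CONFIG_PROVE_LOCKING",      # Lock debugging: prove locking correctness
    "CONFIG_FRAME_POINTER",      # Frame pointers for reliable backtraces
]


def missing_options(config_text, wanted=WANTED):
    """Return the wanted symbols that are not set to 'y' in a .config text."""
    enabled = set(re.findall(r"^(CONFIG_\w+)=y$", config_text, flags=re.MULTILINE))
    return [name for name in wanted if name not in enabled]
```

For example, print(missing_options(open(".config").read())) run from the
top of the kernel source tree lists whatever still needs to be enabled.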

Playing with the various sysrq combinations might also give interesting
information about what's wrong.

>    Log the very early boot console text to a file for easy copy-and-paste into
>    Bugzilla. (The text is always earlier than the vga= initialization
>    routine.)

Try a serial console or netconsole. Or take a picture of the screen with
a digital camera.
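
For netconsole, the receiving side can be as simple as the sketch below,
run on a second machine. It assumes the test kernel is booted with a
parameter along the lines of
netconsole=6665@10.0.0.2/eth0,6666@10.0.0.1/00:11:22:33:44:55
where the addresses, interface and MAC are placeholders for your own
network and 6666 is the port the listener binds to. The function names
are hypothetical. Keep in mind that netconsole only starts logging once
the network driver is up, so it may miss the very earliest messages.

```python
import socket


def open_listener(host="0.0.0.0", port=6666):
    """Open a UDP socket on the port the kernel sends netconsole output to."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def capture_netconsole(sock, logfile, max_packets=None):
    """Append each received datagram to logfile; return the packet count.

    With max_packets=None this loops until interrupted.
    """
    count = 0
    while max_packets is None or count < max_packets:
        data, _addr = sock.recvfrom(65535)
        # Kernel messages are plain text; replace any undecodable bytes.
        logfile.write(data.decode("utf-8", errors="replace"))
        logfile.flush()  # keep the file current in case the sender hangs
        count += 1
    return count
```

Calling capture_netconsole(open_listener(), open("netconsole.log", "a"))
and stopping it with Ctrl-C after the hang leaves a text file that can
be pasted into Bugzilla.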


Erik

-- 
They're all fools. Don't worry. Darwin may be slow, but he'll
eventually get them. -- Matthew Lammers in alt.sysadmin.recovery
