Larry McVoy's advice on how to manually bisect 1.3.x kernel bugs is of
historical interest, but that's what the repository is for. It is not
useful to users now.

Signed-off-by: Jonathan Corbet <corbet@xxxxxxx>
---
 Documentation/admin-guide/bug-hunting.rst | 249 ------------------------------
 Documentation/admin-guide/index.rst       |   1 -
 2 files changed, 250 deletions(-)
 delete mode 100644 Documentation/admin-guide/bug-hunting.rst

diff --git a/Documentation/admin-guide/bug-hunting.rst b/Documentation/admin-guide/bug-hunting.rst
deleted file mode 100644
index d35dd9fd1af0..000000000000
--- a/Documentation/admin-guide/bug-hunting.rst
+++ /dev/null
@@ -1,249 +0,0 @@
-Bug hunting
-+++++++++++
-
-Last updated: 20 December 2005
-
-Introduction
-============
-
-Always try the latest kernel from kernel.org and build from source. If you are
-not confident in doing that please report the bug to your distribution vendor
-instead of to a kernel developer.
-
-Finding bugs is not always easy. Have a go though. If you can't find it don't
-give up. Report as much as you have found to the relevant maintainer. See
-MAINTAINERS for who that is for the subsystem you have worked on.
-
-Before you submit a bug report read
-:ref:`Documentation/admin-guide/reporting-bugs.rst <reportingbugs>`.
-
-Devices not appearing
-=====================
-
-Often this is caused by udev. Check that first before blaming it on the
-kernel.
-
-Finding patch that caused a bug
-===============================
-
-
-
-Finding using ``git-bisect``
-----------------------------
-
-Using the provided tools with ``git`` makes finding bugs easy provided the bug
-is reproducible.
-
-Steps to do it:
-
-- start using git for the kernel source
-- read the man page for ``git-bisect``
-- have fun
-
-Finding it the old way
-----------------------
-
-[Sat Mar 2 10:32:33 PST 1996 KERNEL_BUG-HOWTO lm@xxxxxxx (Larry McVoy)]
-
-This is how to track down a bug if you know nothing about kernel hacking.
-It's a brute force approach but it works pretty well.
-
-You need:
-
-        - A reproducible bug - it has to happen predictably (sorry)
-        - All the kernel tar files from a revision that worked to the
-          revision that doesn't
-
-You will then do:
-
-        - Rebuild a revision that you believe works, install, and verify that.
-        - Do a binary search over the kernels to figure out which one
-          introduced the bug. I.e., suppose 1.3.28 didn't have the bug, but
-          you know that 1.3.69 does. Pick a kernel in the middle and build
-          that, like 1.3.50. Build & test; if it works, pick the mid point
-          between .50 and .69, else the mid point between .28 and .50.
-        - You'll narrow it down to the kernel that introduced the bug. You
-          can probably do better than this but it gets tricky.
-
-        - Narrow it down to a subdirectory
-
-          - Copy kernel that works into "test". Let's say that 3.62 works,
-            but 3.63 doesn't. So you diff -r those two kernels and come
-            up with a list of directories that changed. For each of those
-            directories:
-
-                Copy the non-working directory next to the working directory
-                as "dir.63".
-                One directory at time, try moving the working directory to
-                "dir.62" and mv dir.63 dir"time, try::
-
-                        mv dir dir.62
-                        mv dir.63 dir
-                        find dir -name '*.[oa]' -print | xargs rm -f
-
-                And then rebuild and retest. Assuming that all related
-                changes were contained in the sub directory, this should
-                isolate the change to a directory.
-
-                Problems: changes in header files may have occurred; I've
-                found in my case that they were self explanatory - you may
-                or may not want to give up when that happens.
-
-        - Narrow it down to a file
-
-          - You can apply the same technique to each file in the directory,
-            hoping that the changes in that file are self contained.
-
-        - Narrow it down to a routine
-
-          - You can take the old file and the new file and manually create
-            a merged file that has::
-
-                #ifdef VER62
-                routine()
-                {
-                        ...
- } - #endif - - And then walk through that file, one routine at a time and - prefix it with:: - - #define VER62 - /* both routines here */ - #undef VER62 - - Then recompile, retest, move the ifdefs until you find the one - that makes the difference. - -Finally, you take all the info that you have, kernel revisions, bug -description, the extent to which you have narrowed it down, and pass -that off to whomever you believe is the maintainer of that section. -A post to linux.dev.kernel isn't such a bad idea if you've done some -work to narrow it down. - -If you get it down to a routine, you'll probably get a fix in 24 hours. - -My apologies to Linus and the other kernel hackers for describing this -brute force approach, it's hardly what a kernel hacker would do. However, -it does work and it lets non-hackers help fix bugs. And it is cool -because Linux snapshots will let you do this - something that you can't -do with vendor supplied releases. - -Fixing the bug -============== - -Nobody is going to tell you how to fix bugs. Seriously. You need to work it -out. But below are some hints on how to use the tools. - -To debug a kernel, use objdump and look for the hex offset from the crash -output to find the valid line of code/assembler. Without debug symbols, you -will see the assembler code for the routine shown, but if your kernel has -debug symbols the C code will also be available. (Debug symbols can be enabled -in the kernel hacking menu of the menu configuration.) For example:: - - objdump -r -S -l --disassemble net/dccp/ipv4.o - -.. note:: - - You need to be at the top level of the kernel tree for this to pick up - your C files. - -If you don't have access to the code you can also debug on some crash dumps -e.g. crash dump output as shown by Dave Miller:: - - EIP is at ip_queue_xmit+0x14/0x4c0 - ... 
-     Code: 44 24 04 e8 6f 05 00 00 e9 e8 fe ff ff 8d 76 00 8d bc 27 00 00
-     00 00 55 57 56 53 81 ec bc 00 00 00 8b ac 24 d0 00 00 00 8b 5d 08
-     <8b> 83 3c 01 00 00 89 44 24 14 8b 45 28 85 c0 89 44 24 18 0f 85
-
-     Put the bytes into a "foo.s" file like this:
-
-            .text
-            .globl foo
-     foo:
-            .byte .... /* bytes from Code: part of OOPS dump */
-
-     Compile it with "gcc -c -o foo.o foo.s" then look at the output of
-     "objdump --disassemble foo.o".
-
-     Output:
-
-     ip_queue_xmit:
-         push       %ebp
-         push       %edi
-         push       %esi
-         push       %ebx
-         sub        $0xbc, %esp
-         mov        0xd0(%esp), %ebp        ! %ebp = arg0 (skb)
-         mov        0x8(%ebp), %ebx         ! %ebx = skb->sk
-         mov        0x13c(%ebx), %eax       ! %eax = inet_sk(sk)->opt
-
-In addition, you can use GDB to figure out the exact file and line
-number of the OOPS from the ``vmlinux`` file. If you have
-``CONFIG_DEBUG_INFO`` enabled, you can simply copy the EIP value from the
-OOPS::
-
-    EIP: 0060:[<c021e50e>] Not tainted VLI
-
-And use GDB to translate that to human-readable form::
-
-    gdb vmlinux
-    (gdb) l *0xc021e50e
-
-If you don't have ``CONFIG_DEBUG_INFO`` enabled, you use the function
-offset from the OOPS::
-
-    EIP is at vt_ioctl+0xda8/0x1482
-
-And recompile the kernel with ``CONFIG_DEBUG_INFO`` enabled::
-
-    make vmlinux
-    gdb vmlinux
-    (gdb) p vt_ioctl
-    (gdb) l *(0x<address of vt_ioctl> + 0xda8)
-
-or, as one command::
-
-    (gdb) l *(vt_ioctl + 0xda8)
-
-If you have a call trace, such as::
-
-     Call Trace:
-      [<ffffffff8802c8e9>] :jbd:log_wait_commit+0xa3/0xf5
-      [<ffffffff810482d9>] autoremove_wake_function+0x0/0x2e
-      [<ffffffff8802770b>] :jbd:journal_stop+0x1be/0x1ee
-      ...
-
-this shows the problem in the :jbd: module. You can load that module in gdb
-and list the relevant code::
-
-    gdb fs/jbd/jbd.ko
-    (gdb) p log_wait_commit
-    (gdb) l *(0x<address> + 0xa3)
-
-or::
-
-    (gdb) l *(log_wait_commit + 0xa3)
-
-
-Another very useful option of the Kernel Hacking section in menuconfig is
-Debug memory allocations. This will help you see whether data has been
-initialised and not set before use etc. To see the values that get assigned
-with this look at ``mm/slab.c`` and search for ``POISON_INUSE``. When using
-this an Oops will often show the poisoned data instead of zero which is the
-default.
-
-Once you have worked out a fix please submit it upstream. After all open
-source is about sharing what you do and don't you want to be recognised for
-your genius?
-
-Please do read
-ref:`Documentation/process/submitting-patches.rst <submittingpatches>` though
-to help your code get accepted.
diff --git a/Documentation/admin-guide/index.rst b/Documentation/admin-guide/index.rst
index 2872c0c70ea4..2d0a302e8773 100644
--- a/Documentation/admin-guide/index.rst
+++ b/Documentation/admin-guide/index.rst
@@ -25,7 +25,6 @@ problems and bugs in particular.
 
    reporting-bugs
    security-bugs
-   bug-hunting
    oops-tracing
    ramoops
    dynamic-debug-howto
-- 
2.7.4