From: Dave Anderson <anderson@xxxxxxxxxx>
Subject: Re: [ANNOUNCE][RFC] gcore extension module: user-mode process core dump
Date: Fri, 28 Jan 2011 09:31:50 -0500 (EST)

>
> ----- Original Message -----
>
>> Also, I have a question about the fact that gcore hung while
>> gathering note information.
>>
>> I attempted to reproduce the bug on 2.6.35.10-74.fc14.x86_64 with
>> crash-5.0.6-2.fc14.x86_64 and crash-5.1.1, but it has not been
>> reproduced yet: gcore worked well with both crash versions.
>>
>> I then retried using 2.6.34-2.fc14.x86_64, but it failed to boot in the
>> same environment as 2.6.35.10-74.fc14.x86_64.
>>
>> So, my questions are: In what kind of environment did you see
>> the hang? I want to, and need to, set up the same environment as
>> yours. In Fedora Alpha, the kernel version was already 2.6.35
>> according to the release notes:
>>
>> http://fedoraproject.org/wiki/Fedora_14_Alpha_release_notes#Linux_Kernel_2.6.35
>>
>> Also, it would be helpful if you could show me a backtrace taken while
>> gcore is hanging.
>
> I retested it with the latest gcore.tar.bz2 using the same fc14 dumpfile
> and it works OK.
>

That's good news. I have now confirmed that the cause is in
restore_frame_pointer().

> I did re-verify that it hangs with the older version:
>
> # ls -l /root/gcore.tar.bz2 gcore.tar.bz2
> -rw-r--r-- 1 root root 28666 Jan 24 11:05 /root/gcore.tar.bz2 <- hangs
> -rw-r--r-- 1 root root 29266 Jan 27 10:15 gcore.tar.bz2 <- works OK
> #
>
> (gdb) bt
> #0 0x0000003e838cd6a0 in __lseek_nocancel () from /lib64/libc.so.6
> #1 0x0000000000534fd8 in read_netdump (fd=-1, bufptr=0x7fffeb5977e0, cnt=8, addr=18446612134417074248, paddr=2102855752)
>    at netdump.c:526
> #2 0x000000000053b663 in read_kdump (fd=-1, bufptr=0x7fffeb5977e0, cnt=8, addr=18446612134417074248, paddr=2102855752)
>    at netdump.c:2553
> #3 0x000000000046bc1b in readmem (addr=18446612134417074248, memtype=1, buffer=0x7fffeb5977e0, size=8,
>    type=0x2b95faf6d370 "restore_frame_pointer: resume rbp", error_handle=5) at memory.c:1849
> #4 0x00002b95faf6980c in restore_frame_pointer () from ./extensions/gcore.so
> #5 0x00002b95faf6a196 in restore_rest () from ./extensions/gcore.so
> #6 0x00002b95faf69d51 in genregs_get () from ./extensions/gcore.so
> #7 0x00002b95faf6585c in fill_thread_core_info () from ./extensions/gcore.so
> #8 0x00002b95faf65ccc in fill_note_info () from ./extensions/gcore.so
> #9 0x00002b95faf64755 in gcore_coredump () from ./extensions/gcore.so
> #10 0x00002b95faf6a95e in do_gcore () from ./extensions/gcore.so
> #11 0x00002b95faf6a7f9 in cmd_gcore () from ./extensions/gcore.so
> #12 0x0000000000454631 in exec_command () at main.c:674
> #13 0x00000000004544de in main_loop () at main.c:633
> #14 0x0000000000578b39 in captured_command_loop (data=0x3) at ./main.c:226
> #15 0x0000000000577cfb in catch_errors (func=0x578b30 <captured_command_loop>, func_args=0x0, errstring=0x82092c "",
>    mask=<value optimized out>) at exceptions.c:520
> #16 0x0000000000579286 in captured_main (data=<value optimized out>) at ./main.c:924
> #17 0x0000000000577cfb in catch_errors (func=0x578b70 <captured_main>, func_args=0x7fffeb597f70, errstring=0x82092c "",
>    mask=<value optimized out>) at exceptions.c:520
> #18 0x00000000005788d4 in gdb_main (args=0x7d56fb40) at ./main.c:939
> #19 0x0000000000578916 in gdb_main_entry (argc=<value optimized out>, argv=0x7d56fb40) at ./main.c:959
> #20 0x00000000004d2b7d in gdb_main_loop (argc=2, argv=0x7fffeb598478) at gdb_interface.c:78
> #21 0x0000000000454281 in main (argc=3, argv=0x7fffeb598478) at main.c:547
> (gdb)

Thanks for the backtrace. It helps a lot.

It looks to me as if restore_frame_pointer() is looping here, during
what should be the trivial operation of tracing frame pointers on the
stack. Judging from the situation, my guess is that the frame pointer
values themselves form a loop on the kernel stack. Perhaps some of the
frame pointers in the series are broken?

>
> If you're still interested, I can make the vmlinux/vmcore available to you.

Yes, I'm still interested. Could you provide them to me? I need to work
out the exact state of the kernel stack that leads to this behaviour of
restore_frame_pointer().

Thanks,
HATAYAMA Daisuke

--
Crash-utility mailing list
Crash-utility@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/crash-utility
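
For illustration only: the suspected failure mode (a chain of saved frame
pointers that loops back on itself) can be sketched as below. This is not
the gcore code. The function name, the status codes, and the in-memory
stack image are all hypothetical, and the sketch assumes a downward-growing
stack, so each saved frame pointer must lie at a strictly higher address
than the frame that stores it. Under that assumption a walker can reject a
looping or broken chain instead of spinning forever.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result codes for this sketch. */
enum fp_walk_status {
    FP_WALK_OK,      /* chain left the stack range normally */
    FP_WALK_CORRUPT  /* chain was misaligned or did not move upward */
};

/*
 * Walk a chain of saved frame pointers inside a copy of a stack.
 *
 * stack      : stack contents, one 64-bit word per slot
 * nwords     : number of slots in the image
 * stack_base : virtual address of stack[0] (lowest address, 8-byte aligned)
 * fp         : frame pointer to start from (a virtual address)
 * last_fp    : on FP_WALK_OK, the last frame pointer found inside the stack
 */
static enum fp_walk_status
walk_frame_pointers(const uint64_t *stack, size_t nwords,
                    uint64_t stack_base, uint64_t fp, uint64_t *last_fp)
{
    const uint64_t stack_end = stack_base + nwords * sizeof(uint64_t);

    assert(stack != NULL && last_fp != NULL);

    /* The starting frame pointer must itself lie inside the stack. */
    if (fp < stack_base || fp >= stack_end)
        return FP_WALK_CORRUPT;

    for (;;) {
        uint64_t next;

        if ((fp - stack_base) % sizeof(uint64_t) != 0)
            return FP_WALK_CORRUPT;

        /* The slot addressed by fp holds the caller's saved frame pointer. */
        next = stack[(fp - stack_base) / sizeof(uint64_t)];

        /* A value outside the stack ends the chain. */
        if (next < stack_base || next >= stack_end) {
            *last_fp = fp;
            return FP_WALK_OK;
        }

        /*
         * Callers live at higher addresses. A saved value that is not
         * strictly higher means a broken or looping chain, so stop here.
         * Because fp strictly increases and is bounded by stack_end,
         * the loop always terminates.
         */
        if (next <= fp)
            return FP_WALK_CORRUPT;

        fp = next;
    }
}
```

With a check like this, a stack in which one saved frame pointer refers back
to an earlier frame yields FP_WALK_CORRUPT rather than a hang. Whether the
real restore_frame_pointer() can apply the same monotonicity rule depends on
details of the kernel stack layout that this sketch does not model.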