----- Original Message -----
> Hello Dave,
>
> Thanks for your observations.
> > I'll fix unwind_x86_64.h to prevent this build warning:
> >
> > # make extensions
> > ...
> > gcc -Wall -I.. -I./libgcore -fPIC -DX86_64 -c -o
> > libgcore/gcore_x86.o libgcore/gcore_x86.c
> > In file included from libgcore/gcore_x86.c:19:
> > ../unwind_x86_64.h:61:1: warning: "offsetof" redefined
> > In file included from libgcore/gcore_x86.c:17:
> > ../defs.h:60:1: warning: this is the location of the previous
> > definition
> > ...
> >
>
> The warning is caused by IO_BITMAP_OFFSET that is defined but unused
> in gcore_x86.c. So, it seems to me that part to be fixed is
> gcore_x86.c, not unwind_x86_64.h.

Maybe, but it should also be fixed in unwind_x86_64.h like this:

--- unwind_x86_64.h 30 Nov 2010 19:40:30 -0000 1.4
+++ unwind_x86_64.h 24 Jan 2011 20:54:25 -0000 1.5
@@ -58,7 +58,9 @@
 extern void init_unwind_table(void);
 extern void free_unwind_table(void);

+#ifndef offsetof
 #define offsetof(TYPE, MEMBER) ((size_t) &((TYPE *)0)->MEMBER)
+#endif
 #define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
 #define BUILD_BUG_ON(condition) ((void)sizeof(char[1 - 2*!!(condition)]))
 #define BUILD_BUG_ON_ZERO(e) (sizeof(struct { int:-!!(e); }))

Your module is the first C source file that #include's defs.h and
then unwind_x86_64.h. The change above to unwind_x86_64.h just does
the same thing as defs.h.

>
> > But the gcore.mk file should gracefully fail to build on non-supported
> > architectures. It ends up spewing ~200 lines of error messages when
> > attempted, for example, on a ppc64 machine:
>
> Yes, I know it behaves like this if we make it run on unsupported
> architectures. I'd understood it was implicitly permitted by looking
> at similar build error of sial. But if it's wrong in fact, I'll make
> it buildable on unsupported architectures.

Or you could just catch it in the gcore.mk by doing something like this:

ARCH=UNSUPPORTED

ifeq ($(shell arch), x86_64)
ARCH=SUPPORTED
endif

ifeq ($(shell arch), i686)
ARCH=SUPPORTED
endif

all: gcore.so

gcore.so: gcore.c
	@if [ ${ARCH} = "UNSUPPORTED" ]; then \
	echo "gcore: architecture not supported"; else \
	echo "do build here..."; fi;

>
> gcore includes part that can be shared commonly among different
> architectures. This is mostly equal to anything but part of collecting
> kinds of note information that are inherently architecture specific.
>
> I'll fix here so that gcore on unsupported architectures are providing
> ELF core only with PT_LOAD sections.
>
> >
> > Your documentation implies that the command would only work on
> > certain kernel versions:
> >
> >> Compared with the previous version, this release:
> >> - supports more kernel versions, and
> >> - collects register values more accurately (but still not perfect).
> >>
> >> Support Range
> >> =============
> >>
> >> |----------------+----------------------------------------------|
> >> | ARCH           | X86, X86_64                                  |
> >> |----------------+----------------------------------------------|
> >> | Kernel Version | RHEL4.8, RHEL5.5, RHEL6.0 and Vanilla 2.6.36 |
> >> |----------------+----------------------------------------------|
> >
> >
> > But, for example, on a 2.6.34-2.fc14 kernel (presumably unsupported),
> > it seems to work OK on some tasks, but on others it doesn't work so well.
> > Here, the "less" command can be dumped OK:
> >
> >
> > crash> sys | grep RELEASE
> > RELEASE: 2.6.34-2.fc14.x86_64
> > crash> ps
> > ... [ cut ] ...
> > >  2080   1490   0  ffff880079ed2480  RU   7.6  289900 159684  crash
> >    2084      1   0  ffff880077a7a480  IN   0.1  248592   1936  rsyslogd
> >    2090   2080   5  ffff880079ed4900  IN   0.0  105432    828  less
> > crash> gcore -v0 2090
> > Saved core.2090.less
> > crash>
> >
> > But with the same (full) 2.6.34-2.fc14 dumpfile, it can't seem to handle
> > dumping the crash utility itself, and just hangs:
> >
> > crash> swap
> > FILENAME   TYPE       SIZE       USED  PCT  PRIORITY
> > /dev/dm-1  PARTITION  18579452k  0k    0%   -1
> > crash> ps
> > ... [ cut ] ...
> > >  2080   1490   0  ffff880079ed2480  RU   7.6  289900 159684  crash
> >    2084      1   0  ffff880077a7a480  IN   0.1  248592   1936  rsyslogd
> >    2090   2080   5  ffff880079ed4900  IN   0.0  105432    828  less
> > crash> gcore -v1 2080
> > gcore: Restoring the thread group ...
> > gcore: done.
> > gcore: Retrieving note information ...
> >
> > < hangs forever >
> >
> > ...
> >
> > I would have thought that it would either work-for-all or work-for-none
> > with respect to a particular kernel version?
>
> Sorry, I have no idea on what you mean by ``work-for-all or work-for-none''.
> ``supported kernel versions'' stands for ``I tested gcore
> extension module on these kernels''. There's possibility for gcore to
> work well even on different kernel versions if there's no
> incompatibility among the kernel versions.

But the "less" and "crash" command examples were from the same dumpfile,
so I didn't understand why gcore would work for one command, but not
for another command -- from the same kernel version?

> >
> > In any case, if it's going to fail, perhaps there should be some mechanism
> > in place that would prevent it from hanging, and instead print a message
> > that the kernel version is not supported? Or if a particular data structure
> > is different than the "supported" versions, it should fail immediately?
> > Just a thought...
>
> I agree to the former idea. I believe gcore has enough chance to
> work well on unsupported kernels.
>
> The hanging part is likely to be restore_frame_pointer() that runs
> only when the analyzed kernel is built with CONFIG_FRAME_POINTER=y and
> user-space frame pointer is available by looking at the base pointer
> in order.
>
> If the kernel stack frame is in a messy condition, unwinding by the
> function can behave in any unexpected way.
>
> I'll fix here by adding a limit that bounds the number of traced
> frames to some finite number. Kernel stack size would be enough here.
>
> >
> > Also I note that "gcore -v7" fails -- shouldn't it be accepted as an
> > argument?
> >
> > crash> gcore -v7 2080
> > gcore: invalid vlevel: 7.
> > crash>
>
> Oh, sorry. This is just a bug that should have been removed by unit testing.
> Thanks.
>
> I'll post again fixed version soon. Please wait for a while.

OK thanks,
  Dave

--
Crash-utility mailing list
Crash-utility@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/crash-utility
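
For reference, the proposed fix of bounding the frame-pointer trace could
look roughly like the sketch below. This is not the gcore code: struct
fake_stack, read_saved_fp() and walk_frames() are hypothetical names, and
the "stack" is just an array standing in for memory read from the dumpfile.
It only illustrates giving up after a fixed number of frames, so that a
corrupt or cyclic chain returns an error instead of hanging. In a real fix
the limit might be derived from the kernel stack size, as suggested above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for dumpfile memory: next_fp[i] is the saved
 * frame pointer stored in frame i. A value of 0 terminates the chain. */
struct fake_stack {
	const uint64_t *next_fp;
	size_t nframes;
};

/* Read the saved frame pointer of frame fp.
 * Returns 0 on success, -1 if fp lies outside the fake stack. */
static int read_saved_fp(const struct fake_stack *s, uint64_t fp,
			 uint64_t *out)
{
	if (fp >= s->nframes)
		return -1;
	*out = s->next_fp[fp];
	return 0;
}

/* Follow the frame-pointer chain starting at fp.
 * Returns the number of frames visited, or -1 if a read fails or more
 * than max_frames frames would be visited (corrupt or cyclic chain). */
static long walk_frames(const struct fake_stack *s, uint64_t fp,
			size_t max_frames)
{
	size_t visited = 0;

	assert(s != NULL);

	while (fp != 0) {
		if (visited >= max_frames)
			return -1;	/* give up instead of looping forever */
		if (read_saved_fp(s, fp, &fp) != 0)
			return -1;	/* pointer left the valid range */
		visited++;
	}
	return (long)visited;
}
```

With a limit in place, a chain that loops back on itself makes the walk
fail quickly, and the caller can print an error and skip the frame-pointer
restoration rather than hang.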