On Thu, 2008-02-07 at 11:27 -0500, Dave Anderson wrote:
> Andrew Hecox wrote:
> > On Thu, 2008-02-07 at 10:32 -0500, Dave Anderson wrote:
> >> Andrew Hecox wrote:
> >>> hello,
> >>>
> >>> I'm looking at a customer issue where diskdumpmsg is unable to read a
> >>> vmcore file. It is not clear if this a problem with the vmcore file or
> >>> diskdumpmsg. I can load the vmcore with crash and in my naive usage of
> >>> it, can see no problems. However, I'm new to the tool so that doesn't
> >>> give me a lot of confidence.
> >>>
> >>> Does anyone have any suggestions on how or if I can use crash to help
> >>> determine if there's corruption in the vmcore file? Or any other way of
> >>> approaching the problem?
> >>>
> >>> Thanks much,
> >>>
> >>> Andrew
> >>>
> >> I'm not sure what you expect the crash utility to do -- if it comes
> >> up to a prompt with no error or warning messages, it means that the
> >> ELF header contains what appears to be valid usable information,
> >> and that the minimum kernel memory contents required to set up the
> >> crash utility's notion of the running system are all in place. That's
> >> not to say that there is no chance that the vmcore contains some
> >> corruption that was not recognized.
> >>
> >
> > Thanks. Any other suggestions on how to determine if a vmcore is "valid"
> > or is that not even a reasonable question to try and ask? The problem
> > I'm trying to solve is described better below:
> >
> >> With respect to diskdumpmsg, as I understand it, it was fairly recently
> >> changed from a perl script to a C file so that it could be run
> >> earlier in time so as to be able to use the swap partition. Looking
> >> at main() in the diskdumpmsg.c file (version 1.4.1-2), there are numerous
> >> error types and associated error messages. What do you mean when you
> >> say that "diskdumpmsg is unable to read a vmcore file"?
> >
> > Specifically:
> >
> > - user reported a floating point exception from diskdump on startup
> > - the result was reproducible locally but only with their vmcore file
> > - fpe occurred in get_logbuf:
> >       log_end %= log_buf_len;
> > - log_buf_len had been set to 0 in read_buffer
> >       if (!page_is_dumpable(pfn, dump->device)) {
> >           memset(buf, 0, copy_len);
> >       } else {
> > - I don't know enough to say if the page really wasn't dumpable.
> >       static inline bool page_is_dumpable(unsigned int nr, DumpDevice *device)
> >       {
> >           return device->dumpable_bitmap[nr>>3] & (1 << (nr & 7));
> >       }
> > - I wrote a patch with one way to avoid the FPE (attached) and sent it
> >   to SEG.
> >
> > Now I'm trying to determine if the vmcore file should be readable by
> > diskdumpmsg. In other words, is this a problem in diskdumpmsg post-crash
> > or a problem with the vmcore file prior to it getting to diskdumpmsg.
> > Unfortunately, I don't understand the problem domain very well at all,
> > hence the probably naive questions :)
> >
> > Any suggestions are appreciated.
> >
> > -Andrew
>
> So it appears that the page containing the log_buf_len symbol is not
> readable or contained in the dumpfile. BTW, is this a compressed
> dumpfile or an ELF formatted dumpfile? And what "dump_level" did
> they configure?
>

compressed, level is 19.

> Anyway, back to the log_buf_len symbol read, what happens when you
> enter the "log" command while in a crash session? It attempts to
> read that symbol immediately.
>

I get what appears to be a full and valid dump of the kernel message
buffer.
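
For reference, here is a minimal stand-alone sketch of the failure chain as
described in the quoted snippets above: the bitmap test reports the page as
not dumpable, the read is zero-filled, log_buf_len comes back as 0, and the
modulo then divides by zero (undefined behaviour in C, which typically shows
up as SIGFPE on x86). This is not the diskdumputils code. The names
sketch_device, sketch_page_is_dumpable, sketch_read_u32 and
sketch_wrap_log_end are hypothetical stand-ins, and the real read_buffer
copies arbitrary byte ranges rather than a single integer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for DumpDevice: one bit per page frame. */
struct sketch_device {
    const unsigned char *dumpable_bitmap;
};

/* Same bit test as the quoted page_is_dumpable(): byte pfn/8, bit pfn%8. */
static bool sketch_page_is_dumpable(unsigned int pfn,
                                    const struct sketch_device *dev)
{
    assert(dev != NULL && dev->dumpable_bitmap != NULL);
    return (dev->dumpable_bitmap[pfn >> 3] & (1u << (pfn & 7))) != 0;
}

/*
 * Mimics the quoted read_buffer() branch for a single value: a page that
 * is not in the dump reads back as zeros instead of reporting an error.
 * 'stored' plays the role of the value that would be in the dump file.
 */
static void sketch_read_u32(unsigned int pfn, const struct sketch_device *dev,
                            unsigned int stored, unsigned int *out)
{
    if (!sketch_page_is_dumpable(pfn, dev))
        memset(out, 0, sizeof(*out));
    else
        *out = stored;
}

/*
 * Guarded version of "log_end %= log_buf_len": returns -1 instead of
 * dividing by zero, in the spirit of the attached patch.
 */
static int sketch_wrap_log_end(unsigned int log_end, unsigned int log_buf_len,
                               unsigned int *wrapped)
{
    if (log_buf_len == 0)
        return -1;
    *wrapped = log_end % log_buf_len;
    return 0;
}
```

With a bitmap that leaves the page holding log_buf_len unmarked,
sketch_read_u32 yields 0 and sketch_wrap_log_end reports an error rather
than faulting. That still doesn't say whether the bit *should* have been
set at this dump level, which is the part I can't answer.
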
-Andrew

>
> Dave
>
>
> >>
> >> ------------------------------------------------------------------------
> >>
> >> diff -rupN diskdumputils-1.4.1.orig/diskdumpmsg.c diskdumputils-1.4.1/diskdumpmsg.c
> >> --- diskdumputils-1.4.1.orig/diskdumpmsg.c 2008-02-06 14:32:41.000000000 -0500
> >> +++ diskdumputils-1.4.1/diskdumpmsg.c 2008-02-06 15:56:22.000000000 -0500
> >> @@ -208,6 +208,10 @@ static int get_logbuf(DumpFile *dump, ch
> >>
> >>          len = log_end;
> >>      } else {
> >> +        if (!log_buf_len) {
> >> +            ret = READ_ERROR_IN_DUMP_FILE;
> >> +            goto err;
> >> +        }
> >>          log_end %= log_buf_len;
> >>
> >>          ret = read_buffer(dump, log_buf + log_end,
>
--
Crash-utility mailing list
Crash-utility@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/crash-utility