----- Original Message ----- > Hi Dave and list, > > I guess you still remember that a few weeks ago Wen has sent a patch which intends > to use register values in NT_PRSTATUS notes for backtracing if no panic task was > found. Thanks for your review and the suggestions are very useful. > > However, Wen was busy with other work for these days, so I'll continue with this > work and now the 2rd version patch is attached. > > v2: Address review feedbacks. > 1) Set up a flag in pc->flags2 and it's determined during the diskdump file init procedure. > 2) Seperate code according to the that flag. > 3) Also, we've done some tests on dumpfile of xen kernel and the trouble described > in the previous mail was gone. > > So we're looking forward to your feedback and if you still have any problems with > it, please tell us. > > Thanks, > Wang Chao Hello Wang, I haven't had a chance to run a complete test of this latest patch, but I do have several suggestions. diskdump.c: Put the nt_prstatus_percpu array pointer and the num_prstatus_notes into the diskdump_data structure -- like you did with the notes_buf. Make nt_prstatus_percpu a pointer that gets allocated and initialized only if necessary -- because if it's not used, it's a waste of space. In read_dump_header(), when dd->notes_buf is malloc()'d, malloc() the dd->nt_prstatus_percpu array. And at the bottom of read_dump_header(), free both the dd->notes_buf and the dd->nt_prstatus_percpu array if they exist, under the "err:" label: err: ... if (dd->notes_buf) free(dd->notes_buf); if (dd->nt_prstatus_percpu) free(dd->nt_prstatus_percpu; In order to be able to debug/understand why things may fail, especially due to the exclusion of prstatus notes of offline cpus, it would be very helpful to be able to see the dd->nt_prstatus pointers for each cpu. Similar to what is done with the vmcoreinfo data, can you add a section to __diskdump_dump_header(): if (dh->header_version >= 3) { fprintf(fp, " offset_vmcoreinfo: %lx\n", (ulong)dd->sub_header_kdump->offset_vmcoreinfo); fprintf(fp, " size_vmcoreinfo: %lu\n", dd->sub_header_kdump->size_vmcoreinfo); if (dd->sub_header_kdump->offset_vmcoreinfo && dd->sub_header_kdump->size_vmcoreinfo) { dump_vmcoreinfo(fp); } } if (dh->header_version >= 4) { fprintf(fp, " offset_note: %lx\n", (ulong)dd->sub_header_kdump->offset_note); fprintf(fp, " size_note: %lu\n", dd->sub_header_kdump->size_note); here ==================> } Call a function that dumps the file offset of each cpu's nt_prstatus data. Then, if necessary, any cpu's register set can be read with "rd -f". netdump.c: In get_netdump_regs_x86(), both of these settings are incorrect: user_regs = get_regs_from_note((char *)note, &ip, &sp); =====> bt->flags |= BT_KDUMP_ELF_REGS; if (is_kernel_text(ip) && (((sp >= GET_STACKBASE(bt->task)) && (sp < GET_STACKTOP(bt->task))) || in_alternate_stack(bt->tc->processor, sp))) { *eip = ip; *esp = sp; =====> bt->flags |= BT_KERNEL_SPACE; return; } Neither of these flags should be set there. BT_KDUMP_ELF_REGS is only applicable to x86_64, and BT_KERNEL_SPACE is only appicable to kvmdump dumpfiles. Finally, in the interest of paranoia, give the user the capability of *not* using this facility. In main.c, create a "--no_elf_notes" option (similar to "--zero_excluded"), and have it set a NO_ELF_NOTES bit in the globally-accessible "*diskdump_flags". As an example, see how ZERO_EXCLUDED is set, used, and accessed. Then in read_dump_header() skip the new ELF setup if that flag is set. It will not be necessary to go as creating a new environment variable like "zero_excluded", but it would be nice to be able to restrict its use on the crash invocation command line. Thanks, Dave -- Crash-utility mailing list Crash-utility@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/crash-utility