----- Original Message ----- > We have a new hardware to do dump, and use makedumpfile to generate > vmcore. Our hardware can work when the OS is out of controll(for > example: dead loop). When we use crash to analyze the vmcore, bt can > not work, because there is no panic task. > > We have provide the value of register in the vmcore(the format is > elf_prstatus, it is same with normal kdump's vmcore). So we can use > it when we do not find panic task. Let me state first that I get very nervous when patches are made to commonly used functions to handle special cases such as this. When those kinds of changes are proposed, I much prefer that the new code *only* apply to the special cases, and should not affect anything else. I do not have any x86 or x86_64 compressed kdumps that contain ELF note data, so I cannot really test this patch. However, when I do run this patch against a sample set of ~150 x86_64 dumpfiles, I see errors on simple backtraces such as the following examples. Without the patch: crash> bt ffff81007e54c100 PID: 0 TASK: ffff81007e54c100 CPU: 1 COMMAND: "swapper" #0 [ffff81007e565ef0] cpu_idle at ffffffff80047282 crash> With your patch: crash> bt ffff81007e54c100 PID: 0 TASK: ffff81007e54c100 CPU: 1 COMMAND: "swapper" Segmentation fault And it doesn't work with Xen kernels very well. Without the patch: crash> bt ffff8800009c0040 PID: 0 TASK: ffff8800009c0040 CPU: 1 COMMAND: "swapper" #0 [ffff88001d485ef8] safe_halt at ffffffff8011004c #1 [ffff88001d485f28] xen_idle at ffffffff801092f4 #2 [ffff88001d485f38] cpu_idle at ffffffff801093b5 crash> With your patch: crash> bt ffff8800009c0040 PID: 0 TASK: ffff8800009c0040 CPU: 1 COMMAND: "swapper" bt: cannot determine starting stack pointer crash> Without the patch: crash> bt -a ... [ cut ] ... PID: 0 TASK: ffff880000017040 CPU: 1 COMMAND: "swapper" #0 [ffff880033983f38] xen_idle at ffffffff8026c502 #1 [ffff880033983f48] cpu_idle at ffffffff8024a982 PID: 0 TASK: ffff8800000037e0 CPU: 2 COMMAND: "swapper" #0 [ffff880033987f38] xen_idle at ffffffff8026c502 #1 [ffff880033987f48] cpu_idle at ffffffff8024a982 PID: 0 TASK: ffff880000003080 CPU: 3 COMMAND: "swapper" #0 [ffff880033989f38] xen_idle at ffffffff8026c502 #1 [ffff880033989f48] cpu_idle at ffffffff8024a982 With your patch: crash> bt -a ... [ cut ] ... PID: 0 TASK: ffff880000017040 CPU: 1 COMMAND: "swapper" bt: cannot determine starting stack pointer PID: 0 TASK: ffff8800000037e0 CPU: 2 COMMAND: "swapper" bt: cannot determine starting stack pointer PID: 0 TASK: ffff880000003080 CPU: 3 COMMAND: "swapper" bt: cannot determine starting stack pointer crash> So my point is that given that *none* of these kernels even contain makedumpfile-generated ELF data -- so your code should *not* affect them at all. That being the case, I cannot accept the patch as it is currently written. Things work fine as things are now, and I'm not interested in debugging things that break because your patch changes current behavior. Here are my two major suggestions: (1) If it is determined during the initial dumpfile scan that it contains ELF note data, then set a global flag, perhaps in the new pc->flags2, something like MAKEDUMPFILE_ELF_NOTES. (2) Then, segregate your changes *completely* based upon that flag. I would rather have essentially-redundant code put in place instead of breaking currently-existing code. That way, for all dumpfiles where the MAKEDUMPFILE_ELF_NOTES flag is *not* set, then your code should *not* run. And here are a few specific comments: (1) The machdep->process_elf_notes handler was put in place for this kind of thing, so please continue to use it for x86 and x86_64 in the same way that the s390x architecture does. You don't need to pass the machine type as an extra argument, given that "machine_type()" can be used anywhere. The two arches can share the same handler. (2) get_netdump_regs_x86_64: segregate the code based upon the MAKEDUMPFILE_ELF_NOTES flag. (3) get_netdump_regs_x86(): segregate the code based upon the MAKEDUMPFILE_ELF_NOTES flag. (4) Make a new map_cpus_to_prstatus()-type function for this kind of dumpfile. In task_init(), you can check the MAKEDUMPFILE_ELF_NOTES flag and call the new function, and so you won't need your new KDUMP_CMPRS_DUMPFILE() function. (5) x86_64_get_dumpfile_stack_frame(): use the MAKEDUMPFILE_ELF_NOTES flags instead of bt->machdep. And two final requests: (1) Before posting the patch, please build your changes with "make warn". Your changes generate a few warnings. (2) Can make your patches attachments to your email instead of inline? Thanks, Dave -- Crash-utility mailing list Crash-utility@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/crash-utility