----- "Dave Anderson" <anderson@xxxxxxxxxx> wrote: > ----- "Hu Tao" <hutao@xxxxxxxxxxxxxx> wrote: > > > Hi Dave, > > > > There is a bug on get_be_long() that causes high 32 bits truncated. > > As a result, we get wrong registers values from dump file. Patch 1 > > fixes this. > > Good catch! > > > Once we can get right cpu registers values, it's better to use the > > sp/ip for backtracing the active task. This can show a more accurate > > backtrace, not including those invalid frames beyond sp. pathes 2 and > > 3 do this on kvmdump case(virsh dump). > > > > To verify: run that km_probe.c test module on a x86_64 system, then > > `echo q > /proc/sysrq-trigger' to trigger the kprobe which does > > looping in post_handler. Then vrish dump then crash. > > However, I'm wondering whether you actually tested this with a > *crashed* system's dumpfile, and not just with a *live* system dump with > a contrived set of circumstances. And if so, what differences, if any, > did you see with the backtraces of the task that either oops'd or called > panic(), as well as those of the other active tasks that were brought down > by IP interrupt? > > Anyway, I'll give this a thorough testing with a set of sample > dumpfiles that I have on hand. Actually, the patch fails miserably on SMP dumpfiles with segmentation during initialization. And now looking at your patch, I'm wondering whether you even tested this with an SMP system? The change to qemu_load() here: @@ -904,6 +906,9 @@ qemu_load (const struct qemu_device_loader *devices, uint32_t required_features, if (feof (fp) || ferror (fp)) break; + if (STREQ(d->vtbl->name, "cpu")) + result->dx86 = d; + if (sec == QEMU_VM_SECTION_END || sec == QEMU_VM_SECTION_FULL) result->features |= features; } That function cycles through the "cpu" devices for *each* cpu in the system, so this patch will store the device of the last cpu device it encounters. So in an SMP machine, it will store the device for the highest cpu only, right? And then there's this change to get_kvmdump_regs(): @@ -310,7 +311,11 @@ kvmdump_memory_dump(FILE *ofp) void get_kvmdump_regs(struct bt_info *bt, ulong *pc, ulong *sp) { - machdep->get_stack_frame(bt, pc, sp); + if (is_task_active(bt->task)) { + *sp = device_list->dx86->regs[R_ESP]; + *pc = device_list->dx86->eip; + } else + machdep->get_stack_frame(bt, pc, sp); } is_task_active() returns TRUE for all active tasks in an SMP system, not just the panic task. So it would seem you're going to pass back the same registers for all active tasks? And what's the point of this change to kernel.c? diff --git a/kernel.c b/kernel.c index e399099..2627020 100755 --- a/kernel.c +++ b/kernel.c @@ -16,6 +16,7 @@ */ #include "defs.h" +#include "qemu-load.h" #include "xen_hyper_defs.h" #include <elf.h> Also, the change to main.c is unnecessary -- there are dozens of malloc'd memory areas in the program -- so why go to the bother of free'ing just this one prior to exiting? Anyway, instead of saving the device list, I suggest you do something like storing the per-cpu IP/SP values in a separate data structure that can possibly be used as an alternative source for register values for "live dumps" -- and possibly for crashing systems if usable starting hooks cannot be determined in the traditional manner. I had thought of doing something like that in the past, but when I looked at the register values, I must have run into the get_be_long() issue? Dave -- Crash-utility mailing list Crash-utility@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/crash-utility