-----Original Message----- > On 9/1/20 8:35 PM, HAGIO KAZUHITO(萩尾 一仁) wrote: > > -----Original Message----- > >> Calls to x86_64_exception_frame() with combined items set in the flags > >> argument that include EFRAME_VERIFY do not have the EFRAME_VERIFY > >> operation performed. I have some cores where multiple cases of > >> attempting to read a not-present pt_regs end a single PID backtrace with > >> a failure. One instance has the pt_regs read overrunning stacktop > >> because the pt_regs is not present and the level's stack position is > >> closer to stacktop than the size of a pt_regs. That results in a > >> backtrace failing before complete with a "seek error" at the start of > >> page after stacktop: > >> > >> crash> bt 7456 > >> PID: 7456 TASK: ffff933fdb960000 CPU: 0 COMMAND: "sh" > >> #0 [fffffe0000009e58] crash_nmi_callback at ffffffff93260e93 > >> ... > >> #9 [ffffaea5c0003f80] hrtimer_interrupt at ffffffff933313d5 > >> bt: seek error: kernel virtual address: ffffaea5c0004000 type: "pt_regs" > >> crash> > >> > >> The correct backtrace would reach level #12 with no seek error. > >> > >> The condition to perform the EFRAME_VERIFY operation tests if the flags > >> value equals EFRAME_VERIFY, not if the value includes EFRAME_VERIFY. The > >> call to x86_64_exception_frame() in x86_64_print_stack_entry() performed > >> when eframe_check >= 0 supplies a flags value of EFRAME_PRINT | > >> EFRAME_VERIFY. > >> > >> In the bt example above backtrace reaches level #9, 128 bytes from the > >> top of the current stack's pages in an IRQ stack and with a function > >> name ending in "_interrupt". This leads to x86_64_print_stack_entry() > >> setting eframe_check to zero and x86_64_exception_frame() being called > >> with flags EFRAME_PRINT | EFRAME_VERIFY. x86_64_exception_frame() > >> doesn't perform the verify because flags is not just EFRAME_VERIFY. An > >> attempt is made to read 168 bytes (SIZE(pt_regs) - 8 bytes) from a > >> position 128 bytes from the top of the stack. The stack in question is > >> followed by a not-present page and the read fails attempting to read > >> from the page following stacktop. > >> > >> Signed-off-by: David Mair <dmair@xxxxxxxx> > > > > Thanks for the patch. > > but it seems that it stops "bt -e" and "bt -E" options working, can we > > keep them? > > Yes, they seem to be broken anyway, I'll re-post an > x86_64_exception_frame() EFRAME_VERIFY fix later when I can combine it > with a bt -e/-E solution. > > I see the bt -e case fail because x86_64_eframe_search() walks up the > userspace stack in ulongs sized steps with the loop count based on the > kernel stack size in ulongs. x86_64_eframe_search() does this for each > loop iteration: > > if (x86_64_exception_frame(EFRAME_SEARCH|EFRAME_PRINT|EFRAME_VERIFY, 0, > (char *)up, bt, fp)) > > The multiple flags mean crash master doesn't do the requested > EFRAME_VERIFY (per my previous post). If the EFRAME_VERIFY is performed > then the second argument kvaddr being the zero constant would fail at > the beginning of x86_64_exception_frame() because the supplied kvaddr is > constant zero: > > if (flags & EFRAME_VERIFY) { > if (!accessible(kvaddr) || > !accessible(kvaddr + SIZE(pt_regs) - sizeof(long))) > return FALSE; > > basically, the supplied kvaddr argument ensures EFRAME_VERIFY will fail > before even trying an EFRAME_SEARCH and an EFRAME_PRINT will never be > reached. For every step of the for loop through the userspace stack. > > The EFRAME_VERIFY block in x86_64_exception_frame() does not verify the > local argument. In current form, if EFRAME_VERIFY is actually performed, > you must supply a kvaddr to find/print an exception on the userspace > stack. Plus, the size assumed for the userspace stack by > x86_64_eframe_search() is the computed size of the kernel stack. Thank you for the description, I see. > > > Also, I could not apply the patch as is because of a few format issues. > > I think your MTA or something replaced tabs with spaces and inserted a > > new line between the "ulong kvaddr," and "char *local,". We can fix > > this when applying but if you update the patch for the above comment, > > please fix it as well (if possible). It would be helpful. > > I'm sorry, a clipboard copy from git diff output in a console and paste > into an e-mail, my error. I'll re-post it correctly when I have a bt > -e/-E fix. Thanks. Kazu -- Crash-utility mailing list Crash-utility@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/crash-utility