On Wed, Feb 21, 2024 at 02:08:45PM +0800, Tao Liu wrote:
> Hi Aditya,
>
> On Tue, Feb 20, 2024 at 2:30 PM Aditya Gupta <adityag@xxxxxxxxxxxxx> wrote:
> >
> > Hi Tao Liu,
> >
> > > >
> > <...snip...>
> > >
> > > > For now it seems to work fine on ppc64, I will go through the code
> > > > changes once. By the way, can you post a rebased patch series for x86 ?
> > >
> > > I noticed there is some problem with the ppc64 change thread context
> > > support. Currently for x86_64, we use "set <pid>" cmd to switch the
> > > current task context to be the task <pid>'s, then we use cmd "gdb bt"
> > > to view the stack unwinding of task <pid>, by which we can view
> > > arbitrary task's stack unwinding.
> > >
> > > However for ppc64 code, there is no "get arbitrary tasks register"
> > > code. E.g. in ppc64_get_cpu_reg():
> > >
> > > task = get_active_task(cpu);
> > > tc = task_to_context(task);
> > >
> > > So no matter how we set any pid, the registers are always the active
> > > tasks' register, instead of <pid> tasks' register.
> >
> > Hmm. True, I will check that, true, gdb as of now will get active tasks
> > registers only, as this feature was meant for that, since getting gdb to
> > assist for the crashing process was the priority.
> >
> > Setting context to arbitrary pid ('set <pid>') and backtrace of any
> > arbitrary pid works on ppc64, so there should be a way to get those
> > registers, I will go through the code once to check this.
> >
> > But that arbitrary backtrace will be a very good plus.
> >
> I made some code change[1], and currently it seems to work for stack
> unwinding for arbitrary tasks.
>
> The usage is as follows:
> crash> set <pid>
> or
> crash> set <task>
> crash> gdb bt
> which will view the specific task stack unwinding.

Thanks, I tested it, works okay. Actually on ppc64, for any task, e.g.
systemd, switching to it shows the swapper process's backtrace even now.
As of now, I applied your patch above my series, thanks for the fixes for
nip/sp cases, and freeing up the buffer even when returning early.

    crash> ps
       PID   PPID  CPU        TASK        ST  %MEM     VSZ     RSS  COMM
    ...
         1      0   35  c000000003a54b80  IN   0.0  176768   18432  systemd
    ...
    crash> set 1
        PID: 0
    COMMAND: "swapper/35"
       TASK: c000000003a80000  (1 of 64)  [THREAD_INFO: c000000003a80000]
        CPU: 35
      STATE: TASK_RUNNING (ACTIVE)
    crash> set c000000003a54b80
        PID: 0
    COMMAND: "swapper/35"
       TASK: c000000003a80000  (1 of 64)  [THREAD_INFO: c000000003a80000]
        CPU: 35
      STATE: TASK_RUNNING (ACTIVE)
    crash> bt
    PID: 0  TASK: c000000003a80000  CPU: 35  COMMAND: "swapper/35"
     R0: 0000000000000000  R1: c000000003be3d30  R2: c000000001782500
    ...
     #0 [c000000003be3d30] _end at c000000003be3d70  (unreliable)
     #1 [c000000003be3d90] dedicated_cede_loop at c000000000edfb50
     #2 [c000000003be3de0] cpuidle_enter_state at c000000000edf244
     #3 [c000000003be3e80] cpuidle_enter at c000000000b61ad0
     #4 [c000000003be3ec0] call_cpuidle at c0000000001c5a84
     #5 [c000000003be3ee0] do_idle at c0000000001cd9d0
     #6 [c000000003be3f60] cpu_startup_entry at c0000000001cdd04
     #7 [c000000003be3f90] start_secondary at c000000000064a34
     #8 [c000000003be3fe0] start_secondary_prolog at c00000000000d158

Thanks,
Aditya Gupta

> > > >
> > > It would be simple to get this fixed, however the more challenging
> > > part, which is difficult for me, is how to get the inactive tasks
> > > registers, So we can do similar things as in
> > > x86_64.c:x86_64_get_stack_frame() of
> > > https://github.com/liutgnu/crash-dev?
> > >
> > > I see there is a structure as task_struct -> thread_struct -> struct
> > > pt_regs *regs. But sometimes the regs will give NULL value.
> >
> > True, the registers are NULL at least in case of the swapper processes.
>
> It seems to me that, in order to support stack unwinding, we don't
> need to get all registers, just nip & r1 (pc reg & stack pointer reg)
> are enough for gdb unwinding.
> Anyway, could you please give the trial
> patch a try, and see if it can work for you?
>
> Thanks,
> Tao Liu
>
> [1]: https://github.com/liutgnu/crash-dev/commit/ef1cec3400bd7619ed9d3c9c4b50a0a613f95b55
>
> >
> > I will update on my findings.
> >
> > Thanks,
> > Aditya Gupta
> >
> > >
> > > Thanks,
> > > Tao Liu
> > >
> > > >
> > > > Thanks,
> > > > Aditya Gupta
> > > >
> > > > >
> > > > > Thanks,
> > > > > Tao Liu
> > > > >
> > > > > >
> > > > > > Thanks,
> > > > > > Kazu
> > > > > >
> > > > > > On 2024/01/05 16:30, Aditya Gupta wrote:
> > > > > > > The Problem:
> > > > > > > ============
> > > > > > >
> > > > > > > Currently crash is unable to show function arguments and local variables, as
> > > > > > > gdb can do. And functionality for moving between frames ('up'/'down') is not
> > > > > > > working in crash.
> > > > > > >
> > > > > > > Crash has 'gdb passthroughs' for things gdb can do, but the gdb passthroughs
> > > > > > > 'bt', 'frame', 'info locals', 'up', 'down' are not working either, due to
> > > > > > > gdb not getting the register values from `crash_target::fetch_registers`,
> > > > > > > which then uses `machdep->get_cpu_reg`, which is not implemented for PPC64
> > > > > > >
> > > > > > > Proposed Solution:
> > > > > > > ==================
> > > > > > >
> > > > > > > Fix the gdb passthroughs by implementing "machdep->get_cpu_reg" for PPC64.
> > > > > > > This way, "gdb mode in crash" will support this feature for both ELF and
> > > > > > > kdump-compressed vmcore formats, while "gdb" would only have supported ELF
> > > > > > > format
> > > > > > >
> > > > > > > This way other features of 'gdb', such as seeing
> > > > > > > backtraces/registers/variables/arguments/local variables, moving up and
> > > > > > > down stack frames, can be used with any ppc64 vmcore, irrespective of
> > > > > > > being ELF format or kdump-compressed format.
> > > > > > >
> > > > > > > Note: This doesn't support live debugging on ppc64, since registers are not
> > > > > > > available to be read
> > > > > > >
> > > > > > > Implications on Architectures:
> > > > > > > ====================================
> > > > > > >
> > > > > > > No architecture other than PPC64 has been affected, other than in case of
> > > > > > > 'frame' command
> > > > > > >
> > > > > > > As mentioned in patch #2, since frame will not be prohibited, so it will print:
> > > > > > >
> > > > > > > crash> frame
> > > > > > > #0 <unavailable> in ?? ()
> > > > > > >
> > > > > > > Instead of before prohibited message:
> > > > > > >
> > > > > > > crash> frame
> > > > > > > crash: prohibited gdb command: frame
> > > > > > >
> > > > > > > Major change will be in 'gdb mode' on PPC64, that it will print the frames, and
> > > > > > > local variables, instead of failing with errors showing no frame, or showing
> > > > > > > that couldn't get PC, it will be able to give all this information.
> > > > > > >
> > > > > > > Testing:
> > > > > > > ========
> > > > > > >
> > > > > > > Git tree with this patch series applied:
> > > > > > > https://github.com/adi-g15-ibm/crash/tree/stack-unwind-v6
> > > > > > >
> > > > > > > To test various gdb passthroughs:
> > > > > > >
> > > > > > > (crash) set
> > > > > > > (crash) set gdb on
> > > > > > > gdb> thread
> > > > > > > gdb> bt
> > > > > > > gdb> info threads
> > > > > > > gdb> info threads
> > > > > > > gdb> info locals
> > > > > > > gdb> info variables irq_rover_lock
> > > > > > > gdb> info args
> > > > > > > gdb> thread 2
> > > > > > > gdb> set gdb off
> > > > > > > (crash) set
> > > > > > > (crash) set -c 6
> > > > > > > (crash) gdb thread
> > > > > > > (crash) bt
> > > > > > > (crash) gdb bt
> > > > > > > (crash) frame
> > > > > > > (crash) up
> > > > > > > (crash) down
> > > > > > > (crash) info locals
> > > > > > >
> > > > > > > Known Issues:
> > > > > > > =============
> > > > > > >
> > > > > > > 1. In gdb mode, 'bt' might fail to show backtrace in few vmcores collected
> > > > > > > from older kernels. This is a known issue due to register mismatch, and
> > > > > > > its fix has been merged upstream:
> > > > > > >
> > > > > > > This can also cause some 'invalid kernel virtual address' errors during gdb
> > > > > > > unwinding the stack registers
> > > > > > >
> > > > > > > Commit: https://github.com/torvalds/linux/commit/b684c09f09e7a6af3794d4233ef785819e72db79
> > > > > > >
> > > > > > > Fixing GDB passthroughs on other architectures
> > > > > > > ==============================================
> > > > > > >
> > > > > > > Much of the work for making gdb passthroughs like 'gdb bt', 'gdb
> > > > > > > thread', 'gdb info locals' etc. has been done by the patches introducing
> > > > > > > 'machdep->get_cpu_reg' and this series fixing some issues in that.
> > > > > > >
> > > > > > > Other architectures should be able to fix these gdb functionalities by
> > > > > > > simply implementing 'machdep->get_cpu_reg (cpu, regno, ...)'.
> > > > > > >
> > > > > > > The reasoning behind that has been explained with a diagram in commit
> > > > > > > description of patch #1
> > > > > > >
> > > > > > > I will assist with my findings/observations fixing it on ppc64 whenever needed.
> > > > > > >
> > > > > > > Changelog:
> > > > > > > ==========
> > > > > > >
> > > > > > > V6:
> > > > > > > + changes in patch #5: fix bug introduced in v5 that caused initial gdb thread
> > > > > > >   to be thread 1
> > > > > > >
> > > > > > > V5:
> > > > > > > + changes in patch #1: made ppc64_get_cpu_reg static, and remove unreachable
> > > > > > >   code
> > > > > > > + changes in patch #3: fixed typo 'ppc64_renum' instead of 'ppc64_regnum',
> > > > > > >   remove unneeded if condition
> > > > > > > + changes in patch #5: implement refresh regcache on per thread, instead of all
> > > > > > >   threads at once
> > > > > > >
> > > > > > > V4:
> > > > > > > + fix segmentation fault in live debugging (change in patch #1)
> > > > > > > + mention live debugging not supported in cover letter and patch #1
> > > > > > > + fixed some checkpatch warnings (change in patch #5)
> > > > > > >
> > > > > > > V3:
> > > > > > > + default gdb thread will be the crashing thread, instead of being
> > > > > > >   thread '0'
> > > > > > > + synchronise crash cpu and gdb thread context
> > > > > > > + fix bug in gdb_interface, that replaced gdb's output stream, losing
> > > > > > >   output in some cases, such as info threads and extra output in info
> > > > > > >   variables
> > > > > > > + fix 'info threads'
> > > > > > >
> > > > > > > RFC V2:
> > > > > > > - removed patch implementing 'frame', 'up', 'down' in crash
> > > > > > > - updated the cover letter by removing the mention of those commands other
> > > > > > >   than the respective gdb passthrough
> > > > > > >
> > > > > > > Aditya Gupta (5):
> > > > > > >   ppc64: correct gdb passthroughs by implementing machdep->get_cpu_reg
> > > > > > >   remove 'frame' from prohibited commands list
> > > > > > >   synchronise cpu context changes between crash/gdb
> > > > > > >   fix gdb_interface: restore gdb's output streams at end of
> > > > > > >     gdb_interface
> > > > > > >   fix 'info threads' command
> > > > > > >
> > > > > > >  crash_target.c  |  44 ++++++++++++++++
> > > > > > >  defs.h          | 130 +++++++++++++++++++++++++++++++++++++++++++++++-
> > > > > > >  gdb-10.2.patch  | 110 +++++++++++++++++++++++++++++++++++++++-
> > > > > > >  gdb_interface.c |   2 +-
> > > > > > >  kernel.c        |  47 +++++++++++++++--
> > > > > > >  ppc64.c         |  95 +++++++++++++++++++++++++++++++++--
> > > > > > >  task.c          |  14 ++++++
> > > > > > >  tools.c         |   2 +-
> > > > > > >  8 files changed, 434 insertions(+), 10 deletions(-)
> > > > > > >

--
Crash-utility mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxxxxxx
https://${domain_name}/admin/lists/devel.lists.crash-utility.osci.io/
Contribution Guidelines: https://github.com/crash-utility/crash/wiki
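
For reference, a minimal sketch of the point raised in this thread: the
register callback should answer for the task selected with 'set <pid>',
not for whichever task is active on the CPU, and nip and r1 are the two
registers the thread says gdb needs for unwinding. Everything named
sketch_* below (the struct, the enum, the selection helper and the
callback) is hypothetical. It is not crash's machdep->get_cpu_reg
signature and it is not the code in Tao's commit. The rule that the
selected task must sit on the queried CPU is also only an assumption of
this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical register ids; these are NOT gdb's or crash's numbering. */
enum sketch_regno { SKETCH_REG_NIP, SKETCH_REG_R1 };

/* Hypothetical per-task record holding only what the unwinder needs. */
struct sketch_task {
    int cpu;          /* CPU the task last ran on */
    int has_regs;     /* 0 when no saved registers were found */
    uint64_t nip;     /* program counter */
    uint64_t r1;      /* stack pointer */
};

/* Context chosen by the user, as 'set <pid>' or 'set <task>' would do. */
static const struct sketch_task *sketch_current;

void sketch_set_context(const struct sketch_task *t)
{
    sketch_current = t;
}

/*
 * Copy one register of the *selected* task into value.
 * Returns 1 on success and 0 when the register is unavailable.
 */
int sketch_get_cpu_reg(int cpu, int regno, void *value, size_t size)
{
    const struct sketch_task *t = sketch_current;
    uint64_t reg;

    assert(value != NULL);

    /* Assumption of this sketch: the selected task must match the CPU. */
    if (t == NULL || t->cpu != cpu || !t->has_regs)
        return 0;
    if (size != sizeof(reg))
        return 0;

    switch (regno) {
    case SKETCH_REG_NIP:
        reg = t->nip;
        break;
    case SKETCH_REG_R1:
        reg = t->r1;
        break;
    default:
        return 0;   /* other registers are not needed for unwinding here */
    }

    memcpy(value, &reg, sizeof(reg));
    return 1;
}
```

With a task selected through sketch_set_context(), a call such as
sketch_get_cpu_reg(cpu, SKETCH_REG_NIP, &val, sizeof(val)) fills val from
that task's saved nip. A task without saved registers (the thread notes
that pt_regs can be NULL, at least for the swapper tasks) makes the
callback return 0, so the caller can treat the register as unavailable.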