[Crash-utility] arm64: Fix bt command show wrong stacktrace on ramdump source


Hi all,
I may have found a potential bug when using a qcom arm64 ramdump to parse backtraces.
In fact, I found that the bt command does not produce a correct backtrace for any process.

For example, when starting the crash tool for analysis: # crash vmlinux --kaslr=xxx DDRCS0_0.BIN@0x0000000080000000,... --machdep vabits_actual=39

The following misleading backtrace information is then shown:

crash> bt 16930

PID: 16930    TASK: ffffff89b3eada00  CPU: 2    COMMAND: "Firebase Backgr"

 #0 [ffffffc034c437f0] __switch_to at ffffffe0036832d4

 #1 [ffffffc034c43850] __kvm_nvhe_$d.2314 at 6be732e004cf05a0

 #2 [ffffffc034c438b0] __kvm_nvhe_$d.2314 at 86c54c6004ceff80

 #3 [ffffffc034c43950] __kvm_nvhe_$d.2314 at 55d6f96003a7b120

 #4 [ffffffc034c439f0] __kvm_nvhe_$d.2314 at 9ccec46003a80a64

 #5 [ffffffc034c43ac0] __kvm_nvhe_$d.2314 at 8cf41e6003a945c4

 #6 [ffffffc034c43b10] __kvm_nvhe_$d.2314 at a8f181e00372c818

 #7 [ffffffc034c43b40] __kvm_nvhe_$d.2314 at 6dedde600372c0d0

 #8 [ffffffc034c43b90] __kvm_nvhe_$d.2314 at 62cc07e00373d0ac

 #9 [ffffffc034c43c00] __kvm_nvhe_$d.2314 at 72fb1de00373bedc

...

     PC: 00000073f5294840   LR: 00000070d8f39ba4   SP: 00000070d4afd5d0

    X29: 00000070d4afd600  X28: b4000071efcda7f0  X27: 00000070d4afe000

    X26: 0000000000000000  X25: 00000070d9616000  X24: 0000000000000000

    X23: 0000000000000000  X22: 0000000000000000  X21: 0000000000000000

    X20: b40000728fd27520  X19: b40000728fd27550  X18: 000000702daba000

    X17: 00000073f5294820  X16: 00000070d940f9d8  X15: 00000000000000bf

    X14: 0000000000000000  X13: 00000070d8ad2fac  X12: b40000718fce5040

    X11: 0000000000000000  X10: 0000000000000070   X9: 0000000000000001

     X8: 0000000000000062   X7: 0000000000000020   X6: 0000000000000000

     X5: 0000000000000000   X4: 0000000000000000   X3: 0000000000000000

     X2: 0000000000000002   X1: 0000000000000080   X0: b40000728fd27550

    ORIG_X0: b40000728fd27550  SYSCALLNO: ffffffff  PSTATE: 40001000


Checking the raw stack data below shows that each saved lr (at fp+8) holds a pointer whose upper bits have already been replaced by a PAC (pointer authentication code).


crash> bt -f

PID: 16930    TASK: ffffff89b3eada00  CPU: 2    COMMAND: "Firebase Backgr"

 #0 [ffffffc034c437f0] __switch_to at ffffffe0036832d4

    ffffffc034c437f0: ffffffc034c43850 6be732e004cf05a4

    ffffffc034c43800: ffffffe006186108 a0ed07e004cf09c4

    ffffffc034c43810: ffffff8a1a340000 ffffff8a8d343c00

    ffffffc034c43820: ffffff89b3eada00 ffffff8b780db540

    ffffffc034c43830: ffffff89b3eada00 0000000000000000

    ffffffc034c43840: 0000000000000004 712b828118484a00

 #1 [ffffffc034c43850] __kvm_nvhe_$d.2314 at 6be732e004cf05a0

    ffffffc034c43850: ffffffc034c438b0 86c54c6004ceff84

    ffffffc034c43860: 000000708070f000 ffffffc034c43938

    ffffffc034c43870: ffffff88bd822878 ffffff89b3eada00

...
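
One way to spot the signed return addresses in a dump like the one above is to test whether the upper bits of a saved word are all ones, as they are for the kernel stack addresses in the same dump. The following is a minimal sketch, assuming vabits_actual=39 as given on the command line; the helper name `is_canonical_kernel_addr` is hypothetical and is not part of crash.

```python
def is_canonical_kernel_addr(ptr: int, vabits: int) -> bool:
    """Return True if bits 63..vabits of ptr are all ones."""
    upper = ((1 << (64 - vabits)) - 1) << vabits
    return (ptr & upper) == upper


# Values copied from the `bt -f` output above.
saved_fp = 0xffffffc034c43850  # word at ffffffc034c437f0
saved_lr = 0x6be732e004cf05a4  # word at ffffffc034c437f8 (fp+8)

print(is_canonical_kernel_addr(saved_fp, 39))  # True: plain kernel address
print(is_canonical_kernel_addr(saved_lr, 39))  # False: upper bits hold a PAC
```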


So we check CONFIG_ARM64_PTR_AUTH and CONFIG_ARM64_PTR_AUTH_KERNEL to confirm whether the PAC mechanism is enabled on this ramdump.

If it is, we use vabits to work out which pointer bits carry the PAC.
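
The check described here can be sketched roughly as follows. This is only an illustration under stated assumptions: the `config` dict and the function `kernel_pac_mask` are hypothetical stand-ins for however the kernel configuration is read from the dump, and `genmask_ull` is a Python analogue of the kernel's GENMASK_ULL macro.

```python
def genmask_ull(high: int, low: int) -> int:
    """Python analogue of GENMASK_ULL(high, low): bits high..low set."""
    return ((1 << (high - low + 1)) - 1) << low


def kernel_pac_mask(config: dict, vabits: int) -> int:
    """Return the mask of pointer bits above vabits, or 0 if PAC is off.

    `config` is a hypothetical dict of kernel config options.
    """
    enabled = (config.get("CONFIG_ARM64_PTR_AUTH") == "y"
               and config.get("CONFIG_ARM64_PTR_AUTH_KERNEL") == "y")
    return genmask_ull(63, vabits) if enabled else 0


config = {"CONFIG_ARM64_PTR_AUTH": "y", "CONFIG_ARM64_PTR_AUTH_KERNEL": "y"}
print(hex(kernel_pac_mask(config, 39)))  # 0xffffff8000000000
print(hex(kernel_pac_mask({}, 39)))      # 0x0
```

With vabits_actual=39, every bit from 39 upward falls inside the mask, which is the region where the signed lr values in the raw dump differ from ordinary kernel text addresses.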

With the fix, the correct backtrace is shown:

crash> bt 16930

PID: 16930    TASK: ffffff89b3eada00  CPU: 2    COMMAND: "Firebase Backgr"

 #0 [ffffffc034c437f0] __switch_to at ffffffe0036832d4

 #1 [ffffffc034c43850] __schedule at ffffffe004cf05a0

 #2 [ffffffc034c438b0] preempt_schedule_common at ffffffe004ceff80

 #3 [ffffffc034c43950] unmap_page_range at ffffffe003a7b120

 #4 [ffffffc034c439f0] unmap_vmas at ffffffe003a80a64

 #5 [ffffffc034c43ac0] exit_mmap at ffffffe003a945c4

 #6 [ffffffc034c43b10] __mmput at ffffffe00372c818

 #7 [ffffffc034c43b40] mmput at ffffffe00372c0d0

 #8 [ffffffc034c43b90] exit_mm at ffffffe00373d0ac

 #9 [ffffffc034c43c00] do_exit at ffffffe00373bedc

     PC: 00000073f5294840   LR: 00000070d8f39ba4   SP: 00000070d4afd5d0

    X29: 00000070d4afd600  X28: b4000071efcda7f0  X27: 00000070d4afe000

    X26: 0000000000000000  X25: 00000070d9616000  X24: 0000000000000000

    X23: 0000000000000000  X22: 0000000000000000  X21: 0000000000000000

    X20: b40000728fd27520  X19: b40000728fd27550  X18: 000000702daba000

    X17: 00000073f5294820  X16: 00000070d940f9d8  X15: 00000000000000bf

    X14: 0000000000000000  X13: 00000070d8ad2fac  X12: b40000718fce5040

    X11: 0000000000000000  X10: 0000000000000070   X9: 0000000000000001

     X8: 0000000000000062   X7: 0000000000000020   X6: 0000000000000000

     X5: 0000000000000000   X4: 0000000000000000   X3: 0000000000000000

     X2: 0000000000000002   X1: 0000000000000080   X0: b40000728fd27550

    ORIG_X0: b40000728fd27550  SYSCALLNO: ffffffff  PSTATE: 40001000


The fix uses GENMASK to overwrite the PAC bits of the signed pointer.
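
To illustrate the effect, here is a small sketch that walks the first two frame records from the `bt -f` output and ORs each saved lr with GENMASK(63, 39). It is not the code from the attached patch: the `unwind` function and the `stack` dict are hypothetical, and the subtraction of 4 simply mirrors the fact that each address bt prints is 4 below the saved lr in the raw dump.

```python
PAC_MASK = ((1 << (64 - 39)) - 1) << 39  # GENMASK(63, 39) for vabits_actual=39

# Stack words copied from the `bt -f` output: address -> 64-bit value.
stack = {
    0xffffffc034c437f0: 0xffffffc034c43850,  # saved fp of frame #0
    0xffffffc034c437f8: 0x6be732e004cf05a4,  # saved lr (PAC-signed)
    0xffffffc034c43850: 0xffffffc034c438b0,  # saved fp of frame #1
    0xffffffc034c43858: 0x86c54c6004ceff84,  # saved lr (PAC-signed)
}


def unwind(fp, stack, pac_mask):
    """Follow fp/lr frame records, restoring the upper bits of each lr."""
    frames = []
    while fp in stack and fp + 8 in stack:
        next_fp = stack[fp]
        lr = stack[fp + 8] | pac_mask  # overwrite the PAC bits with ones
        frames.append((next_fp, lr - 4))
        fp = next_fp
    return frames


for frame_fp, pc in unwind(0xffffffc034c437f0, stack, PAC_MASK):
    print(f"[{frame_fp:016x}] {pc:016x}")
# [ffffffc034c43850] ffffffe004cf05a0
# [ffffffc034c438b0] ffffffe004ceff80
```

The two printed addresses match frames #1 (__schedule) and #2 (preempt_schedule_common) in the corrected backtrace.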

The related GKI commit is here:

https://lore.kernel.org/all/20230412160134.306148-4-mark.rutland@xxxxxxx/

Attachment: 0001-arm64-Fix-bt-command-show-wrong-stacktrace-on-ramdum.patch
Description: Binary data

--
Crash-utility mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxxxxxx
https://${domain_name}/admin/lists/devel.lists.crash-utility.osci.io/
Contribution Guidelines: https://github.com/crash-utility/crash/wiki
