[Crash-utility] Re: [PATCH] gdb: fix the "p" command incorrectly print the value of a global variable

HAGIO KAZUHITO(萩尾 一仁) <k-hagio-ab@xxxxxxx> · Mon, 22 Apr 2024 05:26:21 +0000

On 2024/04/18 14:55, HAGIO KAZUHITO(萩尾 一仁) wrote:
> Hi Lianbo,
> 
> Thank you for the fix.
> 
> I'm afraid of merging this kind of patch changing the initial value in
> gdb before we will make the new release soon, but as far as I tested
> there was no issue found.  so let's merge this, but..
> 
> On 2024/03/06 21:32, lijiang wrote:
>> On Wed, Mar 6, 2024 at 6:30 PM Daisuke Hatayama (Fujitsu)
>> <d.hatayama@xxxxxxxxxxx <mailto:d.hatayama@xxxxxxxxxxx>> wrote:
>>
>>      Lianbo,
>>
>>      Thank you for your work.
>>
>>       > Some objects format may potentially support copy relocations, but
>>       > currently the maybe_copied is always initialized to 0 in the
>>      symbol().
>>       > And the type is 'mst_file_bss', not always the 'mst_bss' or
>>      'mst_data'
>>       > in the lookup_minimal_symbol_linkage(). For example:
>>       >
>>       > (gdb) p *msymbol
>>       > $42 = {<general_symbol_info> = {m_name = 0x349812f
>>      "test_no_static", value = {ivalue = 8, block = 0x8,
>>       >       bytes = 0x8 <error: Cannot access memory at address 0x8>,
>>      address = 8, common_block = 0x8, chain = 0x8}, language_specific = {
>>       >       obstack = 0x0, demangled_name = 0x0}, m_language =
>>      language_auto, ada_mangled = 0, section = 20}, size = 4,
>>       >   filename = 0x6db3440 "test_sanity.c", type = mst_file_bss,
>>      created_by_gdb = 0, target_flag_1 = 0, target_flag_2 = 0, has_size = 1,
>>       >   maybe_copied = 0, name_set = 1, hash_next = 0x0,
>>      demangled_hash_next = 0x0}
>>
>>      The current description lacks explanation of when this issue
>>      occurs. Please write that the issue occurs when the corresponding
>>      kernel is built with CONFIG_CALL_DEPTH_TRACKING=y.
>>
>>
>> Thank you for the comment, Hatayama.
>>
>> I should describe more background on this issue in the patch log. The
>> current issue can be easily reproduced with the following kernel commit:
>>
>> commit 80e4c1cd42fff110bfdae8fce7ac4f22465f9664 (HEAD)
>> Author: Thomas Gleixner <tglx@xxxxxxxxxxxxx <mailto:tglx@xxxxxxxxxxxxx>>
>> Date:   Thu Sep 15 13:11:19 2022 +0200
>>
>>       x86/retbleed: Add X86_FEATURE_CALL_DEPTH
>>
>>       Intel SKL CPUs fall back to other predictors when the RSB
>> underflows. The
>>       only microcode mitigation is IBRS which is insanely expensive. It comes
>>       with performance drops of up to 30% depending on the workload.
>>
>>       A way less expensive, but nevertheless horrible mitigation is to
>> track the
>>       call depth in software and overeagerly fill the RSB when returns
>> underflow
>>       the software counter.
>>
>>       Provide a configuration symbol and a CPU misfeature bit.
>>
>>       Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx
>> <mailto:tglx@xxxxxxxxxxxxx>>
>>       Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx
>> <mailto:peterz@xxxxxxxxxxxxx>>
>>       Link:
>> https://lore.kernel.org/r/20220915111147.056176424@xxxxxxxxxxxxx
>> <https://lore.kernel.org/r/20220915111147.056176424@xxxxxxxxxxxxx>
>>
>> After reverting the above commit, the current issue may disappear. And
>> originally I tried to find the clue how this kernel commit changes
>> affected the gdb, I have not found the clue for the time being. But
>> later I noticed that the gdb gets the correct offset address of a global
>> variable 'test_no_static', which is an expected behavior from the gdb
>> perspective because of copy relocations, probably some object files
>> potentially support the copy relocations, just like this.
>>
>>      It would also be good to describe the fact that the issue occurs at
>>      least on RHEL9 kernel.
>>
>>
>> This is an upstream issue, I have reproduced it on the upstream kernel
>> with the above kernel commit changes.
> 
> The information that the issue occurs on RHEL9 kernel is helpful to RHEL
> users.  Why don't you add something like this?
> ---
> The issue occurs with Linux 6.2 and later or kernels that have kernel
> commit 80e4c1cd42ff ("x86/retbleed: Add X86_FEATURE_CALL_DEPTH") and
> configured with CONFIG_CALL_DEPTH_TRACKING=y, including RHEL9.3 and
> later kernels.
> ---
> 
>>
>>
>>       > This causes a problem that the 'p' command can not work well as
>>       > expected, and always gets an error:
>>       >
>>       >   crash> mod -s test_sanity /home/test_sanity.ko
>>       >        MODULE       NAME                         BASE
>>        SIZE  OBJECT FILE
>>       >   ffffffffc1084040  test_sanity            ffffffffc1082000
>>      16384  /home/test_sanity.ko
>>       >   crash> p test_no_static
>>       >   p: gdb request failed: p test_no_static
>>       >   crash>
>>       >
>>       > With the patch:
>>       >   crash> mod -s test_sanity /home/test_sanity.ko
>>       >        MODULE       NAME                         BASE
>>        SIZE  OBJECT FILE
>>       >   ffffffffc1084040  test_sanity            ffffffffc1082000
>>      16384  /home/test_sanity.ko
>>       >   crash> p test_no_static
>>       >   test_no_static = $1 = 5
>>       >   crash>
>>
>>      It's correct that p command doesn't work as expected, but it doesn't
>>      always result in some error. This issue is failure of calculating
>>      relocated address of static symbols. If the calculated address happens
>>      to be the address where read can be successfull, it doesn't result in
>>      read error but outputs some bogus value.
>>
>>
>> It's true, but the bogus value is not an expected result because of an
>> incorrect address.
> 
> I guess he means that the phrase only "always gets an error" is not
> true, because it can emit a bogus value.  so isn't "emits an error or a
> bogus value" ok in this case?
> 
> 
> I've rebased the patch and modified the commit log as my comments above
> and attached.  Lianbo, is this OK?

as talked internally, Lianbo had no further comments, so applied it.

https://github.com/crash-utility/crash/commit/eedf12d4758409c3c405f56edf3177a3955e1f67

Thanks,
Kazu
--
Crash-utility mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxxxxxx
https://${domain_name}/admin/lists/devel.lists.crash-utility.osci.io/
Contribution Guidelines: https://github.com/crash-utility/crash/wiki