On 2024/04/18 14:55, HAGIO KAZUHITO(萩尾 一仁) wrote: > Hi Lianbo, > > Thank you for the fix. > > I'm afraid of merging this kind of patch changing the initial value in > gdb before we will make the new release soon, but as far as I tested > there was no issue found. so let's merge this, but.. > > On 2024/03/06 21:32, lijiang wrote: >> On Wed, Mar 6, 2024 at 6:30 PM Daisuke Hatayama (Fujitsu) >> <d.hatayama@xxxxxxxxxxx <mailto:d.hatayama@xxxxxxxxxxx>> wrote: >> >> Lianbo, >> >> Thank you for your work. >> >> > Some objects format may potentially support copy relocations, but >> > currently the maybe_copied is always initialized to 0 in the >> symbol(). >> > And the type is 'mst_file_bss', not always the 'mst_bss' or >> 'mst_data' >> > in the lookup_minimal_symbol_linkage(). For example: >> > >> > (gdb) p *msymbol >> > $42 = {<general_symbol_info> = {m_name = 0x349812f >> "test_no_static", value = {ivalue = 8, block = 0x8, >> > bytes = 0x8 <error: Cannot access memory at address 0x8>, >> address = 8, common_block = 0x8, chain = 0x8}, language_specific = { >> > obstack = 0x0, demangled_name = 0x0}, m_language = >> language_auto, ada_mangled = 0, section = 20}, size = 4, >> > filename = 0x6db3440 "test_sanity.c", type = mst_file_bss, >> created_by_gdb = 0, target_flag_1 = 0, target_flag_2 = 0, has_size = 1, >> > maybe_copied = 0, name_set = 1, hash_next = 0x0, >> demangled_hash_next = 0x0} >> >> The current description lacks explanation of when this issue >> occurs. Please write that the issue occurs when the corresponding >> kernel is built with CONFIG_CALL_DEPTH_TRACKING=y. >> >> >> Thank you for the comment, Hatayama. >> >> I should describe more background on this issue in the patch log. The >> current issue can be easily reproduced with the following kernel commit: >> >> commit 80e4c1cd42fff110bfdae8fce7ac4f22465f9664 (HEAD) >> Author: Thomas Gleixner <tglx@xxxxxxxxxxxxx <mailto:tglx@xxxxxxxxxxxxx>> >> Date: Thu Sep 15 13:11:19 2022 +0200 >> >> x86/retbleed: Add X86_FEATURE_CALL_DEPTH >> >> Intel SKL CPUs fall back to other predictors when the RSB >> underflows. The >> only microcode mitigation is IBRS which is insanely expensive. It comes >> with performance drops of up to 30% depending on the workload. >> >> A way less expensive, but nevertheless horrible mitigation is to >> track the >> call depth in software and overeagerly fill the RSB when returns >> underflow >> the software counter. >> >> Provide a configuration symbol and a CPU misfeature bit. >> >> Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx >> <mailto:tglx@xxxxxxxxxxxxx>> >> Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx >> <mailto:peterz@xxxxxxxxxxxxx>> >> Link: >> https://lore.kernel.org/r/20220915111147.056176424@xxxxxxxxxxxxx >> <https://lore.kernel.org/r/20220915111147.056176424@xxxxxxxxxxxxx> >> >> After reverting the above commit, the current issue may disappear. And >> originally I tried to find the clue how this kernel commit changes >> affected the gdb, I have not found the clue for the time being. But >> later I noticed that the gdb gets the correct offset address of a global >> variable 'test_no_static', which is an expected behavior from the gdb >> perspective because of copy relocations, probably some object files >> potentially support the copy relocations, just like this. >> >> It would also be good to describe the fact that the issue occurs at >> least on RHEL9 kernel. >> >> >> This is an upstream issue, I have reproduced it on the upstream kernel >> with the above kernel commit changes. > > The information that the issue occurs on RHEL9 kernel is helpful to RHEL > users. Why don't you add something like this? > --- > The issue occurs with Linux 6.2 and later or kernels that have kernel > commit 80e4c1cd42ff ("x86/retbleed: Add X86_FEATURE_CALL_DEPTH") and > configured with CONFIG_CALL_DEPTH_TRACKING=y, including RHEL9.3 and > later kernels. > --- > >> >> >> > This causes a problem that the 'p' command can not work well as >> > expected, and always gets an error: >> > >> > crash> mod -s test_sanity /home/test_sanity.ko >> > MODULE NAME BASE >> SIZE OBJECT FILE >> > ffffffffc1084040 test_sanity ffffffffc1082000 >> 16384 /home/test_sanity.ko >> > crash> p test_no_static >> > p: gdb request failed: p test_no_static >> > crash> >> > >> > With the patch: >> > crash> mod -s test_sanity /home/test_sanity.ko >> > MODULE NAME BASE >> SIZE OBJECT FILE >> > ffffffffc1084040 test_sanity ffffffffc1082000 >> 16384 /home/test_sanity.ko >> > crash> p test_no_static >> > test_no_static = $1 = 5 >> > crash> >> >> It's correct that p command doesn't work as expected, but it doesn't >> always result in some error. This issue is failure of calculating >> relocated address of static symbols. If the calculated address happens >> to be the address where read can be successfull, it doesn't result in >> read error but outputs some bogus value. >> >> >> It's true, but the bogus value is not an expected result because of an >> incorrect address. > > I guess he means that the phrase only "always gets an error" is not > true, because it can emit a bogus value. so isn't "emits an error or a > bogus value" ok in this case? > > > I've rebased the patch and modified the commit log as my comments above > and attached. Lianbo, is this OK? as talked internally, Lianbo had no further comments, so applied it. https://github.com/crash-utility/crash/commit/eedf12d4758409c3c405f56edf3177a3955e1f67 Thanks, Kazu -- Crash-utility mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxxxxxx To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxxxxxx https://${domain_name}/admin/lists/devel.lists.crash-utility.osci.io/ Contribution Guidelines: https://github.com/crash-utility/crash/wiki