Re: Are all linux_banner type cases covered in verify_version()?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 8/20/23 19:37, HAGIO KAZUHITO(萩尾 一仁) wrote:
On 2023/08/18 3:08, David Mair wrote:
Hi All,

Before I consider starting work on a patch for this I'd appreciate more
input.

I am seeing random cases of crash failing to load reporting an x86_64
coredump reporting a "bad" linux_banner. However, the value displayed as
the banner is:

0x65762078756e694c

which is plainly ASCII text as a 64-bit number and is the little-endian
reversal of the text "Linux ver".

It's randomly found with specific coredumps and reproduces all times
with that coredump and a given version of crash, though sometimes it
will appear when using a given coredump with one version of crash but
not with another version of crash. What I'm trying to get working is
crash current and the rest of this is experience using crash 8.0.3 only.

I used gdb crash to debug it through verify_version(). If I breakpoint
there with gdb crash and step through the function I find that in the
section:


      if (!(sp = symbol_search("linux_banner")))
          error(FATAL, "linux_banner symbol does not exist?\n");
      else if ((sp->type == 'R') || (sp->type == 'r') ||
          (THIS_KERNEL_VERSION >= LINUX(2,6,11) && (sp->type == 'D' ||
sp->type == 'd')) ||
           (machine_type("ARM") && sp->type == 'T') ||
           (machine_type("ARM64")))
          linux_banner = symbol_value("linux_banner");
      else
          get_symbol_data("linux_banner", sizeof(ulong), &linux_banner);


* The if block is not executed, i.e. symbol_search("linux_banner")
succeeded and we have a usable struct syment for "linux_banner" in sp
* The else if block is not executed, all conditions are met or not
relevant except for the the value of sp->type in the case of
THIS_KERNEL_VERSION >= LINUX(2,6,11). But sp->type is 'B', bss segment
* The final else block is executed, we copy sizeof(ulong) bytes of
symbol data from what "linux_banner" refers to into the crash internal
linux_banner variable

Here's how sp looks at the else if statement in the above code:

gdb) print *sp
$2 = {value = 18446744071587233984, name = 0x5555566a735b "linux_banner",
    val_hash_next = 0x7fffe51e4338, name_hash_next = 0x7fffe51f8d38,
    type = 66 'B', cnt = 1 '\001', flags = 0 '\000', pad2 = 0 '\000'}

...and sp->value in hex is:

(gdb) p/x sp->value
$5 = 0xffffffff818000c0

Starting crash in --minimal mode with the same core, kernel and
debuginfo so that I can try to read 0xffffffff818000c0 I find:

crash> rd 0xffffffff818000c0 18
ffffffff818000c0:  65762078756e694c 2e34206e6f697372   Linux version 4.
ffffffff818000d0:  34392d3038312e34 6665642d3533312e   4.180-94.135-def
ffffffff818000e0:  65672820746c7561 6c697562406f6b65   ault (geeko@buil
ffffffff818000f0:  28202974736f6864 7372657620636367   dhost) (gcc vers
ffffffff81800100:  2e382e34206e6f69 2045535553282035   ion 4.8.5 (SUSE
ffffffff81800110:  29202978756e694c 20504d5320312320   Linux) ) #1 SMP
ffffffff81800120:  20766f4e206e6f4d 35353a3930203631   Mon Nov 16 09:55
ffffffff81800130:  204354552037353a 3033282030323032   :57 UTC 2020 (30
ffffffff81800140:  000a293039393633 0000000000000000   36990)..........

IOW, it is the linux_banner, it's at 0xffffffff818000c0 and if the first
sizeof(ulong) bytes are read into crash's linux_banner variable via
get_symbol_data() it makes a 64-bit number with little-endian revesral
from the string bytes "Linux ver" from the actual linux_banner text, the
same value seen in the error report when crash fails.

If I repeat the above debug of verify_version() in gdb and at the else
if block in the above C in verify_version() I set var the sp->type to be
'D' or 'd' then crash's linux_banner is set to 0xffffffff818000c0
(sp->value) and the else if block is executed then the remainder of the
function successfully finds a linux_banner and gets the version
information from it and crash loads.

In symbol_search() is the value of type in the returned struct syment a
property generated by crash (I thought not and it was based on the
kernel compilation, possible kdummp/makedumpfile but not controlled by
crash)? If I'm correct then is it safe to expect a struct syment with
type 'B' (also 'b' I believe) for linux_banner? Or is there anything
special about a bss segment type that makes it not possible to assume we
can take sp->value as-is for the address of the linux_banner string
details, i.e. we can't safely use symbol_value("linux_banner") to set
the crash linux_banner variable if the struct syment type for
linux_banner is 'B' or 'b'?

Any comments? I'll write a patch if that's the right direction but I
need a better understanding of why symbols with the type bss segment
aren't already assumed valid sources of a linux_banner address value,
only 'D' and 'd' types are and if there is anything special I don't
understand about 'B' and 'b' types.


It seems that sp->type is set to a value from bfd_get_symbol_info() in
store_symbols().  So it's from the vmlinux through the embedded gdb.

Anyway, I don't think that the current check gets the point, i.e. we can
check the type of the linux_banner symbol directly like this?

    switch (get_symbol_type("linux_banner", NULL, NULL))
    {
        case TYPE_CODE_ARRAY:
            linux_banner = sp->value;
            break;
        case TYPE_CODE_PTR:
            get_symbol_data("linux_banner", sizeof(ulong), &linux_banner);
            break;
        default:
            error(WARNING, "linux_banner is unknown type\n");
            linux_banner = 0;
            break;
    }

I don't know why it checks the sp->type, but what it should do
originally might be the above, I think.

Lianbo, do you have an old vmcore that has "const char *linux_banner"?
According to crash commit fce91bec5bef, 2.6.10 and older kernels have
it.  I'd like to test the above with it.

Kazu,

I've seen the above code solve cases of starting crash giving "bad" linux_banner for three coredumps in a week. I don't want to be a reason to hold things back and the variety of coredumps is (well, it is only three) but random enough for me to expect the change covers a broad set of dumps.

It's your change, do you want to make the commit? I can prepare one if you prefer.

--
David

--
Crash-utility mailing list
Crash-utility@xxxxxxxxxx
https://listman.redhat.com/mailman/listinfo/crash-utility
Contribution Guidelines: https://github.com/crash-utility/crash/wiki




[Index of Archives]     [Fedora Development]     [Fedora Desktop]     [Fedora SELinux]     [Yosemite News]     [KDE Users]     [Fedora Tools]

 

Powered by Linux