Re: Incorrect vmcoreinfo KERNELOFFSET after "s390/boot: Rework deployment of the kernel image"

On Wed, Jul 10, 2024 at 07:02:46AM +0200, Petr Tesařík wrote:
> On Tue, 9 Jul 2024 13:55:58 -0700
> Omar Sandoval <osandov@xxxxxxxxxxx> wrote:
> 
> > On Tue, Jul 09, 2024 at 09:21:41PM +0200, Petr Tesařík wrote:
> > > On Thu, 27 Jun 2024 15:51:09 +0200
> > > Alexander Gordeev <agordeev@xxxxxxxxxxxxx> wrote:
> > >   
> > > > On Thu, Jun 20, 2024 at 04:34:15PM -0700, Omar Sandoval wrote:
> > > > 
> > > > Hi Omar,
> > > >   
> > > > > Hi, Alexander and Sven,
> > > > > 
> > > > > I just got around to testing drgn on s390x on 6.10-rc4, and it appears
> > > > > to be broken. I bisected it to commit 56b1069c40c7 ("s390/boot: Rework
> > > > > deployment of the kernel image") and narrowed it down to an issue with
> > > > > the KERNELOFFSET value reported in vmcoreinfo.
> > > > > 
> > > > > On my test kernel, the ELF symbol for init_task is 0xc96f00:
> > > > > 
> > > > >   $ eu-readelf -s build/vmtest/s390x/kernel-6.10.0-rc4-vmtest30.1default/build/vmlinux | grep ' init_task$'
> > > > >   72273: 0000000000c96f00   4352 OBJECT  GLOBAL DEFAULT       18 init_task
> > > > > 
> > > > > And the address in the loaded kernel is 0x3ffffeaaf00:
> > > > > 
> > > > >   # grep ' init_task$' /proc/kallsyms
> > > > >   000003ffffeaaf00 D init_task
> > > > > 
> > > > > 0x3ffffeaaf00 - 0xc96f00 is 0x3ffff214000
> > > > > 
> > > > > However, this doesn't match the value of KERNELOFFSET in vmcoreinfo:
> > > > > 
> > > > >   # eu-readelf -n /proc/kcore | grep KERNELOFFSET
> > > > >     KERNELOFFSET=3ffff314000
> > > > > 
> > > > > It's off by 0x100000. This causes drgn to compute the wrong addresses
> > > > > for all global variables.
> > > > > 
> > > > > For context, I'm testing using QEMU emulation on an x86-64 host. Note
> > > > > that it logs "KASLR disabled: CPU has no PRNG" early during boot. My
> > > > > exact setup is:
> > > > > 
> > > > >   $ git clone https://github.com/osandov/drgn.git
> > > > >   $ cd drgn
> > > > >   $ python3 -m vmtest.rootfsbuild -a s390x --build-drgn
> > > > >   $ python3 -m vmtest.vm -k 's390x:6.10.*' bash -i
> > > > >   # python3 -m drgn    
> > > > >   >>> prog['init_task'].comm    
> > > > >   (char [16])""
> > > > > 
> > > > > That should be printing "swapper/0".
> > > > > 
> > > > > Any ideas what's going on here?    
> > > > 
> > > > On s390, no kernel symbol exists below offset 0x100000 within the
> > > > vmlinux image, and thus this part is never mapped into kernel
> > > > memory. As a result, KERNELOFFSET turns out to be off by
> > > > 0x100000, and that is what you observe.
> > > > 
> > > > That breaks the way drgn finds a kernel symbol, but it does not
> > > > exactly contradict the existing KERNELOFFSET description
> > > > (Documentation/admin-guide/kdump/vmcoreinfo.rst):
> > > > 
> > > > ===
> > > > KERNELOFFSET
> > > > ------------
> > > > 
> > > > The kernel randomization offset. Used to compute the page offset. If
> > > > KASLR is disabled, this value is zero.
> > > > ===
> > > > 
> > > > I would say that to some degree there is also an inconsistency with
> > > > regard to the existence of the /proc/ files:
> > > > /proc/kcore    is enabled by the CONFIG_PROC_KCORE option, while
> > > > /proc/kallsyms is enabled by the CONFIG_KALLSYMS option.
> > > > I assume drgn expects both files to exist and does not work otherwise.
> > 
> > drgn doesn't use kallsyms, partially because it's not guaranteed to
> > exist as you pointed out, and partially because it's slow.
> > 
> > > > Nevertheless, it is still possible to refer to only one file for
> > > > symbol resolution and use an always-present symbol. E.g., _stext
> > > > could be leveraged like this:
> > > > 
> > > > # grep -w init_task /proc/kallsyms
> > > > 000003ffe13e9400 D init_task
> > > > # grep -w _stext /proc/kallsyms
> > > > 000003ffe0000000 T _stext
> > > > 
> > > > 0x3ffe13e9400 - 0x3ffe0000000 == 0x13e9400
> > > > 
> > > > # eu-readelf -s vmlinux | grep -w _stext
> > > > 178112: 0000000000100000      0 NOTYPE  GLOBAL DEFAULT        1 _stext
> > > > 
> > > > 0x13e9400 + 0x100000 == 0x14e9400
> > > > 
> > > > # eu-readelf -s vmlinux | grep -w init_task
> > > >   498: 0000000000000000      0 FILE    LOCAL  DEFAULT      ABS init_task.c
> > > > 182344: 00000000014e9400   8960 OBJECT  GLOBAL DEFAULT       28 init_task
> > > > 
> > > > I guess the above holds true for all architectures.
> > > > If so, I would suggest considering that approach.
> > > > 
> > > > Having said that, we will try to turn KERNELOFFSET from a synthetic
> > > > value "Used to compute the page offset" into what drgn expects it to be.
> > > 
> > > Thinking about it now, I'm not sure it makes life easier. Because then
> > > we'll have some old kernels with the current (unexpected) definition of
> > > KERNELOFFSET and some new kernels with a more standard definition of
> > > it, but if I read vmcoreinfo, how do I know if the value has the old or
> > > the new meaning?  
> > 
> > I wasn't able to get real KASLR working on my s390x VM, but what I found
> > in testing without KASLR was:
> > 
> > - Before commit c98d2ecae08f ("s390/mm: Uncouple physical vs virtual
> >   address spaces"), KERNELOFFSET was not set at all (this is expected).
> > - After commit c98d2ecae08f ("s390/mm: Uncouple physical vs virtual
> >   address spaces"), but before commit 56b1069c40c7 ("s390/boot: Rework
> >   deployment of the kernel image"), KERNELOFFSET was set in a way that
> >   drgn understands even without KASLR (that's a little odd but fine with
> >   me).
> > - After commit 56b1069c40c7 ("s390/boot: Rework deployment of the kernel
> >   image"), KERNELOFFSET was set "incorrectly".
> > 
> > So at least for no KASLR, the breakage has been limited only to the 6.10
> > rcs, which isn't too late to fix. I'd be curious what the behavior was
> > with KASLR before 6.10, though.
> 
> OK, I'll check SLES 15 SP5 (kernel 5.14) and SP6 (kernel 6.4). Both
> enable KASLR, but it can be turned off on the command line (or I can
> even rebuild the kernel without CONFIG_RANDOMIZE_BASE if that makes a
> difference).

The case I'm interested in here is with KASLR enabled. In those kernel
versions, is KERNELOFFSET the difference between the addresses in
vmlinux and the actual addresses in memory?
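
For reference, the check being asked about can be sketched in a few lines
of Python. This is only an illustration using the numbers from the
original report (6.10-rc4, KASLR disabled); the constant and function
names are made up for the example and are not part of drgn or the kernel.

```python
# Sketch only: all values are copied from the report quoted in this thread
# (s390x, 6.10-rc4 under QEMU, KASLR disabled). Names are hypothetical.
ELF_INIT_TASK = 0xC96F00                 # init_task in vmlinux (eu-readelf -s)
RUNTIME_INIT_TASK = 0x3FFFFEAAF00        # init_task in /proc/kallsyms
VMCOREINFO_KERNELOFFSET = 0x3FFFF314000  # KERNELOFFSET note in /proc/kcore


def observed_offset(runtime_addr: int, elf_addr: int) -> int:
    """Difference between a symbol's loaded address and its vmlinux address."""
    return runtime_addr - elf_addr


expected = observed_offset(RUNTIME_INIT_TASK, ELF_INIT_TASK)
discrepancy = VMCOREINFO_KERNELOFFSET - expected

print(f"observed offset: {expected:#x}")
print(f"KERNELOFFSET:    {VMCOREINFO_KERNELOFFSET:#x}")
print(f"discrepancy:     {discrepancy:#x}")
```

If KERNELOFFSET had the meaning drgn expects, the discrepancy would be
zero; with the values above it is the 0x100000 from the report.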

Thanks,
Omar
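
Appended for reference: the _stext-relative translation suggested earlier
in the thread can be sketched as below. It assumes the runtime address of
_stext is available from some source (in the quoted example it was read
from /proc/kallsyms); the helper names are hypothetical, and the values
are the ones quoted in that example.

```python
# Sketch only: values copied from the _stext example quoted in this thread.
ELF_STEXT = 0x100000               # _stext in vmlinux (eu-readelf -s)
RUNTIME_STEXT = 0x3FFE0000000      # _stext in /proc/kallsyms
ELF_INIT_TASK = 0x14E9400          # init_task in vmlinux
RUNTIME_INIT_TASK = 0x3FFE13E9400  # init_task in /proc/kallsyms


def runtime_to_elf(runtime_addr: int, elf_stext: int, runtime_stext: int) -> int:
    """Map a loaded address back to its vmlinux address via _stext."""
    return runtime_addr - runtime_stext + elf_stext


def elf_to_runtime(elf_addr: int, elf_stext: int, runtime_stext: int) -> int:
    """Map a vmlinux address to its loaded address via _stext."""
    return elf_addr - elf_stext + runtime_stext


elf_addr = runtime_to_elf(RUNTIME_INIT_TASK, ELF_STEXT, RUNTIME_STEXT)
runtime_addr = elf_to_runtime(ELF_INIT_TASK, ELF_STEXT, RUNTIME_STEXT)

print(f"init_task in vmlinux: {elf_addr:#x}")
print(f"init_task in memory:  {runtime_addr:#x}")
```

Because both directions are anchored on the same symbol, the result does
not depend on how KERNELOFFSET is defined.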



