On Tue, Jul 09, 2024 at 09:21:41PM +0200, Petr Tesařík wrote:
> On Thu, 27 Jun 2024 15:51:09 +0200
> Alexander Gordeev <agordeev@xxxxxxxxxxxxx> wrote:
>
> > On Thu, Jun 20, 2024 at 04:34:15PM -0700, Omar Sandoval wrote:
> >
> > Hi Omar,
> >
> > > Hi, Alexander and Sven,
> > >
> > > I just got around to testing drgn on s390x on 6.10-rc4, and it appears
> > > to be broken. I bisected it to commit 56b1069c40c7 ("s390/boot: Rework
> > > deployment of the kernel image") and narrowed it down to an issue with
> > > the KERNELOFFSET value reported in vmcoreinfo.
> > >
> > > On my test kernel, the ELF symbol for init_task is 0xc96f00:
> > >
> > >   $ eu-readelf -s build/vmtest/s390x/kernel-6.10.0-rc4-vmtest30.1default/build/vmlinux | grep ' init_task$'
> > >   72273: 0000000000c96f00   4352 OBJECT  GLOBAL DEFAULT   18 init_task
> > >
> > > And the address in the loaded kernel is 0x3ffffeaaf00:
> > >
> > >   # grep ' init_task$' /proc/kallsyms
> > >   000003ffffeaaf00 D init_task
> > >
> > > 0x3ffffeaaf00 - 0xc96f00 is 0x3ffff214000
> > >
> > > However, this doesn't match the value of KERNELOFFSET in vmcoreinfo:
> > >
> > >   # eu-readelf -n /proc/kcore | grep KERNELOFFSET
> > >   KERNELOFFSET=3ffff314000
> > >
> > > It's off by 0x100000. This causes drgn to compute the wrong addresses
> > > for all global variables.
> > >
> > > For context, I'm testing using QEMU emulation on an x86-64 host. Note
> > > that it logs "KASLR disabled: CPU has no PRNG" early during boot. My
> > > exact setup is:
> > >
> > >   $ git clone https://github.com/osandov/drgn.git
> > >   $ cd drgn
> > >   $ python3 -m vmtest.rootfsbuild -a s390x --build-drgn
> > >   $ python3 -m vmtest.vm -k 's390x:6.10.*' bash -i
> > >   # python3 -m drgn
> > >   >>> prog['init_task'].comm
> > >   (char [16])""
> > >
> > > That should be printing "swapper/0".
> > >
> > > Any ideas what's going on here?
> >
> > On s390 no kernel symbol exists below 0x100000 offset within the
> > vmlinux image and thus this part is never mapped into the kernel
> > memory. That way KERNELOFFSET turns out to be off on value of
> > 0x100000 - and that is what you observe.
> >
> > That breaks the way drgn finds a kernel symbol, but does not
> > exactly contradicts to the existing KERNELOFFSET description
> > (Documentation/admin-guide/kdump/vmcoreinfo.rst):
> >
> > ===
> > KERNELOFFSET
> > ------------
> >
> > The kernel randomization offset. Used to compute the page offset. If
> > KASLR is disabled, this value is zero.
> > ===
> >
> > I would say to some degree there is also inconsisten with regard
> > to /proc/ files existence:
> > /proc/kcore is enabled by CONFIG_PROC_KCORE option, while
> > /proc/kallsyms is enabled by CONFIG_KALLSYMS option.
> > I assume drgn expects both files exist and does not work otherwise.

drgn doesn't use kallsyms, partially because it's not guaranteed to exist
as you pointed out, and partially because it's slow.

> > Nevertheless, it is still possible to refer to only one file for
> > symbol resolution and use an always-present symbol. E.g _stext
> > could be leveraged like this:
> >
> > # grep -w init_task /proc/kallsyms
> > 000003ffe13e9400 D init_task
> > # grep -w _stext /proc/kallsyms
> > 000003ffe0000000 T _stext
> >
> > 0x3ffe13e9400 - 0x3ffe0000000 == 0x13e9400
> >
> > # eu-readelf -s vmlinux | grep -w _stext
> > 178112: 0000000000100000     0 NOTYPE  GLOBAL DEFAULT    1 _stext
> >
> > 0x13e9400 + 0x100000 == 0x14e9400
> >
> > # eu-readelf -s vmlinux | grep -w init_task
> >    498: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS init_task.c
> > 182344: 00000000014e9400  8960 OBJECT  GLOBAL DEFAULT   28 init_task
> >
> > I guess, the above holds true for all architectures.
> > If so, I would suggest consider using that approach.
> >
> > Having said that, we will try to turn KERNELOFFSET from a synthetic
> > value "Used to compute the page offset" to what drgn expects it to be.
>
> Thinking about it now, I'm not sure it makes life easier. Because then
> we'll have some old kernels with the current (unexpected) definition of
> KERNELOFFSET and some new kernels with a more standard definition of
> it, but if I read vmcoreinfo, how do I know if the value has the old or
> the new meaning?

I wasn't able to get real KASLR working on my s390x VM, but what I found
in testing without KASLR was:

- Before commit c98d2ecae08f ("s390/mm: Uncouple physical vs virtual
  address spaces"), KERNELOFFSET was not set at all (this is expected).
- After commit c98d2ecae08f ("s390/mm: Uncouple physical vs virtual
  address spaces"), but before commit 56b1069c40c7 ("s390/boot: Rework
  deployment of the kernel image"), KERNELOFFSET was set in a way that
  drgn understands even without KASLR (that's a little odd but fine with
  me).
- After commit 56b1069c40c7 ("s390/boot: Rework deployment of the kernel
  image"), KERNELOFFSET was set "incorrectly".

So at least for no KASLR, the breakage has been limited only to the 6.10
rcs, which isn't too late to fix. I'd be curious what the behavior was
with KASLR before 6.10, though.
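
For reference, the arithmetic from this thread can be written down as a
small Python sketch. This is only an illustration: the helper names
load_bias() and runtime_from_anchor() are made up for this mail and are
not drgn APIs, and the constants are the addresses quoted above.

```python
# Illustrative sketch only; the helper names are hypothetical, not drgn APIs.

def load_bias(runtime_addr: int, elf_addr: int) -> int:
    """Offset to add to a vmlinux ELF address to get the loaded address."""
    return runtime_addr - elf_addr


def runtime_from_anchor(elf_addr: int, anchor_elf: int, anchor_runtime: int) -> int:
    """Resolve a symbol relative to an always-present anchor such as _stext."""
    return anchor_runtime + (elf_addr - anchor_elf)


# Values from the 6.10-rc4 test kernel described earlier in the thread.
INIT_TASK_ELF = 0xC96F00            # eu-readelf -s vmlinux
INIT_TASK_RUNTIME = 0x3FFFFEAAF00   # /proc/kallsyms
KERNELOFFSET = 0x3FFFF314000        # vmcoreinfo note in /proc/kcore

bias = load_bias(INIT_TASK_RUNTIME, INIT_TASK_ELF)
discrepancy = KERNELOFFSET - bias

# Values from Alexander's _stext example (a different kernel build).
STEXT_ELF = 0x100000
STEXT_RUNTIME = 0x3FFE0000000
INIT_TASK_ELF_2 = 0x14E9400

resolved = runtime_from_anchor(INIT_TASK_ELF_2, STEXT_ELF, STEXT_RUNTIME)

print(hex(bias))         # 0x3ffff214000
print(hex(discrepancy))  # 0x100000
print(hex(resolved))     # 0x3ffe13e9400
```

The first two results reproduce the 0x100000 mismatch, and the third
matches the init_task address that Alexander's kallsyms output shows.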