On 2/23/23 22:25, Michael Ellerman wrote:
> The TPM code in prom_init.c creates a small buffer of memory to store
> the TPM's SML (Stored Measurement Log). It's communicated to Linux via
> the linux,sml-base/size device tree properties of the TPM node.
>
> When kexec'ing that buffer can be overwritten, or when kdump'ing it may
> not be mapped by the second kernel. The latter can lead to a crash when
> booting the second kernel such as:
>
>   tpm_ibmvtpm 71000003: CRQ initialization completed
>   BUG: Unable to handle kernel data access on read at 0xc00000002ffb0000
>   Faulting instruction address: 0xc0000000200a70e0
>   Oops: Kernel access of bad area, sig: 11 [#1]
>   LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
>   Modules linked in:
>   CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.2.0-rc2-00134-g9307ce092f5d #314
>   Hardware name: IBM pSeries (emulated by qemu) POWER9 (raw) 0x4e1200 0xf000005 of:SLOF,git-5b4c5a pSeries
>   NIP: c0000000200a70e0 LR: c0000000203dd5dc CTR: 0000000000000800
>   REGS: c000000024543280 TRAP: 0300 Not tainted (6.2.0-rc2-00134-g9307ce092f5d)
>   MSR: 8000000002009033 <SF,VEC,EE,ME,IR,DR,RI,LE> CR: 24002280 XER: 00000006
>   CFAR: c0000000200a70c8 DAR: c00000002ffb0000 DSISR: 40000000 IRQMASK: 0
>   ...
>   NIP memcpy_power7+0x400/0x7d0
>   LR kmemdup+0x5c/0x80
>   Call Trace:
>     memcpy_power7+0x274/0x7d0 (unreliable)
>     kmemdup+0x5c/0x80
>     tpm_read_log_of+0xe8/0x1b0
>     tpm_bios_log_setup+0x60/0x210
>     tpm_chip_register+0x134/0x320
>     tpm_ibmvtpm_probe+0x520/0x7d0
>     vio_bus_probe+0x9c/0x460
>     really_probe+0x104/0x420
>     __driver_probe_device+0xb0/0x170
>     driver_probe_device+0x58/0x180
>     __driver_attach+0xd8/0x250
>     bus_for_each_dev+0xb4/0x140
>     driver_attach+0x34/0x50
>     bus_add_driver+0x1e8/0x2d0
>     driver_register+0xb4/0x1c0
>     __vio_register_driver+0x74/0x9c
>     ibmvtpm_module_init+0x34/0x48
>     do_one_initcall+0x80/0x320
>     kernel_init_freeable+0x304/0x3ac
>     kernel_init+0x30/0x1a0
>     ret_from_kernel_thread+0x5c/0x64

I have not been able to reproduce this particular crash issue with a 6.2 kernel running on P10 PowerVM when NOT applying your patches. For my tests I have used the following parameter with the 16GB VM:

crashkernel=2G-4G:384M,4G-16G:1G,16G-64G:2G,64G-128G:2G,128G-:4G

What I noticed is that the log gets corrupted when the 2 patches are applied:

After fresh boot:

cp /sys/kernel/security/tpm0/binary_bios_measurements ./
ls -l binary_bios_measurements
-r--r-----. 1 root root 10051 Feb 28 12:09 binary_bios_measurements

kexec -l /boot/vmlinuz-6.2.0+ --initrd /boot/initramfs-6.2.0+.img '--append=BOOT_IMAGE=/vmlinuz-6.2.0+ root=/dev/mapper/rhel_XYZ ro crashkernel=2G-4G:384M,4G-16G:1G,16G-64G:2G,64G-128G:2G,128G-:4G rd.lvm.lv=rhel_XYZ/root rd.lvm.lv=rhel_XYZ/swap biosdevname=0' -s
kexec -e

cp /sys/kernel/security/tpm0/binary_bios_measurements ./
ls -l binary_bios_measurements
-r--r-----. 1 root root 32 Feb 28 12:10 binary_bios_measurements
od -t x1 < binary_bios_measurements
0000000 d0 0d fe ed 00 00 77 80 00 00 00 a0 00 00 4f 4c
0000020 00 00 00 28 00 00 00 11 00 00 00 11 00 00 00 00
0000040

The contents have changed and these first 4 bytes of it are always the same once it has become this 32 byte file, otherwise they would be zero.

The address and size parameters passed around in this patch seem good, though.

   Stefan
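
A side note on the 32 bytes in the od output: the leading d0 0d fe ed matches the magic number of a flattened device tree (0xd00dfeed), and the remaining words are consistent with the start of an FDT header as laid out in the Devicetree Specification. That reading is an assumption and is not established anywhere in this thread; the rough sketch below only decodes the bytes copied from the dump so the fields can be eyeballed. The field names come from the specification, and the variable names are made up for illustration.

```python
import struct

# The 32 bytes shown by od in the message, copied verbatim.
data = bytes.fromhex(
    "d00dfeed00007780000000a000004f4c"
    "00000028000000110000001100000000"
)

# First eight big-endian 32-bit fields of an FDT header. The full header
# has two more fields (size_dt_strings, size_dt_struct) that would lie
# beyond the 32 bytes available here.
FIELDS = (
    "magic", "totalsize", "off_dt_struct", "off_dt_strings",
    "off_mem_rsvmap", "version", "last_comp_version", "boot_cpuid_phys",
)

header = dict(zip(FIELDS, struct.unpack(">8I", data)))

for name, value in header.items():
    print(f"{name:18s} 0x{value:08x}")

# Only a hint: a matching magic does not prove where the bytes came from.
print("magic matches FDT:", header["magic"] == 0xD00DFEED)
```

If this reading holds, it would fit the description of the buffer being overwritten across kexec, but it says nothing about which code path wrote the data.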