On Sat, 2024-12-21 at 17:04 +0100, Ard Biesheuvel wrote:
> On Sat, 21 Dec 2024 at 12:33, Jarkko Sakkinen <jarkko@xxxxxxxxxx>
> wrote:
> >
> > The following failure was reported:
> >
> > [ 10.693310][ T1] tpm_tis STM0925:00: 2.0 TPM (device-id 0x3,
> > rev-id 0)
> > [ 10.848132][ T1] ------------[ cut here ]------------
> > [ 10.853559][ T1] WARNING: CPU: 59 PID: 1 at
> > mm/page_alloc.c:4727 __alloc_pages_noprof+0x2ca/0x330
> > [ 10.862827][ T1] Modules linked in:
> > [ 10.866671][ T1] CPU: 59 UID: 0 PID: 1 Comm: swapper/0 Not
> > tainted 6.12.0-lp155.2.g52785e2-default #1 openSUSE Tumbleweed
> > (unreleased) 588cd98293a7c9eba9013378d807364c088c9375
> > [ 10.882741][ T1] Hardware name: HPE ProLiant DL320
> > Gen12/ProLiant DL320 Gen12, BIOS 1.20 10/28/2024
> > [ 10.892170][ T1] RIP: 0010:__alloc_pages_noprof+0x2ca/0x330
> > [ 10.898103][ T1] Code: 24 08 e9 4a fe ff ff e8 34 36 fa ff e9
> > 88 fe ff ff 83 fe 0a 0f 86 b3 fd ff ff 80 3d 01 e7 ce 01 00 75 09
> > c6 05 f8 e6 ce 01 01 <0f> 0b 45 31 ff e9 e5 fe ff ff f7 c2 00 00 08
> > 00 75 42 89 d9 80 e1
> > [ 10.917750][ T1] RSP: 0000:ffffb7cf40077980 EFLAGS: 00010246
> > [ 10.923777][ T1] RAX: 0000000000000000 RBX: 0000000000040cc0
> > RCX: 0000000000000000
> > [ 10.931727][ T1] RDX: 0000000000000000 RSI: 000000000000000c
> > RDI: 0000000000040cc0
> >
> > Above shows that ACPI pointed a 16 MiB buffer for the log events
> > because RSI maps to the 'order' parameter of
> > __alloc_pages_noprof(). Address the bug by mapping the region when
> > needed instead of copying.
> >
> > Reported-by: Andy Liang <andy.liang@xxxxxxx>
> > Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219495
> > Suggested-by: Matthew Garrett <mjg59@xxxxxxxxxxxxx>
> > Signed-off-by: Jarkko Sakkinen <jarkko@xxxxxxxxxx>
>
> This is a very intrusive fix - care to provide some more context on
> why all these changes are needed?

Since the bug reports never found an actual log over a few tens of
kilobytes, this is caused by the BIOS implementation allocating a huge
buffer that is mostly unused.

There are two other possibilities for fixing this, both of which were
part of the original suggestions. One would be to work out the size of
the log and then allocate exactly that size. This would require
implementing tpm1 and tpm2 parsers for the log size. However, since we
can never go over KMALLOC_MAX_SIZE without an error even with this
calculated size, the simplest straight-line fix would be to cap the
copy at KMALLOC_MAX_SIZE if it's over. That would be a simple
one-liner.

Regards,

James
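
To make the arithmetic behind the cap concrete, here is a rough
userspace sketch, not the actual driver change. It assumes 4 KiB pages
and a maximum page order of 10, so the stand-in for KMALLOC_MAX_SIZE
is 4 MiB; the real value depends on the kernel configuration. All of
the sketch_* names are hypothetical and exist only for illustration.

```c
#include <stddef.h>

/*
 * Assumptions for illustration only: 4 KiB pages and a maximum page
 * order of 10, giving a 4 MiB stand-in for KMALLOC_MAX_SIZE.
 */
#define SKETCH_PAGE_SHIFT     12
#define SKETCH_MAX_PAGE_ORDER 10
#define SKETCH_KMALLOC_MAX \
	((size_t)1 << (SKETCH_MAX_PAGE_ORDER + SKETCH_PAGE_SHIFT))

/* Smallest order such that (1 << order) pages cover len bytes. */
static unsigned int sketch_get_order(size_t len)
{
	size_t page_size = (size_t)1 << SKETCH_PAGE_SHIFT;
	size_t pages = (len + page_size - 1) >> SKETCH_PAGE_SHIFT;
	unsigned int order = 0;

	while (((size_t)1 << order) < pages)
		order++;
	return order;
}

/*
 * Hypothetical helper: clamp the length reported by ACPI so that the
 * copy never asks for more than the allocator limit.
 */
static size_t sketch_cap_log_len(size_t acpi_len)
{
	return acpi_len > SKETCH_KMALLOC_MAX ? SKETCH_KMALLOC_MAX : acpi_len;
}
```

Under those assumptions a 16 MiB request works out to order 12, which
matches the 0xc in RSI above, and the capped length drops back to
order 10. The cap only helps because the real logs seen so far are far
smaller than the buffer the BIOS advertises.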