On 4/28/2021 12:57 AM, Greg Kroah-Hartman wrote:
On Tue, Apr 27, 2021 at 06:18:05PM -0400, George Kennedy wrote:
CC+ stable@xxxxxxxxxxxxxxx
On 4/27/2021 6:17 PM, George Kennedy wrote:
Hello Greg,
We need the following 2 upstream commits applied to 5.4.y to fix an iBFT
boot failure:
2021-03-29 rafael.j.wysocki@xxxxxxxxx - 1a1c130a 2021-03-23 Rafael J. Wysocki ACPI: tables: x86: Reserve memory occupied by ACPI tables
2021-04-13 rafael.j.wysocki@xxxxxxxxx - 6998a88 2021-04-13 Rafael J. Wysocki ACPI: x86: Call acpi_boot_table_init() after acpi_table_upgrade()
Currently, only the first commit (1a1c130a) is destined for 5.10 & 5.11.
The 2nd commit (6998a88) is needed as well and both commits are needed
in 5.4.y.
Is this a regression (i.e. did this hardware work on older kernels?),
and if so, what commit caused the problem?
These commits are already in 5.10.y, what changed in older kernels to
require this to be backported?
Not sure. With KASAN enabled, the bug is exposed, but only during boot, as
the ACPI tables are freed and their memory is reallocated. If KASAN is not
enabled, silent data corruption occurs instead.
This is a latent bug that in upstream was more readily exposed with the
following commit:
commit 7fef431be9c9ac255838a9578331567b9dba4477
Author: David Hildenbrand <david@xxxxxxxxxx>
Date: Thu Oct 15 20:09:35 2020 -0700

    mm/page_alloc: place pages to tail in __free_pages_core()
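As a rough picture of the latent bug described above, here is a toy model in Python. It is not kernel code: the class, the page numbers, and every function name are hypothetical and only illustrate why a firmware table must be reserved before boot memory is released to the allocator.

```python
# Toy model only: none of these names exist in the kernel.
class ToyBootMemory:
    def __init__(self, num_pages):
        self.pages = [b""] * num_pages  # contents of each page
        self.reserved = set()           # pages withheld from the allocator
        self.free_list = []             # pages the allocator may hand out

    def place_firmware_table(self, page, data):
        """Firmware leaves a table in a page before the OS boots."""
        self.pages[page] = data

    def reserve(self, page):
        """Mark a page so it is never released to the allocator."""
        self.reserved.add(page)

    def release_boot_memory(self):
        """Hand every unreserved page to the allocator."""
        self.free_list = [p for p in range(len(self.pages))
                          if p not in self.reserved]

    def alloc_and_fill(self, data):
        """Allocate one page and overwrite it, as a later user would."""
        page = self.free_list.pop(0)
        self.pages[page] = data
        return page


def boot(reserve_table):
    """Return what a late reader sees in the page that held the table."""
    mem = ToyBootMemory(num_pages=4)
    table_page = 2  # hypothetical location of the firmware table
    mem.place_firmware_table(table_page, b"iBFT")
    if reserve_table:
        mem.reserve(table_page)
    mem.release_boot_memory()
    # Later allocations consume every page the allocator owns.
    while mem.free_list:
        mem.alloc_and_fill(b"junk")
    # A late reader uses the address it found earlier.
    return mem.pages[table_page]
```

In this sketch `boot(reserve_table=False)` returns the overwritten contents, which corresponds to the silent-corruption case, while `boot(reserve_table=True)` still returns the original table. The order in which the toy allocator hands out pages decides how soon the table page is reused, which is presumably why a change in page placement made the problem easier to hit.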
This is the failure with latest upstream stable and KASAN enabled:
[ 22.986842] OPA Virtual Network Driver - v1.0
[ 22.988565] iBFT detected.
[ 22.989244] ==================================================================
[ 22.990233] BUG: KASAN: use-after-free in ibft_init+0x134/0xb8b
[ 22.990233] Read of size 4 at addr ffff8880be451004 by task swapper/0/1
[ 22.990233]
[ 22.990233] CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.4.115-rc1.syzk #1
[ 22.990233] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
[ 22.990233] Call Trace:
[ 22.990233] dump_stack+0xd4/0x119
[ 22.990233] ? ibft_init+0x134/0xb8b
[ 22.990233] print_address_description.constprop.6+0x20/0x220
[ 22.990233] ? ibft_init+0x134/0xb8b
[ 22.990233] ? ibft_init+0x134/0xb8b
[ 22.990233] __kasan_report.cold.9+0x37/0x77
[ 22.990233] ? ibft_init+0x134/0xb8b
[ 22.990233] kasan_report+0x14/0x20
[ 22.990233] __asan_report_load_n_noabort+0xf/0x20
[ 22.990233] ibft_init+0x134/0xb8b
[ 22.990233] ? dmi_sysfs_init+0x1a5/0x1a5
[ 22.990233] ? dmi_walk+0x72/0x90
[ 22.990233] ? ibft_check_initiator_for+0x159/0x159
[ 22.990233] ? rvt_init_port+0x110/0x110
[ 22.990233] ? ibft_check_initiator_for+0x159/0x159
[ 22.990233] do_one_initcall+0xc3/0x480
[ 22.990233] ? perf_trace_initcall_level+0x410/0x410
[ 22.990233] kernel_init_freeable+0x54c/0x66e
[ 22.990233] ? start_kernel+0x94b/0x94b
[ 22.990233] ? __switch_to_asm+0x34/0x70
[ 22.990233] ? __sanitizer_cov_trace_const_cmp1+0x1a/0x20
[ 22.990233] ? __kasan_check_write+0x14/0x20
[ 22.990233] ? rest_init+0xe6/0xe6
[ 22.990233] kernel_init+0x16/0x1ca
[ 22.990233] ? rest_init+0xe6/0xe6
[ 22.990233] ret_from_fork+0x35/0x40
[ 22.990233]
[ 22.990233] The buggy address belongs to the page:
[ 22.990233] page:ffffea0002f91440 refcount:0 mapcount:0 mapping:0000000000000000 index:0x1
[ 22.990233] flags: 0xfffffc0000000()
[ 22.990233] raw: 000fffffc0000000 ffffea0002f914c8 ffffea0002fa4708 0000000000000000
[ 22.990233] raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
[ 22.990233] page dumped because: kasan: bad access detected
[ 22.990233]
[ 22.990233] Memory state around the buggy address:
[ 22.990233] ffff8880be450f00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 22.990233] ffff8880be450f80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 22.990233] >ffff8880be451000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 22.990233] ^
[ 22.990233] ffff8880be451080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 22.990233] ffff8880be451100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 22.990233] ==================================================================
[ 22.990233] Disabling lock debugging due to kernel taint
[ 23.047129] Kernel panic - not syncing: panic_on_warn set ...
[ 23.048110] CPU: 3 PID: 1 Comm: swapper/0 Tainted: G B 5.4.115-rc1v5.4.114-21-gf9824ac.syzk #1
[ 23.048110] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
[ 23.048110] Call Trace:
[ 23.048110] dump_stack+0xd4/0x119
[ 23.048110] ? ibft_init+0xc3/0xb8b
[ 23.048110] panic+0x28f/0x6ad
[ 23.048110] ? add_taint.cold.9+0x16/0x16
[ 23.048110] ? ibft_init+0x134/0xb8b
[ 23.048110] ? add_taint+0x47/0x90
[ 23.048110] ? add_taint+0x47/0x90
[ 23.048110] ? ibft_init+0x134/0xb8b
[ 23.048110] ? ibft_init+0x134/0xb8b
[ 23.048110] end_report+0x4c/0x54
[ 23.048110] __kasan_report.cold.9+0x55/0x77
[ 23.048110] ? ibft_init+0x134/0xb8b
[ 23.048110] kasan_report+0x14/0x20
[ 23.048110] __asan_report_load_n_noabort+0xf/0x20
[ 23.048110] ibft_init+0x134/0xb8b
[ 23.048110] ? dmi_sysfs_init+0x1a5/0x1a5
[ 23.048110] ? dmi_walk+0x72/0x90
[ 23.048110] ? ibft_check_initiator_for+0x159/0x159
[ 23.048110] ? rvt_init_port+0x110/0x110
[ 23.048110] ? ibft_check_initiator_for+0x159/0x159
[ 23.048110] do_one_initcall+0xc3/0x480
[ 23.048110] ? perf_trace_initcall_level+0x410/0x410
[ 23.048110] kernel_init_freeable+0x54c/0x66e
[ 23.048110] ? start_kernel+0x94b/0x94b
[ 23.048110] ? __switch_to_asm+0x34/0x70
[ 23.048110] ? __sanitizer_cov_trace_const_cmp1+0x1a/0x20
[ 23.048110] ? __kasan_check_write+0x14/0x20
[ 23.048110] ? rest_init+0xe6/0xe6
[ 23.048110] kernel_init+0x16/0x1ca
[ 23.048110] ? rest_init+0xe6/0xe6
[ 23.048110] ret_from_fork+0x35/0x40
[ 23.048110] Dumping ftrace buffer:
[ 23.048110] ---------------------------------
[ 23.048110] rb_produ-210 3.... 7555323us : ring_buffer_producer_thread: Starting ring buffer hammer
[ 23.048110] rb_produ-210 3.... 17555348us : ring_buffer_producer_thread: End ring buffer hammer
[ 23.048110] rb_produ-210 3.... 17640105us : ring_buffer_producer_thread: Running Consumer at nice: 19
[ 23.048110] rb_produ-210 3.... 17640111us : ring_buffer_producer_thread: Running Producer at nice: 19
[ 23.048110] rb_produ-210 3.... 17640113us : ring_buffer_producer_thread: WARNING!!! This test is running at lowest priority.
[ 23.048110] rb_produ-210 3.... 17640118us : ring_buffer_producer_thread: Time: 10000017 (usecs)
[ 23.048110] rb_produ-210 3.... 17640122us : ring_buffer_producer_thread: Overruns: 4460970
[ 23.048110] rb_produ-210 3.... 17640129us : ring_buffer_producer_thread: Read: 3807780 (by events)
[ 23.048110] rb_produ-210 3.... 17640134us : ring_buffer_producer_thread: Entries: 0
[ 23.048110] rb_produ-210 3.... 17640137us : ring_buffer_producer_thread: Total: 8268750
[ 23.048110] rb_produ-210 3.... 17640142us : ring_buffer_producer_thread: Missed: 0
[ 23.048110] rb_produ-210 3.... 17640146us : ring_buffer_producer_thread: Hit: 8268750
[ 23.048110] rb_produ-210 3.... 17640150us : ring_buffer_producer_thread: Entries per millisec: 826
[ 23.048110] rb_produ-210 3.... 17640154us : ring_buffer_producer_thread: 1210 ns per entry
[ 23.048110] rb_produ-210 3.... 17640157us : ring_buffer_producer_thread: Sleeping for 10 secs
[ 23.048110] ---------------------------------
2021-04-26 gregkh@xxxxxxxxxxxxxxxxxxx - f9824ac 2021-04-26 Greg Kroah-Hartman Linux 5.4.115-rc1
Because the failure occurs during boot, syzkaller did not expose this bug.
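For anyone triaging a boot log like the one above by script, a minimal Python sketch that pulls the key fields out of the KASAN header lines follows. The regular expressions are written only against the two lines shown in this report and are an assumption about the format, not a complete parser; `summarize_kasan` and `SAMPLE` are hypothetical names.

```python
import re

# Patterns modeled on the two header lines of the report above; other
# KASAN report variants may need different expressions.
KASAN_BUG = re.compile(
    r"BUG: KASAN: (?P<kind>[\w-]+) in "
    r"(?P<func>[\w.]+)\+(?P<offset>0x[0-9a-f]+)/(?P<func_size>0x[0-9a-f]+)"
)
KASAN_ACCESS = re.compile(
    r"(?P<op>Read|Write) of size (?P<bytes>\d+) "
    r"at addr (?P<addr>[0-9a-f]+) by task (?P<task>\S+)"
)

SAMPLE = """\
[ 22.990233] BUG: KASAN: use-after-free in ibft_init+0x134/0xb8b
[ 22.990233] Read of size 4 at addr ffff8880be451004 by task swapper/0/1
"""


def summarize_kasan(log_text):
    """Return a dict describing the first KASAN report found in log_text."""
    summary = {}
    for line in log_text.splitlines():
        bug = KASAN_BUG.search(line)
        if bug and "kind" not in summary:
            summary.update(bug.groupdict())
        access = KASAN_ACCESS.search(line)
        if access and "op" not in summary:
            summary.update(access.groupdict())
    return summary
```

Calling `summarize_kasan(SAMPLE)` yields the bug kind, the faulting function and offset, and the access details as strings, which is enough to group repeated boot failures by function.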
George
thanks,
greg k-h