On Wed, 8 Jan 2020 at 22:53, Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
>
> On Tue, Jan 7, 2020 at 9:52 AM Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx> wrote:
> >
> > On Tue, 7 Jan 2020 at 06:19, Dave Young <dyoung@xxxxxxxxxx> wrote:
> > >
> > > On 01/06/20 at 08:16pm, Dan Williams wrote:
> > > > On Mon, Jan 6, 2020 at 8:04 PM Dave Young <dyoung@xxxxxxxxxx> wrote:
> > > > >
> > > > > On 01/06/20 at 04:40pm, Dan Williams wrote:
> > > > > > Dave noticed that when specifying multiple efi_fake_mem= entries only
> > > > > > the last entry was successfully being reflected in the efi memory map.
> > > > > > This is due to the fact that the efi_memmap_insert() is being called
> > > > > > multiple times, but on successive invocations the insertion should be
> > > > > > applied to the last new memmap rather than the original map at
> > > > > > efi_fake_memmap() entry.
> > > > > >
> > > > > > Rework efi_fake_memmap() to install the new memory map after each
> > > > > > efi_fake_mem= entry is parsed.
> > > > > >
> > > > > > This also fixes an issue in efi_fake_memmap() that caused it to litter
> > > > > > empty entries into the end of the efi memory map. An empty entry causes
> > > > > > efi_memmap_insert() to attempt more memmap splits / copies than
> > > > > > efi_memmap_split_count() accounted for when sizing the new map. When
> > > > > > that happens efi_memmap_insert() may overrun its allocation, and if you
> > > > > > are lucky will spill over to an unmapped page leading to crash
> > > > > > signature like the following rather than silent corruption:
> > > > > >
> > > > > > BUG: unable to handle page fault for address: ffffffffff281000
> > > > > > [..]
> > > > > > RIP: 0010:efi_memmap_insert+0x11d/0x191
> > > > > > [..]
> > > > > > Call Trace:
> > > > > > ? bgrt_init+0xbe/0xbe
> > > > > > ? efi_arch_mem_reserve+0x1cb/0x228
> > > > > > ? acpi_parse_bgrt+0xa/0xd
> > > > > > ? acpi_table_parse+0x86/0xb8
> > > > > > ? acpi_boot_init+0x494/0x4e3
> > > > > > ? acpi_parse_x2apic+0x87/0x87
> > > > > > ? setup_acpi_sci+0xa2/0xa2
> > > > > > ? setup_arch+0x8db/0x9e1
> > > > > > ? start_kernel+0x6a/0x547
> > > > > > ? secondary_startup_64+0xb6/0xc0
> > > > > >
> > > > > > Commit af1648984828 "x86/efi: Update e820 with reserved EFI boot
> > > > > > services data to fix kexec breakage" is listed in Fixes: since it
> > > > > > introduces more occurrences where efi_memmap_insert() is invoked after
> > > > > > an efi_fake_mem= configuration has been parsed. Previously the side
> > > > > > effects of vestigial empty entries were benign, but with commit
> > > > > > af1648984828 that follow-on efi_memmap_insert() invocation triggers
> > > > > > efi_memmap_insert() overruns.
> > > > > >
> > > > > > Fixes: 0f96a99dab36 ("efi: Add 'efi_fake_mem' boot option")
> > > > > > Fixes: af1648984828 ("x86/efi: Update e820 with reserved EFI boot services...")
> > > > >
> > > > > A nitpick for the Fixes flags, as I replied in the thread below:
> > > > > https://lore.kernel.org/linux-efi/CAPcyv4jLxqPaB22Ao9oV31Gm=b0+Phty+Uz33Snex4QchOUb0Q@xxxxxxxxxxxxxx/T/#m2bb2dd00f7715c9c19ccc48efef0fcd5fdb626e7
> > > > >
> > > > > I reproduced two other panics without the patches applied, so this issue
> > > > > is not caused by either of the commits, maybe just drop the Fixes.
> > > >
> > > > Just the "Fixes: af1648984828", right? No objection from me. I'll let
> > > > Ingo say if he needs a resend for that.
> > > >
> > > > The "Fixes: 0f96a99dab36" is valid because the original implementation
> > > > failed to handle the multiple argument case from the beginning.
> > >
> > > Agreed, thanks!
> > >
> >
> > I'll queue this but without the fixes tags. The -stable maintainers
> > are far too trigger happy IMHO, and this really needs careful review
> > before being backported. efi_fake_mem is a debug feature anyway, so I
> > don't see an urgent need to get this fixed retroactively in older
> > kernels.
>
> I'm fine to drop the fixes tags.
>
> However, I do want to point out my driving motive for digging in on
> efi_fake_mem=nn@ss:0x40000, is that it is a better interface for
> diverting memory ranges to device-dax than the current standard bearer
> memmap=nn!ss. The main benefit is that the kernel only considers the
> attribute when it is applied to EFI_CONVENTIONAL_MEMORY. This fixes a
> long standing unsolvable issue of people picking busted memmap=nn!ss
> settings that clobber platform memory ranges, or vector off into
> nothing.
>
> So yes, efi_fake_mem is a debug feature, but if the popularity of
> memmap=nn!ss is any clue I expect efi_fake_mem=nn@ss:0x40000 will be a
> useful tool going forward for late enabling, or repairing platform
> "soft reservation" declarations.
>

OK, good to know.

> I'll respin the series with those tags dropped and add the comment you
> recommended about the cases when efi_memmap_free() is expected to be a
> nop. Holler if there's anything else, but that's all I had on my list
> to fix up.

If it's just for the comment, I can just slap that on, as I already
queued the patches with the fixes tags dropped. Or respin, whichever
you prefer (efi/next branch is not stable anyway)
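
For anyone reading along in the archive, the multiple-entry problem from the
quoted commit message can be sketched in plain user-space C. This is only an
illustration under simplifying assumptions, not the kernel code: struct range,
struct map, map_insert(), apply_buggy() and apply_fixed() are hypothetical
names made up for this sketch, and the fixed-size array stands in for the
kernel's dynamically sized memory map. ATTR_SP is just the 0x40000 attribute
value mentioned above.

```c
#include <stddef.h>

#define MAX_ENTRIES 16
#define ATTR_SP 0x40000ULL /* attribute value used in the thread */

/* Hypothetical, simplified stand-in for a memory map entry (inclusive bounds). */
struct range {
	unsigned long long start, end, attr;
};

struct map {
	struct range e[MAX_ENTRIES];
	size_t n;
};

static void push(struct map *m, unsigned long long s,
		 unsigned long long e, unsigned long long a)
{
	if (m->n < MAX_ENTRIES) {
		m->e[m->n].start = s;
		m->e[m->n].end = e;
		m->e[m->n].attr = a;
		m->n++;
	}
}

/* Build 'out' from 'old', splitting entries that overlap 'mod' and
 * OR-ing mod->attr into the overlapping piece. */
void map_insert(const struct map *old, struct map *out, const struct range *mod)
{
	out->n = 0;
	for (size_t i = 0; i < old->n; i++) {
		const struct range *r = &old->e[i];

		if (mod->end < r->start || mod->start > r->end) {
			push(out, r->start, r->end, r->attr); /* no overlap */
			continue;
		}
		unsigned long long lo = mod->start > r->start ? mod->start : r->start;
		unsigned long long hi = mod->end < r->end ? mod->end : r->end;

		if (r->start < lo)
			push(out, r->start, lo - 1, r->attr);
		push(out, lo, hi, r->attr | mod->attr);
		if (hi < r->end)
			push(out, hi + 1, r->end, r->attr);
	}
}

static size_t count_marked(const struct map *m)
{
	size_t c = 0;

	for (size_t i = 0; i < m->n; i++)
		if (m->e[i].attr & ATTR_SP)
			c++;
	return c;
}

/* Every insertion starts from the original map, so each pass
 * discards the result of the previous one. */
size_t apply_buggy(const struct map *orig, const struct range *mods, size_t nmods)
{
	struct map out = { .n = 0 };

	for (size_t i = 0; i < nmods; i++)
		map_insert(orig, &out, &mods[i]);
	return count_marked(&out);
}

/* Each insertion is applied to the map produced by the previous one. */
size_t apply_fixed(const struct map *orig, const struct range *mods, size_t nmods)
{
	struct map cur = *orig, next;

	for (size_t i = 0; i < nmods; i++) {
		map_insert(&cur, &next, &mods[i]);
		cur = next; /* "install" the new map before the next entry */
	}
	return count_marked(&cur);
}
```

With a single original entry covering 0x0-0xffff and two requested ranges
(0x1000-0x1fff and 0x3000-0x3fff), apply_buggy() ends up with only the last
range marked, while apply_fixed() keeps both, which is the behaviour the
rework of efi_fake_memmap() is described as restoring.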