On Thu, 10 Oct 2019 at 20:31, Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
>
> On Wed, Oct 9, 2019 at 11:45 PM Ard Biesheuvel
> <ard.biesheuvel@xxxxxxxxxx> wrote:
> >
> > On Thu, 10 Oct 2019 at 01:19, Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
> > >
> > > UEFI 2.8 defines an EFI_MEMORY_SP attribute bit to augment the
> > > interpretation of the EFI Memory Types as "reserved for a specific
> > > purpose".
> > >
> > > The proposed Linux behavior for specific purpose memory is that it is
> > > reserved for direct-access (device-dax) by default and not available for
> > > any kernel usage, not even as an OOM fallback. Later, through udev
> > > scripts or another init mechanism, these device-dax claimed ranges can
> > > be reconfigured and hot-added to the available System-RAM with a unique
> > > node identifier. This device-dax management scheme implements "soft" in
> > > the "soft reserved" designation by allowing some or all of the
> > > reservation to be recovered as typical memory. This policy can be
> > > disabled at compile-time with CONFIG_EFI_SOFT_RESERVE=n, or runtime with
> > > efi=nosoftreserve.
> > >
> > > This patch introduces 2 new concepts at once given the entanglement
> > > between early boot enumeration relative to memory that can optionally be
> > > reserved from the kernel page allocator by default. The new concepts
> > > are:
> > >
> > > - E820_TYPE_SOFT_RESERVED: Upon detecting the EFI_MEMORY_SP
> > >   attribute on EFI_CONVENTIONAL memory, update the E820 map with this
> > >   new type. Only perform this classification if the
> > >   CONFIG_EFI_SOFT_RESERVE=y policy is enabled, otherwise treat it as
> > >   typical ram.
> > >
> > > - IORES_DESC_SOFT_RESERVED: Add a new I/O resource descriptor for
> > >   a device driver to search iomem resources for application specific
> > >   memory. Teach the iomem code to identify such ranges as "Soft Reserved".
> > >
> > > A follow-on change integrates parsing of the ACPI HMAT to identify the
> > > node and sub-range boundaries of EFI_MEMORY_SP designated memory. For
> > > now, just identify and reserve memory of this type.
> > >
> > > Cc: <x86@xxxxxxxxxx>
> > > Cc: Borislav Petkov <bp@xxxxxxxxx>
> > > Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> > > Cc: "H. Peter Anvin" <hpa@xxxxxxxxx>
> > > Cc: Darren Hart <dvhart@xxxxxxxxxxxxx>
> > > Cc: Andy Shevchenko <andy@xxxxxxxxxxxxx>
> > > Cc: Andy Lutomirski <luto@xxxxxxxxxx>
> > > Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> > > Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> > > Cc: Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx>
> > > Reported-by: kbuild test robot <lkp@xxxxxxxxx>
> > > Reviewed-by: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
> > > Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>
> >
> > For the EFI changes
> >
> > Acked-by: Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx>
> >
> > although I must admit I don't follow the enum add_efi_mode logic 100%
>
> I'm open to suggestions as I'm not sure it's the best possible
> organization. The do_add_efi_memmap() routine has the logic to
> translate EFI to E820, but unless "add_efi_memmap" is specified on the
> kernel command line the EFI memory map is ignored. For
> soft-reservation support I want to reuse do_add_efi_memmap(), but
> otherwise avoid any other side effects of considering the EFI map.
> What I'm missing is the rationale for why "add_efi_memmap" is required
> before considering the EFI memory map.
>
> If there is a negative side effect to always using the EFI map then
> the new "add_efi_mode" designation constrains it to just the
> soft-reservation case.
>

Could we make the presence of any EFI_MEMORY_SP regions imply
add_efi_memmap? That way, it is guaranteed that we don't regress
existing systems, while establishing clear and unambiguous semantics
for new systems that rely on these changes in order to be able to use
the special purpose memory as intended.
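
To make the suggestion concrete, here is a rough, standalone sketch of the decision I have in mind. It is not taken from the patch: struct mem_desc, has_soft_reserved() and should_add_efi_memmap() are hypothetical names for illustration only, and the two boolean parameters stand in for the "add_efi_memmap" command line option and the CONFIG_EFI_SOFT_RESERVE / efi=nosoftreserve policy. Only the EFI_CONVENTIONAL_MEMORY and EFI_MEMORY_SP values come from the UEFI spec.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Values as defined by the UEFI specification. */
#define EFI_CONVENTIONAL_MEMORY 7
#define EFI_MEMORY_SP           0x0000000000040000ULL

/* Hypothetical, simplified stand-in for an EFI memory descriptor. */
struct mem_desc {
	uint32_t type;
	uint64_t phys_addr;
	uint64_t num_pages;
	uint64_t attribute;
};

/* True if any conventional-memory entry carries EFI_MEMORY_SP. */
static bool has_soft_reserved(const struct mem_desc *map, size_t n)
{
	assert(map != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (map[i].type == EFI_CONVENTIONAL_MEMORY &&
		    (map[i].attribute & EFI_MEMORY_SP))
			return true;
	}
	return false;
}

/*
 * Hypothetical policy: consult the EFI memory map if the user asked
 * for it explicitly, or if soft reservation is enabled and the
 * firmware actually describes specific purpose memory.
 */
static bool should_add_efi_memmap(bool cmdline_add_efi_memmap,
				  bool soft_reserve_enabled,
				  const struct mem_desc *map, size_t n)
{
	return cmdline_add_efi_memmap ||
	       (soft_reserve_enabled && has_soft_reserved(map, n));
}
```

With something along these lines, a system whose firmware reports no EFI_MEMORY_SP ranges takes exactly the same path as today unless "add_efi_memmap" is passed, which is the no-regression property I am after.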