On Fri, Apr 5, 2019 at 9:21 PM Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx> wrote:
>
> Hi Dan,
>
> On Thu, 4 Apr 2019 at 21:21, Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
> >
> > UEFI 2.8 defines an EFI_MEMORY_SP attribute bit to augment the
> > interpretation of the EFI Memory Types as "reserved for a special
> > purpose".
> >
> > The proposed Linux behavior for special purpose memory is that it is
> > reserved for direct-access (device-dax) by default and not available for
> > any kernel usage, not even as an OOM fallback. Later, through udev
> > scripts or another init mechanism, these device-dax claimed ranges can
> > be reconfigured and hot-added to the available System-RAM with a unique
> > node identifier.
> >
> > A follow-on patch integrates parsing of the ACPI HMAT to identify the
> > node and sub-range boundaries of EFI_MEMORY_SP designated memory. For
> > now, arrange for EFI_MEMORY_SP memory to be reserved.
> >
> > Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> > Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> > Cc: Borislav Petkov <bp@xxxxxxxxx>
> > Cc: "H. Peter Anvin" <hpa@xxxxxxxxx>
> > Cc: Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx>
> > Cc: Darren Hart <dvhart@xxxxxxxxxxxxx>
> > Cc: Andy Shevchenko <andy@xxxxxxxxxxxxx>
> > Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>
> > ---
> >  arch/x86/Kconfig                  |   18 ++++++++++++++++++
> >  arch/x86/boot/compressed/eboot.c  |    5 ++++-
> >  arch/x86/boot/compressed/kaslr.c  |    2 +-
> >  arch/x86/include/asm/e820/types.h |    9 +++++++++
> >  arch/x86/kernel/e820.c            |    9 +++++++--
> >  arch/x86/platform/efi/efi.c       |   10 +++++++++-
> >  include/linux/efi.h               |   14 ++++++++++++++
> >  include/linux/ioport.h            |    1 +
> >  8 files changed, 63 insertions(+), 5 deletions(-)
> >
> > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> > index c1f9b3cf437c..cb9ca27de7a5 100644
> > --- a/arch/x86/Kconfig
> > +++ b/arch/x86/Kconfig
> > @@ -1961,6 +1961,24 @@ config EFI_MIXED
> >
> >  	  If unsure, say N.
> >
> > +config EFI_SPECIAL_MEMORY
> > +	bool "EFI Special Purpose Memory Support"
> > +	depends on EFI
> > +	---help---
> > +	  On systems that have mixed performance classes of memory EFI
> > +	  may indicate special purpose memory with an attribute (See
> > +	  EFI_MEMORY_SP in UEFI 2.8). A memory range tagged with this
> > +	  attribute may have unique performance characteristics compared
> > +	  to the system's general purpose "System RAM" pool. On the
> > +	  expectation that such memory has application specific usage
> > +	  answer Y to arrange for the kernel to reserve it for
> > +	  direct-access (device-dax) by default. The memory range can
> > +	  later be optionally assigned to the page allocator by system
> > +	  administrator policy. Say N to have the kernel treat this
> > +	  memory as general purpose by default.
> > +
> > +	  If unsure, say Y.
> > +
>
> EFI_MEMORY_SP is now part of the UEFI spec proper, so it does not make
> sense to make any understanding of it Kconfigurable.

No, I think you're misunderstanding what this Kconfig option is trying
to achieve. The configuration capability is solely for the default
kernel policy. As can already be seen from Christoph's response [1],
the thought that the firmware gets more leeway to dictate memory
policy to Linux may be objectionable.

[1]: https://lore.kernel.org/lkml/20190409121318.GA16955@xxxxxxxxxxxxx/

So the Kconfig option is gating whether the kernel simply ignores the
attribute and gives the memory to the page allocator by default.
Anything fancier, like sub-dividing how much is OS managed vs
device-dax accessed, requires the OS to reserve it all from the page
allocator by default until userspace policy can be applied.

> Instead, what I would prefer is to implement support for EFI_MEMORY_SP
> unconditionally (including the ability to identify it in the debug
> dump of the memory map etc), in a way that all architectures can use
> it.
> Then, I think we should never treat it as ordinary memory and make
> it the firmware's problem not to use the EFI_MEMORY_SP attribute in
> cases where it results in undesired behavior in the OS.

No, a policy of "never treat it as ordinary memory" confuses the base
intent of the attribute, which is an optional hint to get the OS to not
put immovable / non-critical allocations in what could be a precious
resource.

Moreover, the interface for platform firmware to indicate that a
memory range should never be treated as ordinary memory is simply the
existing "reserved" memory type, not this attribute. That's the
mechanism to use when platform firmware knows that a driver is needed
for a given mmio resource.

> Also, since there is a generic component and an x86 component, can you
> please split those up?

Sure, can do.

>
> You only cc'ed me on patch #1 this time, but could you please cc me on
> the entire series for v2? Thanks.

Yes, will do, and thanks for taking a look.