On 3/24/20 9:06 PM, Dan Williams wrote:
> On Tue, Mar 24, 2020 at 12:41 PM Joao Martins <joao.m.martins@xxxxxxxxxx> wrote:
>>
>> On 3/22/20 4:12 PM, Dan Williams wrote:
>>> The hmem enabling in commit 'cf8741ac57ed ("ACPI: NUMA: HMAT: Register
>>> "soft reserved" memory as an "hmem" device")' only registered ranges to
>>> the hmem driver for each soft-reservation that also appeared in the
>>> HMAT. While this is meant to encourage platform firmware to "do the
>>> right thing" and publish an HMAT, the corollary is that platforms that
>>> fail to publish an accurate HMAT will strand memory from Linux usage.
>>> Additionally, the "efi_fake_mem" kernel command line option enabling
>>> will strand memory by default without an HMAT.
>>>
>>> Arrange for "soft reserved" memory that goes unclaimed by HMAT entries
>>> to be published as raw resource ranges for the hmem driver to consume.
>>>
>>> Include a module parameter to disable either this fallback behavior, or
>>> the hmat enabling from creating hmem devices. The module parameter
>>> requires the hmem device enabling to have unique name in the module
>>> namespace: "device_hmem".
>>>
>>> Rather than mark this x86-only, include an interim phys_to_target_node()
>>> implementation for arm64.
>>>
>>> Cc: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
>>> Cc: Brice Goglin <Brice.Goglin@xxxxxxxx>
>>> Cc: Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx>
>>> Cc: "Rafael J. Wysocki" <rjw@xxxxxxxxxxxxx>
>>> Cc: Jeff Moyer <jmoyer@xxxxxxxxxx>
>>> Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
>>> Cc: Will Deacon <will@xxxxxxxxxx>
>>> Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>
>>> ---
>>>  arch/arm64/mm/numa.c      | 13 +++++++++++++
>>>  drivers/dax/Kconfig       |  1 +
>>>  drivers/dax/hmem/Makefile |  3 ++-
>>>  drivers/dax/hmem/device.c | 33 +++++++++++++++++++++++++++++++++
>>>  4 files changed, 49 insertions(+), 1 deletion(-)
>>>
>>
>> [...]
>>
>>> diff --git a/drivers/dax/hmem/device.c b/drivers/dax/hmem/device.c
>>> index 99bc15a8b031..f9c5fa8b1880 100644
>>> --- a/drivers/dax/hmem/device.c
>>> +++ b/drivers/dax/hmem/device.c
>>> @@ -4,6 +4,9 @@
>>>  #include <linux/module.h>
>>>  #include <linux/mm.h>
>>>
>>> +static bool nohmem;
>>> +module_param_named(disable, nohmem, bool, 0444);
>>> +
>>>  void hmem_register_device(int target_nid, struct resource *r)
>>>  {
>>>  	/* define a clean / non-busy resource for the platform device */
>>> @@ -16,6 +19,9 @@ void hmem_register_device(int target_nid, struct resource *r)
>>>  	struct memregion_info info;
>>>  	int rc, id;
>>>
>>> +	if (nohmem)
>>> +		return;
>>> +
>>>  	rc = region_intersects(res.start, resource_size(&res), IORESOURCE_MEM,
>>>  			IORES_DESC_SOFT_RESERVED);
>>>  	if (rc != REGION_INTERSECTS)
>>> @@ -62,3 +68,30 @@ void hmem_register_device(int target_nid, struct resource *r)
>>>  out_pdev:
>>>  	memregion_free(id);
>>>  }
>>> +
>>> +static __init int hmem_register_one(struct resource *res, void *data)
>>> +{
>>> +	/*
>>> +	 * If the resource is not a top-level resource it was already
>>> +	 * assigned to a device by the HMAT parsing.
>>> +	 */
>>> +	if (res->parent != &iomem_resource)
>>> +		return 0;
>>> +
>>> +	hmem_register_device(phys_to_target_node(res->start), res);
>>> +
>>> +	return 0;
>>
>> Should we add an error returning value to hmem_register_device() perhaps this
>> ought to be reflected in hmem_register_one().
>>
>>> +}
>>> +
>>> +static __init int hmem_init(void)
>>> +{
>>> +	walk_iomem_res_desc(IORES_DESC_SOFT_RESERVED,
>>> +			IORESOURCE_MEM, 0, -1, NULL, hmem_register_one);
>>> +	return 0;
>>> +}
>>> +
>>
>> (...) and then perhaps here returning in the initcall if any of the resources
>> failed hmem registration?
>
> Except that hmem_register_one() is a stop-gap to collect soft-reserved
> ranges that were not already registered, and it's not an error to find
> already registered devices.
>

/nods

And if we were to return an error (say for hmem0 out of 4 hmem ones) before
walking through all the soft-reserved resources found, it would skip
registration for the remaining ones.

	Joao
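
To make the point above concrete, here is a rough userspace sketch, not kernel code. It assumes the walker stops at the first nonzero callback return, which is how walk_iomem_res_desc() behaves. Everything named here (struct fake_range, walk_ranges(), register_range(), the two register_one_* callbacks, count_registered()) is a made-up stand-in for illustration only.

```c
#include <stddef.h>

/* Hypothetical stand-in for a soft-reserved range, not struct resource. */
struct fake_range {
	int id;
	int fail;	/* nonzero: pretend registration of this range fails */
	int registered;	/* set once registration succeeded */
};

typedef int (*walk_fn)(struct fake_range *r, void *arg);

/* Assumed walker contract: stop at the first nonzero callback return. */
int walk_ranges(struct fake_range *r, size_t n, void *arg, walk_fn fn)
{
	int ret = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		ret = fn(&r[i], arg);
		if (ret)
			break;
	}
	return ret;
}

/* Hypothetical registration helper that can fail. */
int register_range(struct fake_range *r)
{
	if (r->fail)
		return -1;
	r->registered = 1;
	return 0;
}

/* Variant A: propagate the error, which aborts the walk. */
int register_one_strict(struct fake_range *r, void *arg)
{
	(void)arg;
	return register_range(r);
}

/* Variant B: count the failure in *arg and keep walking. */
int register_one_tolerant(struct fake_range *r, void *arg)
{
	int *failures = arg;

	if (register_range(r))
		(*failures)++;
	return 0;
}

/* Count how many ranges ended up registered. */
int count_registered(const struct fake_range *r, size_t n)
{
	int count = 0;
	size_t i;

	for (i = 0; i < n; i++)
		count += r[i].registered;
	return count;
}
```

With four ranges where only the first one fails, the strict callback leaves all four unregistered, while the tolerant one still registers the other three and leaves a failure count that the caller could report after the walk.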