On 31.08.20 11:35, Pankaj Gupta wrote:
>> Some add_memory*() users add memory in small, contiguous memory blocks.
>> Examples include virtio-mem, hyper-v balloon, and the XEN balloon.
>>
>> This can quickly result in a lot of memory resources, whereby the actual
>> resource boundaries are not of interest (e.g., it might be relevant for
>> DIMMs, exposed via /proc/iomem to user space). We really want to merge
>> added resources in this scenario where possible.
>>
>> Let's provide an interface to trigger merging of applicable child
>> resources. It will be, for example, used by virtio-mem to trigger
>> merging of system ram resources it added to its resource container, but
>> also by XEN and Hyper-V to trigger merging of system ram resources in
>> iomem_resource.
>>
>> Note: We really want to merge after the whole operation succeeded, not
>> directly when adding a resource to the resource tree (it would break
>> add_memory_resource() and require splitting resources again when the
>> operation failed - e.g., due to -ENOMEM).
>>
>> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
>> Cc: Michal Hocko <mhocko@xxxxxxxx>
>> Cc: Dan Williams <dan.j.williams@xxxxxxxxx>
>> Cc: Jason Gunthorpe <jgg@xxxxxxxx>
>> Cc: Kees Cook <keescook@xxxxxxxxxxxx>
>> Cc: Ard Biesheuvel <ardb@xxxxxxxxxx>
>> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
>> Cc: "K. Y. Srinivasan" <kys@xxxxxxxxxxxxx>
>> Cc: Haiyang Zhang <haiyangz@xxxxxxxxxxxxx>
>> Cc: Stephen Hemminger <sthemmin@xxxxxxxxxxxxx>
>> Cc: Wei Liu <wei.liu@xxxxxxxxxx>
>> Cc: Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>
>> Cc: Juergen Gross <jgross@xxxxxxxx>
>> Cc: Stefano Stabellini <sstabellini@xxxxxxxxxx>
>> Cc: Roger Pau Monné <roger.pau@xxxxxxxxxx>
>> Cc: Julien Grall <julien@xxxxxxx>
>> Cc: Pankaj Gupta <pankaj.gupta.linux@xxxxxxxxx>
>> Cc: Baoquan He <bhe@xxxxxxxxxx>
>> Cc: Wei Yang <richardw.yang@xxxxxxxxxxxxxxx>
>> Signed-off-by: David Hildenbrand <david@xxxxxxxxxx>
>> ---
>>  include/linux/ioport.h |  3 +++
>>  kernel/resource.c      | 52 ++++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 55 insertions(+)
>>
>> diff --git a/include/linux/ioport.h b/include/linux/ioport.h
>> index 52a91f5fa1a36..3bb0020cd6ddc 100644
>> --- a/include/linux/ioport.h
>> +++ b/include/linux/ioport.h
>> @@ -251,6 +251,9 @@ extern void __release_region(struct resource *, resource_size_t,
>>  extern void release_mem_region_adjustable(struct resource *, resource_size_t,
>>                                            resource_size_t);
>>  #endif
>> +#ifdef CONFIG_MEMORY_HOTPLUG
>> +extern void merge_system_ram_resources(struct resource *res);
>> +#endif
>>
>>  /* Wrappers for managed devices */
>>  struct device;
>> diff --git a/kernel/resource.c b/kernel/resource.c
>> index 1dcef5d53d76e..b4e0963edadd2 100644
>> --- a/kernel/resource.c
>> +++ b/kernel/resource.c
>> @@ -1360,6 +1360,58 @@ void release_mem_region_adjustable(struct resource *parent,
>>  }
>>  #endif /* CONFIG_MEMORY_HOTREMOVE */
>>
>> +#ifdef CONFIG_MEMORY_HOTPLUG
>> +static bool system_ram_resources_mergeable(struct resource *r1,
>> +                                           struct resource *r2)
>> +{
>> +        return r1->flags == r2->flags && r1->end + 1 == r2->start &&
>> +               r1->name == r2->name && r1->desc == r2->desc &&
>> +               !r1->child && !r2->child;
>> +}
>> +
>> +/*
>> + * merge_system_ram_resources - try to merge contiguous system ram resources
>> + * @parent: parent resource descriptor
>> + *
>> + * This interface is intended for memory hotplug, whereby lots of contiguous
>> + * system ram resources are added (e.g., via add_memory*()) by a driver, and
>> + * the actual resource boundaries are not of interest (e.g., it might be
>> + * relevant for DIMMs). Only immediate child resources that are busy and
>> + * don't have any children are considered. All applicable child resources
>> + * must be immutable during the request.
>> + *
>> + * Note:
>> + * - The caller has to make sure that no pointers to resources that might
>> + *   get merged are held anymore. Callers should only trigger merging of child
>> + *   resources when they are the only one adding system ram resources to the
>> + *   parent (besides during boot).
>> + * - release_mem_region_adjustable() will split on demand on memory hotunplug
>> + */
>> +void merge_system_ram_resources(struct resource *parent)
>> +{
>> +        const unsigned long flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
>> +        struct resource *cur, *next;
>> +
>> +        write_lock(&resource_lock);
>> +
>> +        cur = parent->child;
>> +        while (cur && cur->sibling) {
>> +                next = cur->sibling;
>> +                if ((cur->flags & flags) == flags &&
>
> Maybe this can be changed to:
> !(cur->flags & ~flags)

That would be different I think.

(cur->flags & flags) == flags checks that all "flags" are set
(additional ones might be set).

!(cur->flags & ~flags) checks that no other flags besides "flags" are
set (and "flags" are not required to be set).

We use the same handling in find_next_iomem_res(), e.g., called via
walk_system_ram_range also with IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY.

Thanks for having a look!

-- 
Thanks,

David / dhildenb