On Mon, Oct 14, 2024 at 10:32:36PM +0200, David Hildenbrand wrote:
> On 14.10.24 16:25, Gregory Price wrote:
> > On Mon, Oct 14, 2024 at 01:54:27PM +0200, David Hildenbrand wrote:
> > > On 08.10.24 17:21, Gregory Price wrote:
> > > > On Tue, Oct 08, 2024 at 05:02:33PM +0200, David Hildenbrand wrote:
> > > > > On 08.10.24 16:51, Gregory Price wrote:
> > > > > > > > +int __weak set_memory_block_size_order(unsigned int order)
> > > > > > > > +{
> > > > > > > > +	return -ENODEV;
> > > > > > > > +}
> > > > > > > > +EXPORT_SYMBOL_GPL(set_memory_block_size_order);
> > > > > > >
> > > > > > > I can understand what you are trying to achieve, but letting arbitrary
> > > > > > > modules mess with this sounds like a bad idea.
> > > > > > >
> > > > > >
> > > > > > I suppose the alternative is trying to scan the CEDT from inside each
> > > > > > machine, rather than the ACPI driver? Seems less maintainable.
> > > > > >
> > > > > > I don't entirely disagree with your comment. I hummed and hawwed over
> > > > > > externing this - hence the warning in the x86 machine.
> > > > > >
> > > > > > Open to better answers.
> > > > >
> > > > > Maybe an interface to add more restrictions on the maximum size might be
> > > > > better (instead of setting the size/order, you would impose another upper
> > > > > limit).
> > > >
> > > > That is effectively what set_memory_block_size_order is, though. Once
> > > > blocks are exposed to the allocators, its no longer safe to change the
> > > > size (in part because it was built assuming it wouldn't change, but I
> > > > imagine there are other dragons waiting in the shadows to bite me).
> > >
> > > Yes, we must run very early.
> > >
> > > How is this supposed to interact with code like
> > >
> > > set_block_size()
> > >
> > > that also calls set_memory_block_size_order() on UV systems (assuming there
> > > will be CXL support sooner or later?)?
> > >
> > >
> >
> > Tying the other email to this one - just clarifying the way forward here.
> >
> > It sounds like you're saying at a minimum drop EXPORT tags to prevent
> > modules from calling it - but it also sounds like built-ins need to be
> > prevented from touching it as well after a certain point in early boot.
>
> Right, at least the EXPORT is not required.
>
> >
> > Do you think I should go down the advise() path as suggested by Ira,
> > just adding a arch_lock_blocksize() bit and have set_..._order check it,
> > or should we just move towards each architecture having to go through
> > the ACPI:CEDT itself?
>
> Let's summarize what we currently have on x86 is:
>
> 1) probe_memory_block_size()
>
> Triggered on first memory_block_size_bytes() invocation. Makes a decision
> based on:
>
> a) Already set size using set_memory_block_size_order()
> b) RAM size
> c) Bare metal vs. virt (bare metal -> use max)
> d) Virt: largest block size aligned to memory end
>
>
> 2) set_memory_block_size_order()
>
> Triggered by set_block_size() on UV systems.
>
>
> I don't think set_memory_block_size_order() is the right tool to use. We
> just want to leave that alone I think -- it's a direct translation of a
> kernel cmdline parameter that should win.
>
> You essentially want to tweak the b)->d) logic to take other alignment into
> consideration.
>
> Maybe have some simple callback mechanism probe_memory_block_size() that can
> consult other sources for alignment requirements?
>

Thanks for this - I'll cobble something together. Probably this ends up
falling out similar to what Ira suggested.

drivers/acpi/numa/srat.c
    acpi_numa_init():
        order = parse_cfwm(...)
        memblock_advise_size(order);

drivers/base/memory.c
    static int memblock_size_order = 0; /* let arch choose */

    int memblock_advise_size(order)
        int old_order;
        int new_order;
        if (order <= 0)
            return -EINVAL;

        do {
            old_order = memblock_size_order;
            new_order = MIN(old_order, order);
        } while (!atomic_cmpxchg(&memblock_size_order, old_order, new_order));

        /* memblock_size_order is now <= order, if -1 then the probe won */
        return new_order;

    int memblock_probe_size()
        return atomic_xchg(&memblock_size_order, -1);

drivers/base/memblock.h
    #ifdef HOTPLUG
    export memblock_advise_size()
    export memblock_probe_size()
    #else
    static memblock_advise_size() { return -ENODEV; } /* always fail */
    static memblock_probe_size() { return 0; } /* arch chooses */
    #endif

arch/*/mm/...
    probe_block_size():
        memblock_probe_size();
        /* select minimum across above suggested values */

> If that's not an option, then another way to set further min-alignment
> requirements (whereby we take MIN(old_align, new_align))?
>
> --
> Cheers,
>
> David / dhildenb
>
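
A minimal userspace model of the advise/probe handshake sketched above could
look like the following. It is only an illustration under stated assumptions,
not the proposed kernel code: it uses C11 <stdatomic.h> instead of the kernel's
atomic_t helpers, it treats 0 as "no advice recorded yet" so that the first
advice is taken as-is rather than being clamped to 0, and it returns -EBUSY
once the probe has run (the sketch above only says the probe "won").

```c
#include <errno.h>
#include <stdatomic.h>

/*
 * Hypothetical model of the shared state (names mirror the sketch above):
 *    0 : no advice recorded yet, the arch chooses
 *   >0 : smallest block size order advised so far
 *   -1 : the arch probe has run, later advice is rejected
 */
static atomic_int memblock_size_order = 0;

/*
 * Record an upper bound on the block size order.
 * Returns the bound now in effect, or a negative errno.
 */
int memblock_advise_size(int order)
{
    int old_order, new_order;

    if (order <= 0)
        return -EINVAL;

    old_order = atomic_load(&memblock_size_order);
    do {
        if (old_order < 0)
            return -EBUSY; /* assumption: the probe already consumed the value */
        /* 0 means "unset", so the first advice wins outright */
        new_order = (old_order == 0 || order < old_order) ? order : old_order;
        /* on failure, old_order is reloaded with the current value */
    } while (!atomic_compare_exchange_weak(&memblock_size_order,
                                           &old_order, new_order));

    return new_order;
}

/*
 * Consume the advice and lock out later advisers.
 * Returns 0 if nothing was advised, the advised minimum otherwise,
 * or -1 if the probe had already run.
 */
int memblock_probe_size(void)
{
    return atomic_exchange(&memblock_size_order, -1);
}
```

In this model an arch-side probe would call memblock_probe_size() once and, if
the result is positive, take the minimum of that order and whatever its own
b)->d) logic selects.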