Hi Thomas, On Wed, Dec 7, 2022 at 7:36 AM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote: > [...] > > Survey of the 11 architecture specific module_alloc(). They basically do > > the following magic: > > > > 1. Modify MODULES_VADDR and/or MODULES_END. There are multiple > > reasons behind this, some arch does this for KASLR, some other archs > > have different MODULES_[VADDR|END] for different processors (32b vs. > > 64b for example), some archs use some module address space for other > > things (i.e. _exiprom on arm). > > > > Archs need 1: x86, arm64, arm, mips, ppc, riscv, s390, loongarch, > > sparc > > All of this is pretty much a boot time init decision, right? Yeah, all of these are boot time or compile time decisions. > > > 2. Use kasan_alloc_module_shadow() > > > > Archs need 2: x86, arm64, s390 > > There is nothing really architecture specific, so that can be part of > the core code, right? Right, kasan_free_module_shadow() is called from vmalloc.c, so the alloc one can do the same. > > > 3. A secondary module address space. There is a smaller preferred > > address space for modules. Once the preferred space runs out, allocate > > memory from a secondary address space. [...] > > > 6. nios2 uses kmalloc() for modules. Based on the comment, this is > > probably only because it needs different MODULES_[VADDR|END]. > > It's a horrible hack because they decided to have their layout: > > VMALLOC_SPACE 0x80000000 > KERNEL_SPACE 0xC0000000 > > and they use kmalloc because CALL26/PCREL26 cannot reach from 0x80000000 > to 0xC0000000. That's true, but broken beyond repair. > > Making the layout: > > VMALLOC_SPACE 0x80000000 > MODULE_SPACE 0xBE000000 == 0xC0000000 - (1 << 24) (32M) > or > MODULE_SPACE 0xBF000000 == 0xC0000000 - (1 << 24) (16M) > KERNEL_SPACE 0xC0000000 > > would have been too obvious... Yeah, I was thinking about something like this. > > > I think we can handle all these with a single module_alloc() and a few > > module_arch_* functions(). [...] > > /** > * struct mod_alloc_type - Parameters for module allocation type > * @mapto_type: The type to merge this type into, if different > * from the actual type which is configured here. > * @flags: Properties > * @granularity: The allocation granularity (PTE/PMD) > * @alignment: The allocation alignment requirement > * @start: Array of address space range start (inclusive) > * @end: Array of address space range end (inclusive) > * @pgprot: The page protection for this type > * @fill: Function to fill allocated space. If NULL, use memcpy() > * @invalidate: Function to invalidate allocated space. If NULL, use memset() > * > * If @granularity > @alignment the allocation can reuse free space in > * previously allocated pages. If they are the same, then fresh pages > * have to be allocated. > */ > struct mod_alloc_type { > unsigned int mapto_type; > unsigned int flags; > unsigned int granularity; > unsigned int alignment; > unsigned long start[MOD_MAX_ADDR_SPACES]; > unsigned long end[MOD_MAX_ADDR_SPACES]; > pgprot_t pgprot; > void (*fill)(void *dst, void *src, unsigned int size); > void (*invalidate)(void *dst, unsigned int size); > }; Yeah, this is a lot better than arch_ functions. We probably want two more function pointers here: int (*protect)(unsigned long addr, int numpages); int (*unprotect)(unsigned long addr, int numpages); These two functions will be NULL for archs that support text_poke; while legacy archs use them for set_memory_[ro|x|rw|nx]. Then, I think we can get rid of VM_FLUSH_RESET_PERMS. [...] Everything else makes perfect sense. Thanks! I think I am ready to dive into the code and prepare the first RFC/PATCH. Please let me know if there is anything we should discuss/clarify before that. Best, Song