On 28/12/2021 at 17:14, Dave Hansen wrote:
> On 12/28/21 2:26 AM, Kefeng Wang wrote:
>>>> There are some disadvantages about this feature[2], one of the main
>>>> concerns is the possible memory fragmentation/waste in some scenarios,
>>>> also archs must ensure that any arch specific vmalloc allocations that
>>>> require PAGE_SIZE mappings(eg, module alloc with STRICT_MODULE_RWX)
>>>> use the VM_NO_HUGE_VMAP flag to inhibit larger mappings.
>>> That just says that x86 *needs* PAGE_SIZE allocations.  But, what
>>> happens if VM_NO_HUGE_VMAP is not passed (like it was in v1)?  Will the
>>> subsequent permission changes just fragment the 2M mapping?
>>
>> Yes, without VM_NO_HUGE_VMAP, it could fragment the 2M mapping.
>>
>> When module alloc with STRICT_MODULE_RWX on x86, it calls
>> __change_page_attr() from set_memory_ro/rw/nx which will split large
>> page, so there is no need to make module alloc with HUGE_VMALLOC.
>
> This all sounds very fragile to me.  Every time a new architecture would
> get added for huge vmalloc() support, the developer needs to know to go
> find that architecture's module_alloc() and add this flag.  The next
> guy is going to forget, just like you did.

That's not correct from my point of view.

When powerpc added that, a clear comment explained why:

+	/*
+	 * Don't do huge page allocations for modules yet until more testing
+	 * is done. STRICT_MODULE_RWX may require extra work to support this
+	 * too.
+	 */

So as you can see, this is something specific to powerpc and temporary.

>
> Considering that this is not a hot path, a weak function would be a nice
> choice:
>
> /* vmalloc() flags used for all module allocations. */
> unsigned long __weak arch_module_vm_flags()
> {
> 	/*
> 	 * Modules use a single, large vmalloc().  Different
> 	 * permissions are applied later and will fragment
> 	 * huge mappings.  Avoid using huge pages for modules.
> 	 */

Why? Not everybody uses STRICT_MODULE_RWX.
Even if you do so, you can still benefit from huge pages for modules.

Why make what was initially a temporary precaution for powerpc become a
definitive default limitation for all?

> 	return VM_NO_HUGE_VMAP;
> }
>
> Stick that in some of the common module code, next to:
>
>> void * __weak module_alloc(unsigned long size)
>> {
>>         return __vmalloc_node_range(size, 1, VMALLOC_START, VMALLOC_END,
> ...
>
> Then, put arch_module_vm_flags() in *all* of the module_alloc()
> implementations, including the generic one.  That way (even with a new
> architecture) whoever copies-and-pastes their module_alloc()
> implementation is likely to get it right.  The next guy who just does a
> "select HAVE_ARCH_HUGE_VMALLOC" will hopefully just work.
>
> VM_FLUSH_RESET_PERMS could probably be dealt with in the same way.