> On May 19, 2022, at 9:56 AM, Edgecombe, Rick P <rick.p.edgecombe@xxxxxxxxx> wrote: > > On Thu, 2022-05-19 at 06:42 +0000, Song Liu wrote: >> Thinking more on this. Even huge page is not supported, we can >> allocate >> 2MB worth of 4kB pages and keep using it. This would help direct map >> fragmentation. And the code would also be simpler. >> >> Rick, I guess this is inline with some of your ideas? > > Yea, that is what I wondering. Potential benefits are just speculative > though. There is a memory overhead cost, so it's not free. Yeah, I had the same concern with memory overhead. The benefit should not exceed 0.2% (when 2MB page is supported), so I think we can use 4kB pages for now. > > As for the other question of whether to fix VM_FLUSH_RESET_PERMS. If > there really is an intention to create a more general module_alloc() > replacement soon, then I think it is ok to side step it. An optimal > replacement might not need it and it could be removed in that case. > Let's at least add a WARN about it not working with huge pages though. IIUC, it will take some effort to let kernel modules use a solution like this. But there are some low hanging fruits, like ftrace and BPF trampolines. Maybe that's enough to promote a more general solution. For the WARN, I guess we need something like this? diff --git i/include/linux/vmalloc.h w/include/linux/vmalloc.h index b159c2789961..5e0d0a60d9d5 100644 --- i/include/linux/vmalloc.h +++ w/include/linux/vmalloc.h @@ -238,6 +238,7 @@ static inline void set_vm_flush_reset_perms(void *addr) { struct vm_struct *vm = find_vm_area(addr); + WARN_ON_ONCE(is_vm_area_hugepages(addr)); if (vm) vm->flags |= VM_FLUSH_RESET_PERMS; } Thanks, Song > > I also think the benchmarking so far is not sufficient to make the case > that huge page mappings help your workload since the direct map splits > were also different between the tests. I was expecting it to help > though. Others were the ones that asked for that, so just commenting my > analysis here.