On Thu 30-03-23 08:13:48, Mike Rapoport wrote:
> On Wed, Mar 29, 2023 at 10:13:23AM +0200, Michal Hocko wrote:
> > On Wed 29-03-23 10:28:02, Mike Rapoport wrote:
> > > On Tue, Mar 28, 2023 at 05:24:49PM +0200, Michal Hocko wrote:
> > > > On Tue 28-03-23 18:11:20, Mike Rapoport wrote:
> > > > > On Tue, Mar 28, 2023 at 09:39:37AM +0200, Michal Hocko wrote:
> > > > [...]
> > > > > > OK, so you want to reduce that direct map fragmentation?
> > > > >
> > > > > Yes.
> > > > >
> > > > > > Is that a real problem?
> > > > >
> > > > > A while ago Intel folks published a report [1] that showed better performance
> > > > > with large pages in the direct map for the majority of benchmarks.
> > > > >
> > > > > > My impression is that modules are mostly a static thing. BPF
> > > > > > might be a different thing though. I have a recollection that BPF guys
> > > > > > were dealing with direct map fragmentation as well.
> > > > >
> > > > > Modules are indeed static, but module_alloc() is used by anything that
> > > > > allocates code pages, e.g. kprobes, ftrace and BPF. Besides, Thomas
> > > > > mentioned that having code in 2M pages reduces iTLB pressure [2], but
> > > > > that's not only about avoiding the splits in the direct map but also about
> > > > > using large mappings in the modules address space.
> > > > >
> > > > > BPF guys suggested an allocator for executable memory [3] mainly because
> > > > > they've seen a performance improvement of 0.6% - 0.9% in their setups [4].
> > > >
> > > > These are fair arguments and it would have been better to have them in
> > > > the RFC. Also it is not really clear to me what the actual benefit of
> > > > the unmapping is for those usecases. I do get they want to benefit from
> > > > caching on the same permission setup but do they need unmapping as well?
> > >
> > > The pages allocated with module_alloc() get different permissions depending
> > > on whether they belong to text, rodata, data etc.
> > > The permissions are
> > > updated in both vmalloc address space and in the direct map. The updates to
> > > the direct map cause splits of the large pages.
> >
> > That much is clear (wouldn't hurt to mention that in the changelog
> > though).
> >
> > > If we cache large pages as
> > > unmapped we take out the entire 2M page from the direct map and then
> > > if/when it becomes free it can be returned to the direct map as a 2M page.
> > >
> > > Generally, the unmapped allocations are intended for use-cases that anyway
> > > map the memory elsewhere than the direct map and need to modify direct mappings
> > > of the memory, be it module_alloc(), secretmem, PKS page tables or maybe
> > > even some of the encrypted VM memory.
> >
> > I believe we are still not on the same page here. I do understand that
> > you want to re-use the caching capability of the unmapped_pages_alloc
> > for modules allocations as well. My question is whether module_alloc
> > really benefits from the fact that the memory is unmapped?
> >
> > I guess you want to say that it does because it wouldn't have to change
> > the permission for the direct map but I do not see that anywhere in the
> > patch...
>
> This happens automagically outside the patch :)
>
> Currently, to change memory permissions modules code calls set_memory APIs
> and passes vmalloced addresses to them. set_memory functions look up the
> direct map alias and update the permissions there as well.
> If the memory allocated with module_alloc() is unmapped in the direct map,
> there won't be an alias for set_memory APIs to update.
>
> > Also consider the slow path where module_alloc needs to
> > allocate a fresh (huge)page and unmap it. Does the overhead of the
> > unmapping suit the needs of all module_alloc users? Module loader is likely
> > not interesting as it tends to be rather static but what about BPF
> > programs?
>
> The overhead of unmapping pages in the direct map on the allocation path will
> be offset by reduced overhead of updating permissions in the direct map
> after the allocation. Both are using the same APIs and if today the
> permission update causes a split of a large page, unmapping of a large page
> won't.
>
> Of course in a loaded system unmapped_alloc() won't be able to always
> allocate large pages to replenish the cache, but still there will be fewer
> updates to the direct map.

OK, all of that is changelog material. I would recommend going this way:
start with the simple unmapped_pages_alloc interface and use it for secret
memory. There shouldn't be anything controversial there. In a follow-up
patch, add support for vmalloc, which would add a new gfp flag, with the
justification that this is the simplest way to support the module_alloc
use case. Do not be shy about providing as much context as you can, ideally
with some numbers for the best/worst/avg cases.

Thanks
-- 
Michal Hocko
SUSE Labs
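
For reference, the split argument quoted above can be sketched with a toy
model. This is not kernel code and does not use any kernel API: the
DirectMap class and its methods are hypothetical names invented for the
illustration, and the only fact taken from real hardware is that a 2M page
covers 512 4K pages on x86-64. It only counts how often a large mapping
would have to be split.

```python
# Toy model only, not kernel code. All names here are hypothetical.
PAGES_PER_LARGE = 512  # number of 4K pages covered by one 2M mapping


class DirectMap:
    def __init__(self, n_large):
        # Every 2M region starts out covered by a single large mapping.
        self.state = ["large"] * n_large
        self.splits = 0

    def set_permissions(self, first_page, n_pages):
        """Model a permission update reaching the direct map alias."""
        if n_pages <= 0:
            return
        last_page = first_page + n_pages - 1
        first_region = first_page // PAGES_PER_LARGE
        last_region = last_page // PAGES_PER_LARGE
        for region in range(first_region, last_region + 1):
            start = max(first_page, region * PAGES_PER_LARGE)
            end = min(last_page, (region + 1) * PAGES_PER_LARGE - 1)
            covers_whole = (end - start + 1) == PAGES_PER_LARGE
            # A partial update of a large mapping forces a split.
            # An unmapped region has no alias, so nothing happens.
            if self.state[region] == "large" and not covers_whole:
                self.state[region] = "split"
                self.splits += 1

    def unmap_large(self, region):
        """Take a whole 2M region out of the direct map (no split)."""
        if self.state[region] != "large":
            raise ValueError("region is not a large mapping")
        self.state[region] = "unmapped"

    def remap_large(self, region):
        """Return a whole 2M region to the direct map as a large page."""
        if self.state[region] != "unmapped":
            raise ValueError("region is not unmapped")
        self.state[region] = "large"


# Today: permission updates on small allocations hit the alias directly.
today = DirectMap(4)
today.set_permissions(first_page=10, n_pages=3)
today.set_permissions(first_page=600, n_pages=1)

# With an unmapped cache: the 2M region is removed first, so later
# permission updates find no alias and cause no split.
cached = DirectMap(4)
cached.unmap_large(0)
cached.set_permissions(first_page=10, n_pages=3)
cached.set_permissions(first_page=20, n_pages=1)
cached.remap_large(0)

print(today.splits, today.state)
print(cached.splits, cached.state)
```

The model ignores the cost of the unmapping itself and the case where no
free 2M page is available to replenish the cache, which is exactly where
real numbers for the best/worst/avg cases would be needed.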