On Thu, May 18, 2023 at 8:24 AM Mike Rapoport <rppt@xxxxxxxxxx> wrote:
>
> On Wed, May 17, 2023 at 11:35:56PM -0400, Kent Overstreet wrote:
> > On Wed, Mar 08, 2023 at 11:41:02AM +0200, Mike Rapoport wrote:
> > > From: "Mike Rapoport (IBM)" <rppt@xxxxxxxxxx>
> > >
> > > When the set_memory or set_direct_map APIs are used to change attributes
> > > or permissions for chunks of several pages, the large PMD that maps these
> > > pages in the direct map must be split. Fragmenting the direct map in such
> > > a manner causes TLB pressure and, eventually, performance degradation.
> > >
> > > To avoid excessive direct map fragmentation, add the ability to allocate
> > > "unmapped" pages with a __GFP_UNMAPPED flag that causes removal of the
> > > allocated pages from the direct map, and use a cache of the unmapped pages.
> > >
> > > This cache is replenished with higher order pages, with a preference for
> > > PMD_SIZE pages when possible, so that there will be fewer splits of large
> > > pages in the direct map.
> > >
> > > The cache is implemented as a buddy allocator, so it can serve high order
> > > allocations of unmapped pages.
> >
> > So I'm late to this discussion; I stumbled in because of my own run-in
> > with executable memory allocation.
> >
> > I understand that post-LSF this patchset seems to not be going anywhere,
> > but OTOH there's also been a desire for better executable memory
> > allocation; as noted by tglx and elsewhere, there _is_ a definite
> > performance impact from page size on kernel text - I've seen numbers in
> > the multiple single-digit percentage range in the past.
> >
> > This patchset does seem to me to be roughly the right approach for that,
> > and coupled with the slab allocator for sub-page sized allocations it
> > seems there's the potential for a nice interface that spans the
> > full range of allocation sizes, from small bpf/trampoline allocations up
> > to modules.
> >
> > Is this patchset worth reviving/continuing with? Was it really just the
> > needed module refactoring that was the blocker?
>
> As I see it, this patchset is only one building block out of three? four?
> If we are to repurpose it for code allocations, it should be something like:
>
> 1) allocate a 2M page to fill the cache
> 2) remove this page from the direct map
> 3) map the 2M page ROX in module address space (usually some part of
>    the vmalloc address space)
> 4) allocate a smaller chunk of that page to the actual caller (bpf,
>    modules, whatever)
>
> Right now (3) and (4) won't work for modules because they mix code and
> data in a single allocation.

I am working on patches based on the discussion in [1]. I am planning to
send v1 for review in a week or so.

Thanks,
Song

[1] https://lore.kernel.org/linux-mm/20221107223921.3451913-1-song@xxxxxxxxxx/

[...]
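
As a rough illustration of the four-step flow Mike outlines above, here is a
minimal sketch; it is not code from the thread or from the patchset. The
refill helper name is hypothetical, error handling and the buddy-style
bookkeeping that would implement step (4) are omitted, and details such as
per-page set_direct_map_invalid_noflush() calls and PAGE_KERNEL_ROX are
architecture-specific.

/*
 * Illustrative sketch only: grab one PMD-sized (2M) chunk to refill a
 * hypothetical "unmapped pages" cache, drop it from the direct map, and
 * remap it read-only + executable in vmalloc/module space. A real
 * implementation would also track the chunk in a buddy-style cache and
 * hand out sub-chunks to callers (step 4).
 */
#include <linux/gfp.h>
#include <linux/mm.h>
#include <linux/set_memory.h>
#include <linux/slab.h>
#include <linux/vmalloc.h>
#include <asm/tlbflush.h>

static void *unmapped_cache_refill_rox(void)
{
	unsigned int order = get_order(PMD_SIZE);
	unsigned int nr = 1U << order;
	struct page *page, **pages;
	unsigned long daddr;
	void *rox = NULL;
	unsigned int i;

	/* 1) allocate a 2M chunk to fill the cache */
	page = alloc_pages(GFP_KERNEL, order);
	if (!page)
		return NULL;

	/* 2) remove it from the direct map (one base page at a time) */
	daddr = (unsigned long)page_address(page);
	for (i = 0; i < nr; i++)
		set_direct_map_invalid_noflush(page + i);
	flush_tlb_kernel_range(daddr, daddr + PMD_SIZE);

	/* 3) map the 2M chunk ROX in vmalloc/module address space */
	pages = kmalloc_array(nr, sizeof(*pages), GFP_KERNEL);
	if (pages) {
		for (i = 0; i < nr; i++)
			pages[i] = page + i;
		rox = vmap(pages, nr, VM_MAP, PAGE_KERNEL_ROX);
		kfree(pages);
	}

	/* 4) callers (bpf, modules, ...) would then get sub-chunks of 'rox' */
	return rox;
}

As Mike notes, step (3) and (4) are not usable for modules as long as a
module allocation mixes code and data; the work referenced in [1] is about
separating text from data so that a shared ROX mapping like the one sketched
above can serve module text as well.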