On Fri, 12 Oct 2018, Andrew Morton wrote:

> > The slab allocator has a heuristic that checks whether the internal
> > fragmentation is satisfactory and, if not, increases cachep->gfporder to
> > try to improve this.
> >
> > If the amount of waste is the same at higher cachep->gfporder values,
> > there is no significant benefit to allocating higher order memory. There
> > will be fewer calls to the page allocator, but each call will require
> > zone->lock and finding the page of best fit from the per-zone free areas.
> >
> > Instead, it is better to allocate order-0 memory if possible so that pages
> > can be returned from the per-cpu pagesets (pcp).
> >
> > There are two reasons to prefer this over allocating high order memory:
> >
> >  - allocating from the pcp lists does not require a per-zone lock, and
> >
> >  - this reduces stranding of MIGRATE_UNMOVABLE pageblocks on pcp lists
> >    that increases slab fragmentation across a zone.
>
> Confused. Higher-order slab pages never go through the pcp lists, do
> they?

Nope.

> I'd have thought that by tending to increase the amount of
> order-0 pages which are used by slab, such stranding would be
> *increased*?
>

These cpus have MIGRATE_UNMOVABLE pages on their pcp list. But because
the slab allocations are order-1 instead of order-0, we take zone->lock
and find the smallest possible page in the zone's free area that is of
sufficient size. In low-memory situations, there are no pages of
MIGRATE_UNMOVABLE migratetype at any order in the free area. This calls
__rmqueue_fallback(), which steals pageblocks, first MIGRATE_RECLAIMABLE
and then MIGRATE_MOVABLE, and uses them as MIGRATE_UNMOVABLE.

We rely on the pcp batch count to backfill MIGRATE_UNMOVABLE pages onto
the pcp list so we don't need to take zone->lock, and as a result of
these allocations being order-0 rather than order-1 we can then allocate
from these pages when such slab caches are expanded rather than stranding
them.
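To make the difference between the two paths concrete, here is a rough
model of where a MIGRATE_UNMOVABLE allocation ends up being served from.
This is only a sketch and not kernel code: struct zone_model,
enum alloc_source and alloc_unmovable() are hypothetical names, and it
assumes only order-0 requests are eligible for the pcp list and ignores
pcp refill, watermarks and reclaim.

```c
#include <stdbool.h>

/* Hypothetical, heavily simplified model; none of this is kernel API. */
enum alloc_source {
	FROM_PCP,            /* per-cpu list, zone->lock not taken */
	FROM_FREE_AREA,      /* zone->lock taken, MIGRATE_UNMOVABLE block found */
	FROM_FALLBACK_STEAL  /* zone->lock taken, pageblock of another type stolen */
};

struct zone_model {
	unsigned int pcp_unmovable_pages; /* order-0 pages on this cpu's pcp list */
	bool free_area_has_unmovable;     /* free area has a large enough unmovable block */
};

/* Decide which path serves an unmovable allocation of the given order. */
static enum alloc_source alloc_unmovable(const struct zone_model *z,
					 unsigned int order)
{
	/* Assumption: only order-0 requests can be served from the pcp list. */
	if (order == 0 && z->pcp_unmovable_pages > 0)
		return FROM_PCP;

	/* Everything else goes to the per-zone free areas under zone->lock. */
	if (z->free_area_has_unmovable)
		return FROM_FREE_AREA;

	/* Nothing suitable: fall back to stealing from another migratetype. */
	return FROM_FALLBACK_STEAL;
}
```

With pages sitting on the pcp list and no MIGRATE_UNMOVABLE memory left
in the free area, an order-0 request in this model returns FROM_PCP
while an order-1 request returns FROM_FALLBACK_STEAL, which is the
stranding case described above.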
We noticed this when the proportion of memory wasted for TCPv6 was the
same for both order-0 and order-1 allocations (in absolute terms, order-1
waste was two times the order-0 waste). We had hundreds of cpus with
pages on their MIGRATE_UNMOVABLE pcp list, but while allocating order-1
memory it would prefer to happily steal other pageblocks before calling
reclaim and draining pcp lists.

> > We are particularly interested in the second point to eliminate cases
> > where all other pages on a pageblock are movable (or free) and fallback to
> > pageblocks of other migratetypes from the per-zone free areas causes
> > high-order slab memory to be allocated from them rather than from free
> > MIGRATE_UNMOVABLE pages on the pcp.
> >
> >  mm/slab.c | 15 +++++++++++++++
>
> Do slub and slob also suffer from this effect?
>

SLOB should not. SLUB will typically increase the order to improve
performance of the cpu cache; there's a drawback to changing out the cpu
cache that SLAB does not have. In the case that this patch is
addressing, there is no greater memory utilization from the allocated
slab pages.
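For illustration, the "same waste at a higher order" case can be worked
through with a small sketch. This is not the code in the patch and not
the actual heuristic in mm/slab.c: slab_waste() and pick_order() are
hypothetical helpers, the 1920-byte object size is a made-up example, a
4096-byte page is assumed, and slab management overhead and alignment
are ignored.

```c
#include <stddef.h>

/* Assumed page size for this sketch only. */
#define MODEL_PAGE_SIZE 4096UL

/* Bytes left unused in a slab of the given order (obj_size must be nonzero). */
static size_t slab_waste(size_t obj_size, unsigned int order)
{
	return (MODEL_PAGE_SIZE << order) % obj_size;
}

/*
 * Pick the lowest order whose waste fraction is not beaten by a higher
 * order: only move up when the fraction strictly improves.
 */
static unsigned int pick_order(size_t obj_size, unsigned int max_order)
{
	unsigned int best = 0;
	unsigned int order;

	for (order = 1; order <= max_order; order++) {
		/*
		 * waste(order) / (PAGE << order) < waste(best) / (PAGE << best)
		 * cross-multiplied to avoid division.
		 */
		if ((slab_waste(obj_size, order) << best) <
		    (slab_waste(obj_size, best) << order))
			best = order;
	}
	return best;
}
```

With the hypothetical 1920-byte object, slab_waste() gives 256 bytes at
order-0 and 512 bytes at order-1: twice the absolute waste but the same
1/16 of the slab, so pick_order() stays at order-0. A 2500-byte object,
by contrast, wastes 1596 bytes at order-0 and 692 bytes at order-1, so
the higher order is chosen there.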