Re: [RFC PATCH 03/26] mm: make pageblock_order 2M per default

Johannes Weiner <hannes@xxxxxxxxxxx> · Tue, 18 Apr 2023 23:44:04 -0400

On Tue, Apr 18, 2023 at 10:55:53PM -0400, Johannes Weiner wrote:
> On Wed, Apr 19, 2023 at 03:01:05AM +0300, Kirill A. Shutemov wrote:
> > On Tue, Apr 18, 2023 at 03:12:50PM -0400, Johannes Weiner wrote:
> > > pageblock_order can be of various sizes, depending on configuration,
> > > but the default is MAX_ORDER-1.
> > 
> > Note that MAX_ORDER got redefined in -mm tree recently.
> > 
> > > Given 4k pages, that comes out to
> > > 4M. This is a large chunk for the allocator/reclaim/compaction to try
> > > to keep grouped per migratetype. It's also unnecessary as the majority
> > > of higher order allocations - THP and slab - are smaller than that.
> > 
> > This seems way too x86-specific.
> > Other arches have larger THP sizes. I believe 16M is common.
> >
> > Maybe define it as min(MAX_ORDER, PMD_ORDER)?
> 
> Hm, let me play around with larger pageblocks.
> 
> The thing that gives me pause is that this seems quite aggressive as a
> default block size for the allocator and reclaim/compaction - if you
> consider the implications for internal fragmentation and the amount of
> ongoing defragmentation work it would require.
> 
> IOW, it's not just a function of physical page size supported by the
> CPU. It's also a function of overall memory capacity. Independent of
> architecture, 2MB seems like a more reasonable step up than 16M.

[ Quick addition: on those other archs, these patches would still help
  with other, non-THP sources of compound allocations, such as slub,
  variable-order cache folios, and really any orders up to 2M. So it's
  not like we *have* to raise it to PMD_ORDER for them to benefit. ]
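
[ For reference, the sizes quoted in this thread follow from
  block size = PAGE_SIZE << order. Below is a rough userspace sketch of
  that arithmetic. block_bytes() is a hypothetical helper for
  illustration, not a kernel function. The page_shift/order pairs in the
  comments are assumptions: 4k pages with order 10 for the old
  MAX_ORDER-1 default, order 9 for a 2M block, and 64k base pages with
  order 8 as one possible way to arrive at a 16M THP. ]

```c
/*
 * Illustrative only: bytes covered by a block of 2^order pages,
 * where each page is 2^page_shift bytes.
 *
 * Assumed example inputs (not taken from any kernel config):
 *   block_bytes(12, 10)  4k pages, order 10  (4M)
 *   block_bytes(12, 9)   4k pages, order 9   (2M)
 *   block_bytes(16, 8)   64k pages, order 8  (16M)
 */
static unsigned long block_bytes(unsigned int page_shift, unsigned int order)
{
	return 1UL << (page_shift + order);
}
```

With 4k pages, dropping the block from order 10 to order 9 halves it
from 4M to 2M, whereas following a 16M PMD size would grow the block by
a factor of four over the old default.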