On 07/07/2023 09:01, Huang, Ying wrote:
> Ryan Roberts <ryan.roberts@xxxxxxx> writes:
>
>> Introduce FLEXIBLE_THP feature, which allows anonymous memory to be
>> allocated in large folios of a specified order. All pages of the large
>> folio are pte-mapped during the same page fault, significantly reducing
>> the number of page faults. The number of per-page operations (e.g. ref
>> counting, rmap management, lru list management) are also significantly
>> reduced since those ops now become per-folio.
>
> I like the idea to share as much code as possible between large
> (anonymous) folio and THP. Finally, THP becomes just a special kind of
> large folio.
>
> Although we can use smaller page order for FLEXIBLE_THP, it's hard to
> avoid internal fragmentation completely. So, I think that finally we
> will need to provide a mechanism for the users to opt out, e.g.,
> something like "always madvise never" via
> /sys/kernel/mm/transparent_hugepage/enabled. I'm not sure whether it's
> a good idea to reuse the existing interface of THP.

I wouldn't want to tie this to the existing interface, simply because that
implies that we would want to follow the "always" and "madvise" advice too.
That means that on a thp=madvise system (which is certainly the case for
Android and other client systems) we would have to disable large anon folios
for VMAs that haven't explicitly opted in. That breaks the intention that this
should be an invisible performance boost. I think it's important to set the
policy for use of THP separately from the policy for use of large anon folios.

I could be persuaded on the merits of a new runtime enable/disable interface
if there is consensus.

>
> Best Regards,
> Huang, Ying
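
For reference, the existing THP control mentioned above reports all of its
modes on one line and marks the selected one with square brackets, e.g.
"always [madvise] never". Below is a minimal sketch of how userspace might
check which mode a system is in. The helper names (active_mode, read_thp_mode)
are hypothetical and only the existing THP file is read; nothing here
represents an interface for large anon folios, which this thread has not
defined.

```python
import re
from pathlib import Path
from typing import Optional

# Existing THP policy file discussed in the thread.
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def active_mode(text: str) -> str:
    """Return the bracketed (currently selected) mode from the file contents."""
    match = re.search(r"\[(\w+)\]", text)
    if match is None:
        raise ValueError(f"no bracketed mode found in {text!r}")
    return match.group(1)


def read_thp_mode(path: Path = THP_ENABLED) -> Optional[str]:
    """Read the selected THP mode, or None if the file cannot be read
    (for example on a kernel built without THP support)."""
    try:
        return active_mode(path.read_text())
    except OSError:
        return None


if __name__ == "__main__":
    mode = read_thp_mode()
    if mode == "madvise":
        # Under this policy only VMAs that called madvise(MADV_HUGEPAGE)
        # are eligible for THP.
        print("THP is opt-in per VMA")
    else:
        print(f"THP mode: {mode}")
```

On a thp=madvise system this reports the opt-in policy, which is the case
where reusing the same knob would leave large anon folios disabled for most
VMAs.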