On Tue, Mar 14, 2023 at 10:50:36PM +0800, Yin, Fengwei wrote:
> On 3/14/2023 5:48 PM, Matthew Wilcox wrote:
> > On Tue, Mar 14, 2023 at 10:16:09AM +0100, David Hildenbrand wrote:
> >> Just curious what the last sentence implies. Large folios are supposed to be
> >> a transparent optimization. So why should we pageout all surrounding
> >> subpages simply because a single subpage was requested to be paged out? That
> >> might harm performance of some workloads ... more than the actual split.
> >>
> >> So it's not immediately obvious to me why "avoid splitting" is the correct
> >> answer to the problem at hand.
> >
> > At least for anonymous pages, using large folios is an attempt to treat
> > all pages in a particular range the same way. If the user says to only
> > page out some of them, that's a big clue that these pages are different
> > from the other pages, and so we should split a folio where the madvise
> > call does not cover every page in the folio.
>
> Yes. This is my understanding also. :).
>
> > I'm less convinced that argument holds for page cache pages.
>
> Can you explain more about this? My understanding is that if we need
> to reclaim the large folio for page cache, it's better to reclaim the
> whole folio.

Pagecache is a shared resource. To determine how best to handle all
the memory used to cache a file (ie the correct folio size), ideally
we would take into account how all the users of a particular file are
using it. If we just listen to the most recent advice from one user,
we risk making a decision that's bad for potentially many other users.

Of course, we don't have any framework for deciding the correct folio
size used for pagecache yet. We have the initial guess based on
readahead and we have various paths that will split back to individual
pages. But it's something I know we'll want to do at some point.
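
For illustration only, the anonymous-folio rule quoted above ("split a
folio where the madvise call does not cover every page in the folio")
could be modelled roughly as below. This is a userspace sketch, not
kernel code: struct folio_range and the three helper names are
hypothetical and were made up for this example, pages are identified by
plain indices, and the advised range is treated as half-open
[start, end). It says nothing about pagecache folios, where the
argument above is that one caller's advice may not be enough to decide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a large folio: a run of consecutive page indices. */
struct folio_range {
	size_t first_page;	/* index of the first page in the folio */
	size_t nr_pages;	/* number of pages in the folio, must be > 0 */
};

/* True if the advised range [start, end) overlaps at least one page. */
static bool advice_touches_folio(const struct folio_range *f,
				 size_t start, size_t end)
{
	assert(f->nr_pages > 0);
	return start < f->first_page + f->nr_pages && end > f->first_page;
}

/* True if the advised range [start, end) covers every page of the folio. */
static bool advice_covers_folio(const struct folio_range *f,
				size_t start, size_t end)
{
	assert(f->nr_pages > 0);
	return start <= f->first_page && end >= f->first_page + f->nr_pages;
}

/*
 * Assumed policy for anonymous folios only: split when the advice hits
 * some, but not all, of the pages in the folio.
 */
static bool should_split_anon_folio(const struct folio_range *f,
				    size_t start, size_t end)
{
	return advice_touches_folio(f, start, end) &&
	       !advice_covers_folio(f, start, end);
}
```

With a 16-page folio starting at page 16, advice over pages [20, 24)
would ask for a split under this model, while advice over [16, 32) or
[0, 64) would page out the folio whole.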