On 11/6/19 7:16 AM, Michal Hocko wrote:
> I didn't have time to read through newer versions of this patch series
> but I remember there were concerns about this functionality being pulled
> into the page allocator previously both by me and Mel [1][2]. Have those been
> addressed? I do not see an ack from Mel or any other MM people. Is there
> really a consensus that we want something like that living in the
> allocator?
>
> There has also been a different approach discussed and from [3]
> (referenced by the cover letter) I can only see
>
> : Then Nitesh's solution had changed to the bitmap approach[7]. However it
> : has been pointed out that this solution doesn't deal with sparse memory,
> : hotplug, and various other issues.
>
> which looks more like something to be done than a fundamental
> roadblock.

I agree. I think the major issue with my series is the performance drop
that we are observing, specifically with (MAX_ORDER - 2) as the minimum
reporting order. That is something I am trying to investigate and fix.
With (MAX_ORDER - 1), I was able to get performance similar to what
Alexander has reported.

>
> [1] http://lkml.kernel.org/r/20190912163525.GV2739@xxxxxxxxxxxxxxxxxxx
> [2] http://lkml.kernel.org/r/20190912091925.GM4023@xxxxxxxxxxxxxx
> [3] http://lkml.kernel.org/r/29f43d5796feed0dec8e8bb98b187d9dac03b900.camel@xxxxxxxxxxxxxxx
>
> On Tue 05-11-19 16:05:47, Andrew Morton wrote:
>> From: Alexander Duyck <alexander.h.duyck@xxxxxxxxxxxxxxx>
>> Subject: mm: introduce Reported pages
>>
>> In order to pave the way for free page reporting in virtualized
>> environments we will need a way to get pages out of the free lists and
>> identify those pages after they have been returned. To accomplish this,
>> this patch adds the concept of a Reported Buddy, which is essentially
>> meant to just be the Uptodate flag used in conjunction with the Buddy page
>> type.
>>
>> It adds a set of pointers we shall call "reported_boundary" which
>> represent the upper boundary between the unreported and reported pages.
>> The general idea is that in order for a page to cross from one side of the
>> boundary to the other it will need to verify that it went through the
>> reporting process. Ultimately a free list has been fully processed when
>> the boundary has been moved from the tail all the way up to occupying the
>> first entry in the list. Without this we would have to manually walk the
>> entire page list until we find a page that hasn't been reported. In
>> my testing this adds as much as 18% additional overhead which would make
>> this unattractive as a solution.
>>
>> One limitation to this approach is that it is essentially a linear search
>> and in the case of the free lists we can have pages added to either the
>> head or the tail of the list. In order to place limits on this we only
>> allow pages to be added before the reported_boundary instead of adding to
>> the tail itself. An added advantage to this approach is that we should be
>> reducing the overall memory footprint of the guest as it will be more
>> likely to recycle warm pages versus trying to allocate the reported pages
>> that were likely evicted from the guest memory.
>>
>> Since we will only be reporting one zone at a time we keep the boundary
>> limited to being defined for just the zone we are currently reporting
>> pages from. Doing this we can keep the number of additional pointers
>> needed quite small. To flag that the boundaries are in place we use a
>> single bit in the zone to indicate that reporting and the boundaries are
>> active.
>>
>> We store the index of the boundary pointer used to track the reported page
>> in the page->index value. Doing this we can avoid unnecessary computation
>> to determine the index value again. There should be no issues with this
>> as the value is unused when the page is in the buddy allocator, and is
>> reset as soon as the page is removed from the free list.
>>
>> Link: http://lkml.kernel.org/r/20191105220219.15144.69031.stgit@localhost.localdomain
>> Signed-off-by: Alexander Duyck <alexander.h.duyck@xxxxxxxxxxxxxxx>
>> Cc: Andrea Arcangeli <aarcange@xxxxxxxxxx>
>> Cc: Dan Williams <dan.j.williams@xxxxxxxxx>
>> Cc: Dave Hansen <dave.hansen@xxxxxxxxx>
>> Cc: David Hildenbrand <david@xxxxxxxxxx>
>> Cc: <konrad.wilk@xxxxxxxxxx>
>> Cc: Luiz Capitulino <lcapitulino@xxxxxxxxxx>
>> Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
>> Cc: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
>> Cc: Michael S. Tsirkin <mst@xxxxxxxxxx>
>> Cc: Michal Hocko <mhocko@xxxxxxxxxx>
>> Cc: Oscar Salvador <osalvador@xxxxxxx>
>> Cc: Pankaj Gupta <pagupta@xxxxxxxxxx>
>> Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
>> Cc: Rik van Riel <riel@xxxxxxxxxxx>
>> Cc: Vlastimil Babka <vbabka@xxxxxxx>
>> Cc: Wei Wang <wei.w.wang@xxxxxxxxx>
>> Cc: Yang Zhang <yang.zhang.wz@xxxxxxxxx>
>> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>

-- 
Thanks
Nitesh
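
For anyone skimming the quoted commit message, the "reported_boundary" idea can be sketched roughly as below. This is only a user-space toy under my own assumptions, not code from the patch: `struct toy_page`, `struct toy_free_list` and all `toy_*` functions are hypothetical names, a single list stands in for the per-order, per-migratetype free lists, and the zone flag and page->index bookkeeping are left out. It only shows that tail insertions land before the boundary, that reporting moves the boundary one entry toward the head, and that the list counts as fully processed once the boundary occupies the first entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a free page; NOT the kernel's struct page. */
struct toy_page {
    struct toy_page *prev, *next;
    bool reported;
};

/* Circular list with a sentinel head. Entries from head.next up to
 * boundary->prev are unreported; entries from boundary to the tail are
 * reported. boundary == &head means nothing has been reported yet. */
struct toy_free_list {
    struct toy_page head;
    struct toy_page *boundary;
};

static void toy_insert_before(struct toy_page *pos, struct toy_page *page)
{
    page->prev = pos->prev;
    page->next = pos;
    pos->prev->next = page;
    pos->prev = page;
}

void toy_list_init(struct toy_free_list *l)
{
    l->head.prev = l->head.next = &l->head;
    l->head.reported = false;
    l->boundary = &l->head;
}

/* Head insertion is unchanged: the new page is unreported and sits
 * before the boundary by construction. */
void toy_add_head(struct toy_free_list *l, struct toy_page *page)
{
    page->reported = false;
    toy_insert_before(l->head.next, page);
}

/* "Tail" insertion goes before the boundary instead of at the real tail,
 * so reported pages stay grouped at the end of the list. */
void toy_add_tail(struct toy_free_list *l, struct toy_page *page)
{
    page->reported = false;
    toy_insert_before(l->boundary, page);
}

/* Report the unreported page closest to the boundary and move the
 * boundary up by one entry. Returns NULL when nothing is left. */
struct toy_page *toy_report_next(struct toy_free_list *l)
{
    struct toy_page *page = l->boundary->prev;

    if (page == &l->head)
        return NULL;
    page->reported = true;
    l->boundary = page;
    return page;
}

/* Fully processed once the boundary occupies the first entry. */
bool toy_fully_reported(const struct toy_free_list *l)
{
    return l->boundary->prev == &l->head;
}

/* Allocation path: unlink the page, fix up the boundary if the page was
 * the boundary itself, and clear the reported state. */
void toy_remove(struct toy_free_list *l, struct toy_page *page)
{
    assert(page != &l->head);
    if (l->boundary == page)
        l->boundary = page->next;
    page->prev->next = page->next;
    page->next->prev = page->prev;
    page->prev = page->next = NULL;
    page->reported = false;
}
```

The point of the sketch is that finding the next page to report is a single pointer dereference from the boundary, rather than a walk over the whole list looking for an unreported entry.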