> On 06.11.2019 at 13:16, Michal Hocko <mhocko@xxxxxxxxxx> wrote:
>
> I didn't have time to read through newer versions of this patch series
> but I remember there were concerns about this functionality being pulled
> into the page allocator previously both by me and Mel [1][2]. Have those been
> addressed? I do not see an ack from Mel or any other MM people. Is there
> really a consensus that we want something like that living in the
> allocator?

I don't think there is. The discussion is still ongoing (although quiet,
Nitesh is working on a new version AFAIK). I think we should not rush this.

>
> There has also been a different approach discussed and from [3]
> (referenced by the cover letter) I can only see
>
> : Then Nitesh's solution had changed to the bitmap approach[7]. However it
> : has been pointed out that this solution doesn't deal with sparse memory,
> : hotplug, and various other issues.
>
> which looks more like something to be done than a fundamental
> roadblock.

I second that. As I stated a couple of times already, it is totally fine to
not support all environments initially (hotunplug, sparse zones).

The major difference I am interested in is the performance comparison. Then
we have to decide if the gain in performance is worth core buddy
modifications.

>
> [1] http://lkml.kernel.org/r/20190912163525.GV2739@xxxxxxxxxxxxxxxxxxx
> [2] http://lkml.kernel.org/r/20190912091925.GM4023@xxxxxxxxxxxxxx
> [3] http://lkml.kernel.org/r/29f43d5796feed0dec8e8bb98b187d9dac03b900.camel@xxxxxxxxxxxxxxx
>
>> On Tue 05-11-19 16:05:47, Andrew Morton wrote:
>> From: Alexander Duyck <alexander.h.duyck@xxxxxxxxxxxxxxx>
>> Subject: mm: introduce Reported pages
>>
>> In order to pave the way for free page reporting in virtualized
>> environments we will need a way to get pages out of the free lists and
>> identify those pages after they have been returned.
>> To accomplish this, this patch adds the concept of a Reported Buddy,
>> which is essentially meant to just be the Uptodate flag used in
>> conjunction with the Buddy page type.
>>
>> It adds a set of pointers we shall call "reported_boundary" which
>> represent the upper boundary between the unreported and reported pages.
>> The general idea is that in order for a page to cross from one side of the
>> boundary to the other it will need to verify that it went through the
>> reporting process. Ultimately a free list has been fully processed when
>> the boundary has been moved from the tail all the way up to occupying the
>> first entry in the list. Without this we would have to manually walk the
>> entire page list until we find a page that hasn't been reported. In
>> my testing this adds as much as 18% additional overhead which would make
>> this unattractive as a solution.
>>
>> One limitation to this approach is that it is essentially a linear search
>> and in the case of the free lists we can have pages added to either the
>> head or the tail of the list. In order to place limits on this we only
>> allow pages to be added before the reported_boundary instead of adding to
>> the tail itself. An added advantage to this approach is that we should be
>> reducing the overall memory footprint of the guest as it will be more
>> likely to recycle warm pages versus trying to allocate the reported pages
>> that were likely evicted from the guest memory.
>>
>> Since we will only be reporting one zone at a time we keep the boundary
>> limited to being defined for just the zone we are currently reporting
>> pages from. Doing this we can keep the number of additional pointers
>> needed quite small. To flag that the boundaries are in place we use a
>> single bit in the zone to indicate that reporting and the boundaries are
>> active.
>>
>> We store the index of the boundary pointer used to track the reported page
>> in the page->index value.
>> Doing this we can avoid unnecessary computation
>> to determine the index value again. There should be no issues with this
>> as the value is unused when the page is in the buddy allocator, and is
>> reset as soon as the page is removed from the free list.
>>
>> Link: http://lkml.kernel.org/r/20191105220219.15144.69031.stgit@localhost.localdomain
>> Signed-off-by: Alexander Duyck <alexander.h.duyck@xxxxxxxxxxxxxxx>
>> Cc: Andrea Arcangeli <aarcange@xxxxxxxxxx>
>> Cc: Dan Williams <dan.j.williams@xxxxxxxxx>
>> Cc: Dave Hansen <dave.hansen@xxxxxxxxx>
>> Cc: David Hildenbrand <david@xxxxxxxxxx>
>> Cc: <konrad.wilk@xxxxxxxxxx>
>> Cc: Luiz Capitulino <lcapitulino@xxxxxxxxxx>
>> Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
>> Cc: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
>> Cc: Michael S. Tsirkin <mst@xxxxxxxxxx>
>> Cc: Michal Hocko <mhocko@xxxxxxxxxx>
>> Cc: Oscar Salvador <osalvador@xxxxxxx>
>> Cc: Pankaj Gupta <pagupta@xxxxxxxxxx>
>> Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
>> Cc: Rik van Riel <riel@xxxxxxxxxxx>
>> Cc: Vlastimil Babka <vbabka@xxxxxxx>
>> Cc: Wei Wang <wei.w.wang@xxxxxxxxx>
>> Cc: Yang Zhang <yang.zhang.wz@xxxxxxxxx>
>> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> --
> Michal Hocko
> SUSE Labs
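
For anyone who has not followed the series, the reported_boundary idea from
the quoted changelog can be pictured with a toy model. This is only a rough
Python sketch under my own assumptions, not the kernel implementation: the
class FreeList and its methods (add_head, add_tail, report_one, alloc,
fully_reported) are hypothetical names, a plain Python list stands in for one
free list, and pages stay on the list while they are reported (the series
itself takes them out of the free list and returns them afterwards).

```python
class FreeList:
    """Toy model of one free list with a reported boundary (hypothetical)."""

    def __init__(self):
        # pages[0] is the head of the list, pages[-1] is the tail.
        self.pages = []
        # pages[:boundary] are unreported, pages[boundary:] are reported.
        self.boundary = 0

    def add_head(self, page):
        """Add a freed (unreported) page at the head of the list."""
        self.pages.insert(0, page)
        self.boundary += 1

    def add_tail(self, page):
        """A would-be tail insertion lands just before the boundary instead,
        so reported pages stay grouped at the tail."""
        self.pages.insert(self.boundary, page)
        self.boundary += 1

    def report_one(self):
        """Report the unreported page next to the boundary and move the
        boundary one step towards the head. No list walk is needed."""
        if self.boundary == 0:
            return None
        self.boundary -= 1
        return self.pages[self.boundary]

    def fully_reported(self):
        """The list is fully processed once the boundary reaches the head."""
        return self.boundary == 0

    def alloc(self):
        """Allocate from the head, so unreported (warm) pages go first."""
        if not self.pages:
            return None
        page = self.pages.pop(0)
        if self.boundary > 0:
            # The head was an unreported page; the boundary index shifts.
            self.boundary -= 1
        return page
```

In this model, finding the next unreported page is a single index lookup at
the boundary rather than a walk over the whole list, and allocations from the
head only reach reported pages once every unreported page has been used. It
says nothing about the real cost inside the buddy allocator, which is exactly
what the performance comparison would have to show.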