On Tue 24-09-19 08:20:22, Alexander Duyck wrote:
> On Tue, Sep 24, 2019 at 7:23 AM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> >
> > On Wed 18-09-19 10:52:25, Alexander Duyck wrote:
> > [...]
> > > In order to try and keep the time needed to find a non-reported page to
> > > a minimum we maintain a "reported_boundary" pointer. This pointer is used
> > > by the get_unreported_pages iterator to determine at what point it should
> > > resume searching for non-reported pages. In order to guarantee pages do
> > > not get past the scan I have modified add_to_free_list_tail so that it
> > > will not insert pages behind the reported_boundary.
> > >
> > > If another process needs to perform a massive manipulation of the free
> > > list, such as compaction, it can either reset a given individual boundary,
> > > which will push the boundary back to the list_head, or it can clear the
> > > bit indicating the zone is actively processing, which will result in the
> > > reporting process resetting all of the boundaries for a given zone.
> >
> > Is this any different from the previous version? The last review
> > feedback (both from me and Mel) was that we are not happy to have
> > externally imposed constraints on how the page allocator is supposed to
> > maintain its free lists.
>
> The main change for v10 versus v9 is that I allow the page reporting
> boundary to be overridden. Specifically there are two approaches that
> can be taken.
>
> The first is to simply reset the iterator for whatever list is
> updated. What this does is reset the iterator back to list_head, and
> then you can do whatever you want with that specific list.

OK, this is slightly better than pushing the allocator into a corner.
The allocator really has to be in control of its own data structures. I
would still be happier if the allocator didn't have to care about
somebody snooping on its internal state to do its own thing. So please
make sure to describe why, and how much, this really matters.
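To make the mechanism under discussion concrete, here is a minimal user-space
model of the boundary idea, NOT the actual kernel code: free pages live on a
doubly linked list, reported pages accumulate at the tail, and tail insertion
is clamped so a new page never lands behind the reported_boundary. The struct
layout and helper names are invented for illustration; only the names
reported_boundary and add_to_free_list_tail mirror the patch series.

```c
#include <assert.h>

struct page {
    int pfn;                        /* stand-in for a real page identifier */
    struct page *prev, *next;
};

struct free_list {
    struct page head;               /* list_head-style sentinel */
    struct page *reported_boundary; /* first reported entry, or &head */
};

static void free_list_init(struct free_list *fl)
{
    fl->head.prev = fl->head.next = &fl->head;
    fl->reported_boundary = &fl->head;  /* boundary reset == list_head */
}

static void list_insert_before(struct page *pos, struct page *p)
{
    p->prev = pos->prev;
    p->next = pos;
    pos->prev->next = p;
    pos->prev = p;
}

/*
 * Tail insert that never places a page behind the boundary: inserting
 * just in front of the boundary keeps the already-reported pages
 * together at the tail, so the scan never has to revisit them.
 */
static void add_to_free_list_tail(struct free_list *fl, struct page *p)
{
    list_insert_before(fl->reported_boundary, p);
}

/* What a heavy user such as compaction would do before reshuffling. */
static void reset_boundary(struct free_list *fl)
{
    fl->reported_boundary = &fl->head;
}
```

After a reset, a tail insert degenerates to an ordinary insert before the
sentinel, i.e. the true tail, which is why the reporting thread then has to
rescan that list from the head.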
> The other option is to simply clear the ZONE_PAGE_REPORTING_ACTIVE
> bit. That will essentially notify the page reporting code that any/all
> hints that were recorded have been discarded and that it needs to
> start over.
>
> All I am trying to do with this approach is reduce the work. Without
> doing this the code has to walk the entire free page list for the
> higher orders every iteration, and that will not be cheap.

How expensive will this be?

> Admittedly it is a bit more invasive than the cut/splice logic used in
> compaction, which takes the pages it has already processed and moves
> them to the other end of the list. However, I have reduced things so
> that we are only really limiting where add_to_free_list_tail can place
> pages, and we have to check/push back the boundaries if a reported
> page is removed from a free_list.
>
> > If this is really the only way to go forward then I would like to hear
> > very convincing arguments about other approaches not being feasible.
> > There are none in this cover letter unfortunately. This will be a
> > really hard sell without them.
>
> So I had considered several different approaches.

Thanks, this is certainly useful, and it would have been even more so
if you had given some rough numbers to quantify how much overhead each
of those solutions would incur.
--
Michal Hocko
SUSE Labs
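The second override path described above can be sketched the same way. This
is a user-space model, not kernel code: clearing the zone's "reporting
active" flag makes the next reporting pass discard all recorded boundaries
and restart every scan from list_head. The zone_model struct, the MAX_ORDER
value, and the helper names are invented for illustration; only
ZONE_PAGE_REPORTING_ACTIVE and reported_boundary echo the discussion.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_ORDER 11  /* illustrative; matches the common kernel default */

struct zone_model {
    bool reporting_active;            /* models ZONE_PAGE_REPORTING_ACTIVE */
    int reported_boundary[MAX_ORDER]; /* scan position per order; 0 == list_head */
};

/* Called by e.g. compaction before a massive free-list reshuffle. */
static void clear_reporting_active(struct zone_model *z)
{
    z->reporting_active = false;
}

/*
 * One reporting pass: if the flag was cleared since the last pass, every
 * boundary is stale, so reset them all and resume scanning from the head
 * of each free list.
 */
static void page_reporting_pass(struct zone_model *z)
{
    if (!z->reporting_active) {
        for (int order = 0; order < MAX_ORDER; order++)
            z->reported_boundary[order] = 0;  /* back to list_head */
        z->reporting_active = true;           /* re-arm for this pass */
    }
    /* ... scan each free list starting at reported_boundary[order] ... */
}
```

This illustrates the cost trade-off being debated: clearing the flag is a
cheap, coarse invalidation for the caller, but it forces the reporting side
to redo a full walk of the higher-order free lists.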