On 11/12/19 1:34 PM, Alexander Duyck wrote: > On Tue, 2019-11-12 at 14:04 +0100, David Hildenbrand wrote: >>>>> fact is it is still invasive, just to different parts of the mm subsystem. >>>> I'd love to see how it uses the page isolation framework, and only has a >>>> single hook to queue pages. I don't like the way pages are pulled out of >>>> the buddy in Niteshs approach currently. What you have is cleaner. >>> I don't see how you could use the page isolation framework to pull out >>> free pages. Is there a thread somewhere on the topic that I missed? >> It's basically only isolating pages while reporting them, and not >> pulling them out of the buddy (IOW, you move the pages to the isolate >> queues where nobody is allowed to touch them, and setting the >> migratetype properly). This e.g., makes other user of page isolation >> (e.g., memory offlining, alloc_contig_range()) play nicely with these >> isolated pages. "somebody else just isolated them, please try again." > How so? If I understand correctly there isn't anything that prevents you > from isolating an already isolated page, is there? Last I knew isolated > pages are still considered "movable" since they are still buddy pages > aren't they? > > Also this seems like it would have other implications since isolating a > page kicks of the memory notifier so as a result a balloon driver would > then free the pages back out so that they could be isolated with the > assumption the region is going offline. > >> start_isolate_page_range()/undo_isolate_page_range()/test_pages_isolated() >> along with a lockless check if the page is free. > Okay, that part I think I get. However doesn't all that logic more or less > ignore the watermarks? It seems like you could cause an OOM if you don't > have the necessary checks in place for that. > >> I think it should be something like this (ignoring different >> migratetypes and such for now) >> >> 1. Test lockless if page is free: Not free? Done. > So this should help to reduce the liklihood of races in the steps below. > However it might also be useful if the code had some other check to see if > it was done other than just making a pass through the bitmap. > > One thing I had brought up with Nitesh was the idea of maybe doing some > sort of RCU bitmap type approach. Basically while we hold the zone lock we > could swap out the old bitmap for a new one. We could probably even keep a > counter at the start of the structure so that we could track how many bits > are actually set there. Then it becomes less likely of having a race where > you free a page and set the bit and the hinting thread tests and clears > the bit but doesn't see the freed page since it is not synchronized. > Otherwise your notification setup and reporting thread may need a few smp > barriers added where necessary. > > One generic request. I would appreciate it if you guys can keep me in the cc while discussing. Otherwise, with the amount of discussion its quite easy to go out of sync. -- Thanks Nitesh