On 6/1/22 14:14, David Hildenbrand wrote: > On 31.05.22 19:33, Matthew Wilcox wrote: >> On Tue, May 31, 2022 at 07:15:14PM +0200, David Hildenbrand wrote: >>> On 31.05.22 17:06, Matthew Wilcox (Oracle) wrote: >>>> Since slab does not use the page refcount, it can allocate and >>>> free frozen pages, saving one atomic operation per free. >>> >>> I assume that implies that pages that are actually allocated *from* the >>> buddy now have a refcount == 0. >> >> Yes. >> >>> IIRC, page isolation code (e.g., !page_ref_count check in >>> has_unmovable_pages()) assumes that any page with a refcount of 0 is >>> essentially either already free (buddy) or is on its way of getting >>> freed (!buddy). >> >> That would be a bad assumption. We already freeze pages for things like >> splitting, migration, and replacement with a THP. If the usage is just >> an optimisation, then that's OK (and maybe the optimisation needs to be >> tweaked to check PageSlab), but if the code depends on that being true, >> it was already broken. > > Yes, it's an optimization to not go ahead and mess with the migratetype > of pageblocks (especially, setting it MIGRATE_ISOLATE to later reset it > to MIGRATE_MOVABLE) and so forth when it's obvious that there is > unmovable data. > > However, freezing the refcount -- as used in existing code -- is only a > temporary thing. And it's frequently used on movable pages. Agreed. > If migration froze the refcount, the page is most certainly movable. > THPs should be movable, so it doesn't matter if we froze the refcount or > not. Pages that we collapse into a THP should be mostly LRU pages and, > therefore, movable. > > So the frozen refcount doesn't result in additional "false negatives" in > e.g., has_unmovable_pages(), at least for these users. > > >> >> For this particular case, I think has_unmovable_pages() is only called >> for ZONE_MOVEABLE and Slab never allocates from ZONE_MOVEABLE, so it's >> not an issue? > > > has_unmovable_pages() is primarily useful for ZONE_NORMAL. For > ZONE_MOVABLE, we really only check if there are any boot time > allocations (PageReserved()). > > For example, alloc_contig_range() implicitly calls has_unmovable_pages() > when isolating pageblocks and ends up getting called on ZONE_NORMAL for > example for runtime allocation of gigantic pages or via virtio-mem > (nowadays even in pageblock granularity). > > > Long story short, we should document accordingly what it actually means > when we allocate without increasing the refcount, and that people should > be careful with that. Regarding has_unmovable_pages() and frozen > allocations, we might just be able to keep the existing detection > unmodified by special-casing PageSlab() pages and detecting them as > unmovable immediately. I wonder if it makes sense to encourage long-term freezing anyway, if it complicates other checks that could for now rely on the relatively simple rules. For slab it might save some atomics, but allocating/freeing a new slab page is already a very slow path, so that will be negligible. So I wouldn't mind if only up to 4/6 of this series was merged. OTOH once the API is available somebody will eventually use it for a long-term frozen allocation (but they achieve do that now too, so maybe not), so agree about documenting the implications.