On Thu, Nov 14, 2024 at 12:18:15PM +0100, David Hildenbrand wrote:
> I'm currently staring again at PageOffline and wonder how we could prepare
> it for the memdesc future, and if we can remove refcount handling.

Thanks for bringing it up.

As a memdesc, I currently have PageOffline as being type 0 (Misc), subtype
5 (Offline).  That's bits 0-10 and then bit 11 is for "may be mapped to
userspace".  Bits 12-17 are the order.  With the top bits being used for
section/node/zone, that could be 25 + 12 + 3 = 40 bits, so we'd have 7 bits
remaining for use as flags.

> I'd like to stop using the refcount for PageOffline pages, and keep the
> refcount always at 0.

I think this makes sense.

> But the refcount, it is currently used to detect whether we are allowed to
> offline memory blocks that contain PageOffline pages, because only selected
> drivers support re-onlining. Well, and it is used when returning the pages
> to the buddy where free_page()/free_contig_range().... expect a refcount of
> 1.
>
> Further, virtio-mem currently uses the PageDirty() bit to remember if a
> PageOffline page was already exposed to the buddy before, or if we must use
> generic_online_page().
>
> For now we would need the following information, that could be stored in 2
> flags, leaving the refcount at 0:
>
> (1) Was it obtained from the buddy or never exposed it to the buddy
>
>     PageOffline() && PageOfflineNeverOnlined()
>
> (2) The driver does support actual memory offlining+reonlining, they can
>     be skipped when offlining.
>
>     PageOffline() && PageOfflineSkippable
>
>
> But when allocating/freeing pages we would still mess with the refcount,
> which is bad.
>
> We could have a dedicated interface for freeing them, where we abstract the
> generic_online_page() bits, and leave the refcount at 0:
>
> free_offline_page()
> free_offline_page_range()
>
> And
>
> alloc_offline_page()
> alloc_offline_page_range()
> alloc_offline_pages
>
> I'm not super happy about the "alloc/free" terminology, but nothing better
> came to mind.

If I resurrect
https://lore.kernel.org/linux-mm/20220809171854.3725722-1-willy@xxxxxxxxxxxxx/
would the frozen terminology work for you here?

> There is one complication to sort out: balloon_compaction.h supports moving
> PageOffline pages, and seems to use the page lock, page refcount, page lru,
> page private... which is all rather nasty. I wonder if these should get
> their own page type, like PageMovableOffline, and we'd mostly leave them
> alone for now. This would mean that virtio-balloon, vmware-balloon and ppc
> CMM would keep doing the old refcount-based thing but with a new page type.

It's fairly clear to me now that we have a sane story for moving file/anon
folios.  The current way we handle movable pages looks mostly insane
because it's hammered into that framework.  I think we need something
entirely different to handle movable non-folio pages, but I don't know
what that story is yet.
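
To make the bit budget and the two proposed flags a little more concrete,
here is a minimal userspace sketch in C.  It is an illustration only, not
kernel code.  Every MD_* name and md_*() helper is hypothetical.  The split
of bits 0-10 into a 4-bit type and a 7-bit subtype is an assumption, and so
is putting the two flags at bits 18 and 19.  Only the field ranges (0-10,
11, 12-17, and 40 top bits) come from the discussion above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names throughout; this only models the layout discussed above. */
#define MD_TYPE_MASK        UINT64_C(0xf)           /* assumed: type in bits 0-3 */
#define MD_SUBTYPE_SHIFT    4                       /* assumed: subtype in bits 4-10 */
#define MD_SUBTYPE_MASK     (UINT64_C(0x7f) << MD_SUBTYPE_SHIFT)
#define MD_MAPPABLE         (UINT64_C(1) << 11)     /* may be mapped to userspace */
#define MD_ORDER_SHIFT      12                      /* order in bits 12-17 */
#define MD_ORDER_MASK       (UINT64_C(0x3f) << MD_ORDER_SHIFT)
#define MD_TOP_SHIFT        (64 - (25 + 12 + 3))    /* section/node/zone on top */

#define MD_TYPE_MISC        UINT64_C(0)
#define MD_SUBTYPE_OFFLINE  UINT64_C(5)

/* Assumed positions: the first two bits above the order field. */
#define MD_NEVER_ONLINED_BIT 18
#define MD_SKIPPABLE_BIT     19
#define MD_OFFLINE_NEVER_ONLINED (UINT64_C(1) << MD_NEVER_ONLINED_BIT)
#define MD_OFFLINE_SKIPPABLE     (UINT64_C(1) << MD_SKIPPABLE_BIT)

/* Both flags must sit below the section/node/zone bits. */
_Static_assert(MD_SKIPPABLE_BIT < MD_TOP_SHIFT, "flag overlaps top bits");

/* Build a descriptor for an offline page of the given order, no flags set. */
static inline uint64_t md_make_offline(unsigned order, bool mappable)
{
    assert(order <= 0x3f);  /* order has to fit in 6 bits */
    uint64_t md = MD_TYPE_MISC | (MD_SUBTYPE_OFFLINE << MD_SUBTYPE_SHIFT);

    md |= (uint64_t)order << MD_ORDER_SHIFT;
    if (mappable)
        md |= MD_MAPPABLE;
    return md;
}

/* Counterpart of PageOffline(): type Misc, subtype Offline. */
static inline bool md_is_offline(uint64_t md)
{
    return (md & MD_TYPE_MASK) == MD_TYPE_MISC &&
           ((md & MD_SUBTYPE_MASK) >> MD_SUBTYPE_SHIFT) == MD_SUBTYPE_OFFLINE;
}

static inline unsigned md_order(uint64_t md)
{
    return (unsigned)((md & MD_ORDER_MASK) >> MD_ORDER_SHIFT);
}

/* (1) PageOffline() && PageOfflineNeverOnlined(): never exposed to the buddy. */
static inline bool md_offline_never_onlined(uint64_t md)
{
    return md_is_offline(md) && (md & MD_OFFLINE_NEVER_ONLINED) != 0;
}

/* (2) PageOffline() && PageOfflineSkippable: may be skipped when offlining. */
static inline bool md_offline_skippable(uint64_t md)
{
    return md_is_offline(md) && (md & MD_OFFLINE_SKIPPABLE) != 0;
}
```

None of these helpers looks at a refcount.  Both questions are answered
from the descriptor word alone, so in this model the count could stay at 0
for the whole time the page is offline.  Under the assumed widths the gap
between the order field and the top 40 bits is bits 18-23, and the two
flags take the first two of those.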