On 7 Jan 2025, at 11:55, David Hildenbrand wrote:

> On 07.01.25 17:48, Zi Yan wrote:
>> On 7 Jan 2025, at 11:11, David Hildenbrand wrote:
>>
>>> Hi,
>>>
>>> one item on my todo list is making PageOffline pages to stop using "struct page" members except page->type and 1/2 flags, to prepare them for the memdesc future, to avoid unnecessary atomics, and to resolve some (so-far) theoretical issues with temporary speculative references.
>>>
>>> For example, the page->_refcount will always be 0 (frozen) for PageOffline pages, and they will get allocated/freed similar to how we allocate/free frozen pages for slab already. Once we move the refcount into "struct folio", they will not have a refcount at all anymore.
>>>
>>> One complication is balloon compaction: we allow for migrating PageOffline pages allocated in some memory ballooning implementations such as virtio-balloon.
>>>
>>> For that, we use the "non-lru page migration" framework and in that process we make use of ... way to many members of "struct page"/"struct folio" and rely on the refcount not being 0. For example, we certainly don't want to allocate memdescs for PageOffline pages just so some of them can be migrated.
>>
>> Then first thing is to make all get_new_folio functions be aware of PageOffline
>> pages and be able to allocate a PageOffline page. IIUC, the current process
>> is: 1) allocate a page from buddy allocator, 2) offline the new page during
>> mops->migrate_page() and online the old page. The inflation and deflation
>> in step 2 looks redundant if migrate_pages() can get PageOffline pages to
>> begin with and put_page() can handle PageOffline page too.
>
> That might be one hacky way of handling offline pages, yes :)
>
> (the isolation step is tricky: for example, with page->lru gone we cannot even put these things into a list! Also, there is page isolation ...)
>
> I recall that the isolation step is required because we could have multiple parties trying to migrate the same page at the same time. So that must be handled as well.

OK, since page->lru is gone, migrate_pages() might not be suitable for these pages, unless we want to rewrite migrate_pages(), which might be desirable. :) Then, we could record PFNs instead, like what migrate_vma*() does, but I have not checked migrate_vma*() in detail to tell the feasibility yet.

In terms of isolation, we can use the PageIsolated flag and make sure it is in the remaining 1/2 flags. This flag can be used for other non-folio things too.

>
>>
>>>
>>> While we converted non-lru page migration to work on folios (i.e., folio_movable_ops()) these things are not actually "folios" in the future, they can have different memdescs.
>>>
>>> So, how can we migrate non-lru things that are not folios while not relying on "struct folio" members, with minimal/no metadata overhead?
>>
>> Like I said above, if migrate_pages() is aware of PageOffline pages by allocating
>> and putting them like normal folios, that could work.
>>
>> Or you can do what hugetlb migration does, adding a separate migrate_offlinepages()
>> function to handle PageOffline pages. This probably can save you a lot of
>> LRU page checks like mapping and locks, but it adds a special function. So
>> tradeoffs.
>>
>>>
>>> I have some ideas, but no complete solution yet; input about the requirements of other non-lru page migration use cases besides PageOffline will be interesting.
>>>
>>> ... and maybe, we have other non-folio things we'd want to migrate, and want to be prepared to handle them as well? (hint: leaf page tables?)
>>
>> If we have dedicated allocator for non-folio things and make migrate_pages()
>> be aware of them, it should be doable.
>
> Note that I thought about similar things as you describe above, but part of the exercise will not be focusing on PageOffline pages, but having something more generic that can handle pages with actual page content, and that have to be properly isolated :)

Sure. IMHO, we will need dedicated allocation and free functions for these non-folio things, a PageIsolated flag for isolation, and a dedicated code path in migrate_pages() or migrate_vma*().

Best Regards,
Yan, Zi
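
A rough userspace sketch of the "record PFNs plus an isolation flag" idea above, in case it helps the discussion. This is not kernel code: struct fake_page, FAKE_PG_ISOLATED and all function names are hypothetical stand-ins, and the only assumption carried over from the thread is that a page has a small flags word but no lru list head and no usable refcount. Isolation is an atomic test-and-set, so only one of several concurrent migrators wins a given page, and the winner records the PFN in a plain array instead of linking the page into a list.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-page state: one flags word, no lru, no refcount. */
struct fake_page {
	atomic_uint flags;
};

/* Hypothetical isolation bit, standing in for one of the remaining flags. */
#define FAKE_PG_ISOLATED 0x1u

/* Atomically set the isolation bit; true only for the caller that set it. */
static bool fake_page_try_isolate(struct fake_page *page)
{
	unsigned int old = atomic_fetch_or(&page->flags, FAKE_PG_ISOLATED);

	return !(old & FAKE_PG_ISOLATED);
}

/* Clear the isolation bit, e.g. after migration finished or was aborted. */
static void fake_page_unisolate(struct fake_page *page)
{
	atomic_fetch_and(&page->flags, ~FAKE_PG_ISOLATED);
}

/*
 * Walk PFNs in [start, end) and record those we managed to isolate in
 * pfns[], up to max entries. Returns the number of PFNs recorded.
 * memmap[] is a toy array indexed directly by PFN.
 */
static size_t isolate_pfn_range(struct fake_page *memmap,
				unsigned long start, unsigned long end,
				unsigned long *pfns, size_t max)
{
	size_t nr = 0;

	assert(memmap && pfns);

	for (unsigned long pfn = start; pfn < end && nr < max; pfn++) {
		/* Skip pages another party has already isolated. */
		if (fake_page_try_isolate(&memmap[pfn]))
			pfns[nr++] = pfn;
	}
	return nr;
}
```

A second isolate_pfn_range() call over the same range records nothing until fake_page_unisolate() is called on a page, which is the "multiple parties" case David mentioned. Whether a real implementation can get away with a single flag bit, and how it interacts with page isolation, is exactly the open question in this thread.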