On 2025/3/10 23:42, Matthew Wilcox wrote:
> On Mon, Mar 10, 2025 at 10:13:32AM +0100, Toke Høiland-Jørgensen wrote:
>> Yunsheng Lin <yunshenglin0825@xxxxxxxxx> writes:
>>> Also, Using the more space in 'struct page' for the page_pool seems to
>>> make page_pool more coupled to the mm subsystem, which seems to not
>>> align with the folios work that is trying to decouple non-mm subsystem
>>> from the mm subsystem by avoid other subsystem using more of the 'struct
>>> page' as metadata from the long term point of view.
>>
>> This seems a bit theoretical; any future changes of struct page would
>> have to shuffle things around so we still have the ID available,
>> obviously :)
>
> See https://kernelnewbies.org/MatthewWilcox/Memdescs
> and more immediately
> https://kernelnewbies.org/MatthewWilcox/Memdescs/Path
>
> pagepool is going to be renamed "bump" because it's a bump allocator and
> "pagepool" is a nonsense name.  I haven't looked into it in a lot of
> detail yet, but in the not-too-distant future, struct page will look
> like this (from your point of view):
>
> struct page {
> 	unsigned long flags;
> 	unsigned long memdesc;

It seems there may be memory behind the above 'memdesc' with a different
size and layout for each subsystem?

I am not sure I understand the case where the same page might be handled
in two subsystems concurrently, or where a page is allocated in one
subsystem and then passed to another subsystem to be handled there. For
example: a page_pool-owned page is mmap'ed into user space through TCP
zero-copy, see tcp_zerocopy_vm_insert_batch(); it seems the same page is
handled in both the networking/page_pool and vm subsystems?

And page->mapping seems to have been moved into 'memdesc', as there is
no 'mapping' field in the 'struct page' you list here? Do we need a
field similar to 'mapping' in the 'memdesc' for the page_pool subsystem
to support TCP zero-copy?

> 	int _refcount;	// 0 for bump
> 	union {
> 		unsigned long private;
> 		atomic_t _mapcount;	// maybe used by bump?  not sure
> 	};
> };
>
> 'memdesc' will be a pointer to struct bump with the bottom four bits of
> that pointer indicating that it's a struct bump pointer (and not, say, a
> folio or a slab).

The above seems similar to what I was doing; the difference seems to be
that the memory behind the above pointer would be managed by page_pool
itself, instead of the mm subsystem allocating the 'memdesc' memory from
a slab cache?

> So if you allocate a multi-page bump, you'll get N of these pages,
> and they'll all point to the same struct bump where you'll maintain
> your actual refcount.  And you'll be able to grow struct bump to your
> heart's content.  I don't know exactly what struct bump looks like,
> but the core mm will have no requirements on you.