On Fri, Dec 06, 2024 at 04:28:17PM +0000, Matthew Wilcox wrote:
> 4. Introduce an indirection structure between the page and vm_struct which
> contains the refcount.

I'm starting to really warm up to this one.  There are a number of places
that we allocate "some pages", but want to treat them as a single object,
not just vmalloc.  Let's call this a 'scamem', short for "scattered memory".

But this is going to be challenging.  Assuming we want to support GUP,
we need to be able to go from page->scamem [1].  In the skinniest version
of shrinking struct page, we have just 8 bytes per page, and we need to
both store a pointer to the scamem and store information like node, zone,
section for _each_ page.

We don't need to worry about this for folios/slabs/... because all pages
in the folio have the same node/zone/section, so we can store this
information once in the folio and then copy it back to the page on free.
We can't do that for scamem without a (potentially large) allocation.
And even if we do something like:

	struct scamem {
		unsigned int nr;
		refcount_t refcount;
		unsigned long flags[];
	};

to be able to implement page_to_nid() on a page, we'd have to figure
out which page within the scamem this was.  So either we have to give
up on our dream of an 8 byte memdesc, or figure out some other way to
do this.

So what if we store the scamem pointer in vma->vm_file->private_data,
or vma->vm_private_data.  That would let us keep the node/section/zone
in the struct page.  GUP has the VMA, so this can work.

Yet another possibility would be if we can look up the page's pfn in
some data structure and reconstruct the zone/section/node information
at freeing time.  I don't fully understand the meaning of this
information, so I have no idea if this is possible.

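To make the page_to_nid() problem with the flags[] layout concrete, here
is a rough userspace sketch, not kernel code.  Everything in it is an
assumption for illustration: the one-word struct page, the
scamem_entry pairing of page pointer and flags, the node id living in
the low bits of flags, and all the flagmem_* names.  The point is only
that with nothing but a scamem pointer in the page, finding the per-page
flags means searching the scamem.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct scamem;

/* Hypothetical "skinny" page: a single 8-byte memdesc word. */
struct page {
	struct scamem *memdesc;
};

/* Hypothetical: pairs each page with the flags it can no longer hold. */
struct scamem_entry {
	struct page *page;
	unsigned long flags;	/* node id in the low bits, for this sketch */
};

struct scamem {
	unsigned int nr;
	unsigned int refcount;	/* plain stand-in for refcount_t */
	struct scamem_entry entries[];
};

#define FLAGMEM_NID_MASK 0xffUL

/* Build a scamem over nr scattered pages and point each page back at it. */
static struct scamem *flagmem_alloc(struct page **pages, const int *nids,
				    unsigned int nr)
{
	struct scamem *sm;
	unsigned int i;

	assert(pages && nids && nr > 0);
	sm = malloc(sizeof(*sm) + nr * sizeof(sm->entries[0]));
	if (!sm)
		return NULL;
	sm->nr = nr;
	sm->refcount = 1;
	for (i = 0; i < nr; i++) {
		sm->entries[i].page = pages[i];
		sm->entries[i].flags = (unsigned long)nids[i] & FLAGMEM_NID_MASK;
		pages[i]->memdesc = sm;
	}
	return sm;
}

/*
 * The page only tells us which scamem it belongs to, so we have to walk
 * the entries to find which slot is ours.  This costs O(nr) per lookup.
 * Returns -1 if the page is not found in its scamem.
 */
static int flagmem_page_to_nid(const struct page *page)
{
	const struct scamem *sm = page->memdesc;
	unsigned int i;

	for (i = 0; i < sm->nr; i++)
		if (sm->entries[i].page == page)
			return (int)(sm->entries[i].flags & FLAGMEM_NID_MASK);
	return -1;
}
```

The linear walk is what makes this layout unattractive for something
called as often as page_to_nid(); it also doubles the per-page cost of
the scamem allocation compared to a bare flags[] array.
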
My current thought is:

	struct scamem {
		unsigned int nr;
		refcount_t refcount;
		struct page *pages[];
	};

and changing vm_struct:

-	struct page **pages;
+	struct scamem *scamem;

(I don't think we want to embed it in vm_struct, since we want vm_struct
to have one refcount on scamem, and for the scamem to be freed once its
refcount reaches zero rather than freed as part of vm_struct)
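
The lifetime rule in the parenthetical can be sketched in userspace as
follows.  This is only an illustration under stated assumptions: C11
atomics stand in for refcount_t, struct vm_struct_sketch stands in for
vm_struct, and every sketch_* name is made up here.  It does not free
the pages themselves; it only shows the scamem outliving the vm_struct
while some other user (a GUP-style reference, say) still holds it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct page { int unused; };	/* opaque stand-in */

struct scamem {
	unsigned int nr;
	atomic_uint refcount;	/* stand-in for refcount_t */
	struct page *pages[];
};

/* Hypothetical stand-in for vm_struct: holds a pointer, not the object. */
struct vm_struct_sketch {
	struct scamem *scamem;
};

/* Allocate a scamem with room for nr page pointers; caller owns one ref. */
static struct scamem *sketch_scamem_alloc(unsigned int nr)
{
	struct scamem *sm = calloc(1, sizeof(*sm) + nr * sizeof(sm->pages[0]));

	if (!sm)
		return NULL;
	sm->nr = nr;
	atomic_init(&sm->refcount, 1);
	return sm;
}

/* Take an extra reference; the caller must already hold one. */
static void sketch_scamem_get(struct scamem *sm)
{
	unsigned int old = atomic_fetch_add(&sm->refcount, 1);

	assert(old > 0);
	(void)old;
}

/* Drop a reference.  Returns 1 if this was the last one and sm was freed. */
static int sketch_scamem_put(struct scamem *sm)
{
	unsigned int old = atomic_fetch_sub(&sm->refcount, 1);

	assert(old > 0);
	if (old != 1)
		return 0;
	free(sm);
	return 1;
}

/*
 * Tear down the vm_struct stand-in.  It drops its single reference on
 * the scamem but does not free it directly.  Returns 1 if the scamem
 * was freed as a result, 0 if someone else still holds a reference.
 */
static int sketch_vfree(struct vm_struct_sketch *vm)
{
	struct scamem *sm = vm->scamem;

	free(vm);
	return sketch_scamem_put(sm);
}
```

If the scamem were embedded in vm_struct instead, sketch_vfree() would
have no way to leave it alive for the remaining holder.
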