On 10/5/21 19:51, Matthew Wilcox wrote:
> We're trying to tidy up the mess in struct page, and as part of removing
> slab from struct page, zsmalloc came on my radar because it's using some
> of slab's fields. The eventual endgame is to get struct page down to a
> single word which points to the "memory descriptor" (ie the current
> zspage).
>
> zsmalloc, like vmalloc, allocates order-0 pages. Unlike vmalloc,
> zsmalloc allows compaction. Currently (from the file):
>
>  * Usage of struct page fields:
>  *      page->private: points to zspage
>  *      page->freelist(index): links together all component pages of a zspage
>  *              For the huge page, this is always 0, so we use this field
>  *              to store handle.
>  *      page->units: first object offset in a subpage of zspage
>  *
>  * Usage of struct page flags:
>  *      PG_private: identifies the first component page
>  *      PG_owner_priv_1: identifies the huge component page
>
> This isn't quite everything. For compaction, zsmalloc also uses
> page->mapping (set in __SetPageMovable()), PG_lock (to sync with
> compaction) and page->_refcount (compaction gets a refcount on the page).
>
> Since zsmalloc is so well-contained, I propose we completely stop
> using struct page in it, as we intend to do for the rest of the users
> of struct page. That is, the _only_ element of struct page we use is
> compound_head and it points to struct zspage.
>
> That means every single page allocated by zsmalloc is PageTail(). Also it

I would be worried that there is code, e.g. some pfn scanner, that will
see a PageTail, look up its compound_head() and order, and use them to
skip over the rest of the tail pages. That would fail spectacularly if
compound_head() pointed anywhere other than to a struct page in the same
memmap array.

> means that when isolate_movable_page() calls trylock_page(), it redirects
> to the zspage. That means struct zspage must now have page flags as its
> first element. Also, zspage->_refcount, and zspage->mapping must match
> their locations in struct page. That's something that we'll get cleaned
> up eventually, but for now, we're relying on offsetof() assertions.
>
> The good news is that trylock_zspage() no longer needs to walk the
> list of pages, calling trylock_page() on each of them.
>
> Anyway, is there a good test suite for zsmalloc()? Particularly something
> that would exercise its interactions with compaction / migration?
> I don't have any code written yet.
>
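
To make the two layout concerns in this thread concrete, here is a rough
userspace sketch in C. It is only an illustration under stated assumptions:
struct mock_page and struct mock_zspage are hypothetical, heavily simplified
stand-ins (not the kernel's real struct page or struct zspage), and
head_in_memmap() is a made-up helper, not an existing kernel function. The
first part shows what the offsetof() assertions might look like; the second
shows the kind of check a pfn scanner would implicitly be relying on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct page (NOT the real layout). */
struct mock_page {
    unsigned long flags;          /* page flags, e.g. the lock bit */
    unsigned long compound_head;  /* tail pages point at their "head" here */
    void *mapping;
    unsigned long index;
    int _refcount;
};

/* Hypothetical zspage whose leading fields mirror mock_page. */
struct mock_zspage {
    unsigned long flags;          /* must be the first element */
    unsigned long head_slot;      /* occupies the compound_head slot */
    void *mapping;
    unsigned long index_slot;
    int _refcount;
    unsigned int inuse;           /* zsmalloc-specific state follows */
};

/* Compile-time checks that the shared fields sit at the same offsets. */
#define ASSERT_SAME_OFFSET(field)                                   \
    static_assert(offsetof(struct mock_page, field) ==              \
                  offsetof(struct mock_zspage, field),              \
                  "field offset mismatch between page and zspage")

static_assert(offsetof(struct mock_zspage, flags) == 0,
              "flags must be the first element of the zspage");
ASSERT_SAME_OFFSET(flags);
ASSERT_SAME_OFFSET(mapping);
ASSERT_SAME_OFFSET(_refcount);

/*
 * Made-up helper: returns true if 'head' points at an entry inside the
 * memmap array [memmap, memmap + nr). A scanner that does arithmetic on
 * the head pointer assumes this holds for every tail page it sees.
 */
bool head_in_memmap(const struct mock_page *memmap, size_t nr,
                    const void *head)
{
    uintptr_t lo = (uintptr_t)memmap;
    uintptr_t hi = (uintptr_t)(memmap + nr);
    uintptr_t p = (uintptr_t)head;

    return p >= lo && p < hi;
}
```

In this sketch, a head pointer that refers to a separately allocated
mock_zspage fails the head_in_memmap() check, which is the situation any
scanner doing pointer arithmetic relative to the memmap would have to be
audited for.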