Re: pageless memory & zsmalloc

On Thu, Oct 07, 2021 at 05:03:12PM +0200, Vlastimil Babka wrote:
> On 10/5/21 19:51, Matthew Wilcox wrote:
> > We're trying to tidy up the mess in struct page, and as part of removing
> > slab from struct page, zsmalloc came on my radar because it's using some
> > of slab's fields.  The eventual endgame is to get struct page down to a
> > single word which points to the "memory descriptor" (ie the current
> > zspage).
> > 
> > zsmalloc, like vmalloc, allocates order-0 pages.  Unlike vmalloc,
> > zsmalloc allows compaction.  Currently (from the file):
> > 
> >  * Usage of struct page fields:
> >  *      page->private: points to zspage
> >  *      page->freelist(index): links together all component pages of a zspage
> >  *              For the huge page, this is always 0, so we use this field
> >  *              to store handle.
> >  *      page->units: first object offset in a subpage of zspage
> >  *
> >  * Usage of struct page flags:
> >  *      PG_private: identifies the first component page
> >  *      PG_owner_priv_1: identifies the huge component page
> > 
> > This isn't quite everything.  For compaction, zsmalloc also uses
> > page->mapping (set in __SetPageMovable()), PG_lock (to sync with
> > compaction) and page->_refcount (compaction gets a refcount on the page).
> > 
> > Since zsmalloc is so well-contained, I propose we completely stop
> > using struct page in it, as we intend to do for the rest of the users
> > of struct page.  That is, the _only_ element of struct page we use is
> > compound_head and it points to struct zspage.
> > 
> > That means every single page allocated by zsmalloc is PageTail().  Also it
> 
> I would be worried there is code, e.g. some pfn scanner, that will see a
> PageTail, look up its compound_head() and order, and use it to skip over the
> rest of the tail pages. Which would fail spectacularly if compound_head()
> pointed somewhere other than a struct page in the same memmap array.

Yes, that's definitely a concern.  What does work is a pfn scanner that
does pfn |= (1 << page_order(page)) - 1; since page_order(zspage) is 0,
that is a no-op.  The existing pfn scanners will need to be audited
before we do this.
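
To make the arithmetic concrete, here is a rough userspace model of that
skip idiom.  It is only a sketch, not kernel code: last_pfn_of_block(),
scan_range() and the order_of[] array are made-up names standing in for
whatever a real scanner uses to find the order, and it assumes the order
has already been looked up correctly.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: given a pfn inside a naturally aligned block of
 * 2^order pages, return the last pfn of that block.  With order == 0 the
 * mask is 0, so the pfn comes back unchanged.
 */
static unsigned long last_pfn_of_block(unsigned long pfn, unsigned int order)
{
	assert(order < 8 * sizeof(unsigned long));
	pfn |= (1UL << order) - 1;
	return pfn;
}

/*
 * Hypothetical scanner: walk [start, end), and after looking at each pfn
 * jump to the end of the block it belongs to.  order_of[pfn] models the
 * order the scanner found for that pfn.  Returns how many pfns were
 * actually looked at.
 */
static unsigned long scan_range(const unsigned int *order_of,
				unsigned long start, unsigned long end)
{
	unsigned long visited = 0;

	for (unsigned long pfn = start; pfn < end; pfn++) {
		visited++;
		pfn = last_pfn_of_block(pfn, order_of[pfn]);
	}
	return visited;
}
```

With every order reported as 0, as would be the case for zspages, the
scanner looks at each pfn in turn and skips nothing.  This says nothing
about a scanner that dereferences compound_head() expecting a struct page;
that case is the one that needs the audit.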



