On Mon, Feb 11, 2019 at 11:09:08AM -0800, Matthew Wilcox wrote:
>
> I can't follow simple instructions.
>
> ----- Forwarded message from Matthew Wilcox <willy@xxxxxxxxxxxxx> -----
>
> Date: Mon, 11 Feb 2019 11:07:28 -0800
> From: Matthew Wilcox <willy@xxxxxxxxxxxxx>
> To: lsf-pc@xxxxxxxxxxxxxxxxxxxxxxxxxx
> Subject: [LSF/MM TOPIC] Eliminating tail pages
> User-Agent: Mutt/1.9.2 (2017-12-15)
>
>
> Tail pages are a pain. All over the kernel, we call compound_head()
> (or occasionally forget to ...). So what would it take to eliminate them?
>
> I'm doing my best to eliminate them from being stored in the page cache.
> That's a nice first step, but the very first thing that functions like
> find_get_entry(), find_get_entries(), et al do is convert any large
> page they find to a tail page. So we'll probably need to introduce new
> functions which will return head pages and convert users over to them.
> I know Kirill has a lot more experience with this.
>
> Another place where we return tail pages is get_user_pages(). Callers of
> get_user_pages() expect tail or small pages; they do things like calculate
> the offset of the byte within the page by AND with PAGE_MASK. There'll be
> a lot of work to check all the users and convert them to something like
>
> 	unsigned int page_offset(struct page *page, unsigned long addr);
>
> Another thing to consider is that some architectures have a third-level
> page size of 16GB (looking at you, POWER). So an unsigned int isn't
> going to cut it. Do we want to support pages that large, or do we declare
> that there will never be any point in supporting pages larger than 4GB?
>
> There are probably other pitfalls I'm forgetting or have never known.

Another place where we see tail pages is a plain page table walk: we do
map compound pages with PTEs, for example THP after split_huge_pmd() or
similar.

Some drivers also allocate compound pages that can be mmapped into
userspace with PTEs. I saw the sound subsystem do this.

-- 
 Kirill A. Shutemov
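
For illustration only, here is a rough userspace sketch of the offset question from the quoted proposal. It is not kernel code and not the proposed page_offset(): the helper name sketch_offset_in_page(), the 4KiB base page size, and passing the page order directly (instead of a struct page) are all assumptions made for the example. It only shows why the return type has to be wider than 32 bits once a single page can be 16GB.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: 4KiB base pages. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  ((uint64_t)1 << SKETCH_PAGE_SHIFT)

/*
 * Hypothetical helper: byte offset of 'addr' inside a naturally aligned
 * page made of 2^order base pages (order 0 is a small page).
 *
 * The result is 64 bits wide on purpose: with 4KiB base pages, a 16GB
 * page is order 22, and offsets inside it can exceed UINT32_MAX.
 */
static uint64_t sketch_offset_in_page(uint64_t addr, unsigned int order)
{
	/* Keep the shift below from overflowing 64 bits. */
	assert(order < 64 - SKETCH_PAGE_SHIFT);

	uint64_t size = SKETCH_PAGE_SIZE << order;

	/* Natural alignment lets us mask instead of subtracting a base. */
	return addr & (size - 1);
}
```

With order 0 this reduces to the usual small-page mask; with order 9 (2MB) or order 22 (16GB) the same address yields the offset from the start of the whole large page rather than from a tail page.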