Re: [LSF/MM TOPIC] Eliminating tail pages

On Mon, Feb 11, 2019 at 11:09:08AM -0800, Matthew Wilcox wrote:
> 
> I can't follow simple instructions.
> 
> ----- Forwarded message from Matthew Wilcox <willy@xxxxxxxxxxxxx> -----
> 
> Date: Mon, 11 Feb 2019 11:07:28 -0800
> From: Matthew Wilcox <willy@xxxxxxxxxxxxx>
> To: lsf-pc@xxxxxxxxxxxxxxxxxxxxxxxxxx
> Subject: [LSF/MM TOPIC] Eliminating tail pages
> User-Agent: Mutt/1.9.2 (2017-12-15)
> 
> 
> Tail pages are a pain.  All over the kernel, we call compound_head()
> (or occasionally forget to ...).  So what would it take to eliminate them?
> 
> I'm doing my best to eliminate them from being stored in the page cache.
> That's a nice first step, but the very first thing that functions like
> find_get_entry(), find_get_entries(), et al do is convert any large
> page they find to a tail page.  So we'll probably need to introduce new
> functions which will return head pages and convert users over to them.
> I know Kirill has a lot more experience with this.
> 
> Another place where we return tail pages is get_user_pages().  Callers of
> get_user_pages() expect tail or small pages; they do things like calculate
> the offset of the byte within the page by AND with ~PAGE_MASK.  There'll be
> a lot of work to check all the users and convert them to something like
> 
> unsigned int page_offset(struct page *page, unsigned long addr);
> 
> Another thing to consider is that some architectures have a third-level
> page size of 16GB (looking at you, POWER).  So an unsigned int isn't
> going to cut it.  Do we want to support pages that large, or do we declare
> that there will never be any point in supporting pages larger than 4GB?
> 
> There are probably other pitfalls I'm forgetting or have never known.
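
To make the offset question concrete, here is a minimal userspace sketch,
not kernel code. It assumes 4KB base pages and naturally aligned large
pages; small_page_offset(), large_page_offset() and offset_fits_u32() are
hypothetical names made up for illustration. It returns a 64-bit value
rather than the unsigned int in the prototype above, for the reason the
last helper shows.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12                           /* assumption: 4KB base pages */
#define PAGE_SIZE  ((uint64_t)1 << PAGE_SHIFT)
#define PAGE_MASK  (~(PAGE_SIZE - 1))           /* kernel convention: clears the offset bits */

/* What callers do today: byte offset within a base (or tail) page. */
static inline uint64_t small_page_offset(uint64_t addr)
{
	return addr & ~PAGE_MASK;
}

/*
 * Hypothetical helper: byte offset within a naturally aligned large page
 * made of 2^order base pages.
 */
static inline uint64_t large_page_offset(uint64_t addr, unsigned int order)
{
	uint64_t size = PAGE_SIZE << order;

	return addr & (size - 1);
}

/* True if every byte offset within such a page fits in 32 bits. */
static inline bool offset_fits_u32(unsigned int order)
{
	return (PAGE_SIZE << order) - 1 <= UINT32_MAX;
}
```

With these definitions offset_fits_u32(20) holds for a 4GB page, while
offset_fits_u32(22) fails for a 16GB page, which is the POWER case
mentioned above.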

Another place where we see tail pages is a plain page table walk: compound
pages can be mapped with PTEs, for example a THP after split_huge_pmd() or
similar. Some drivers also allocate compound pages that can be mmapped into
userspace with PTEs. I have seen the sound subsystem do this.
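
A toy userspace model of why a PTE-level walk ends up holding tail pages.
It borrows the kernel's convention that a tail page stores the address of
its head page with bit 0 set, but struct toy_page and the toy_*() helpers
are invented for illustration; they are not the kernel's struct page or API.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for struct page; not the kernel's layout. */
struct toy_page {
	uintptr_t compound_head;	/* tail pages: head address with bit 0 set */
	unsigned int order;		/* meaningful on the head page only */
};

/* Mark pages[0 .. 2^order - 1] as one compound page headed by 'head'. */
static inline void toy_prep_compound(struct toy_page *head, unsigned int order)
{
	size_t i, nr = (size_t)1 << order;

	head->compound_head = 0;
	head->order = order;
	for (i = 1; i < nr; i++) {
		head[i].compound_head = (uintptr_t)head | 1;
		head[i].order = 0;
	}
}

/* Tail pages point back at their head; anything else is its own head. */
static inline struct toy_page *toy_compound_head(struct toy_page *page)
{
	if (page->compound_head & 1)
		return (struct toy_page *)(page->compound_head - 1);
	return page;
}

/*
 * What a PTE-level walk hands back for the idx-th base page of a compound
 * page: a tail page for every idx except 0.
 */
static inline struct toy_page *toy_pte_lookup(struct toy_page *head, size_t idx)
{
	return &head[idx];
}
```

Each PTE covers one base page, so the walker naturally gets &head[idx].
If such lookups returned only head pages, the caller would also need idx
(or the address) to know which part of the compound page was mapped.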

-- 
 Kirill A. Shutemov


