Re: [PATCH] rmap: fix pgoff calculation to handle hugepage correctly

On Wed, 2 Jul 2014 00:30:57 -0400 Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx> wrote:

> Subject: [PATCH v2] rmap: fix pgoff calculation to handle hugepage correctly
> 
> I triggered VM_BUG_ON() in vma_address() when I tried to migrate an anonymous
> hugepage with mbind() in the kernel v3.16-rc3. This is because the pgoff
> calculation in rmap_walk_anon() fails to take compound_order() into account
> and so ends up with an incorrect value.
> 
> This patch introduces page_to_pgoff(), which gets the page's offset in
> PAGE_CACHE_SIZE. Kirill pointed out that the page cache tree should natively
> handle hugepages, and in order to make hugetlbfs fit it, page->index of a
> hugetlbfs page should be in PAGE_CACHE_SIZE. This is beyond the scope of this
> patch, but page_to_pgoff() contains the point to be fixed in a single function.
> 
> ...
>
> --- a/include/linux/pagemap.h
> +++ b/include/linux/pagemap.h
> @@ -399,6 +399,18 @@ static inline struct page *read_mapping_page(struct address_space *mapping,
>  }
>  
>  /*
> + * Get the offset in PAGE_SIZE.
> + * (TODO: hugepage should have ->index in PAGE_SIZE)
> + */
> +static inline pgoff_t page_to_pgoff(struct page *page)
> +{
> +	if (unlikely(PageHeadHuge(page)))
> +		return page->index << compound_order(page);
> +	else
> +		return page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
> +}
> +
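
For readers following the arithmetic in the hunk above, here is a minimal
userspace sketch of the same shift logic. It is not kernel code: struct
model_page and model_page_to_pgoff() are hypothetical stand-ins, the
head_huge and order fields stand in for PageHeadHuge() and compound_order(),
and the constants assume 4 KB base pages with PAGE_CACHE_SHIFT equal to
PAGE_SHIFT, as in v3.16.

```c
#include <assert.h>
#include <limits.h>

/* Assumed values for illustration: 4 KB base pages, page cache unit == page. */
#define PAGE_SHIFT        12
#define PAGE_CACHE_SHIFT  PAGE_SHIFT

typedef unsigned long pgoff_t;

/* Hypothetical stand-in for the few struct page fields the helper reads. */
struct model_page {
	pgoff_t index;       /* hugetlbfs: in huge-page units; otherwise page cache units */
	int head_huge;       /* models PageHeadHuge(page) */
	unsigned int order;  /* models compound_order(page); 9 for a 2 MB hugepage */
};

/* Convert ->index to an offset counted in PAGE_SIZE units. */
static pgoff_t model_page_to_pgoff(const struct model_page *page)
{
	/* Guard against an undefined shift in this userspace model. */
	assert(page->order < sizeof(pgoff_t) * CHAR_BIT);

	if (page->head_huge)
		return page->index << page->order;
	return page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
}
```

Under these assumptions a regular page with index 3 maps to pgoff 3, while a
2 MB hugetlbfs head page with index 3 maps to pgoff 3 << 9 = 1536. Leaving out
that shift is the kind of mismatch the changelog describes.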

This is all a bit of a mess.

We have page_offset() which only works for regular pagecache pages and
not for huge pages.

We have page_file_offset() which works for regular pagecache as well
as swapcache but not for huge pages.

We have page_index() and page_file_index() which differ in undocumented
ways which I cannot be bothered working out.  The latter calls
__page_file_index() which is grossly misnamed.

Now we get a new page_to_pgoff() which is inconsistently named but has
a similarly crappy level of documentation and which works for hugepages
and regular pagecache pages but not for swapcache pages.


Sigh.

I'll merge this patch because it's a bugfix but could someone please
drive a truck through all this stuff and see if we can come up with
something tasteful and sane?
