Re: [PATCH net-next 04/12] mm: Make the page_frag_cache allocator use multipage folios

Mika Penttilä <mpenttil@xxxxxxxxxx> · Fri, 26 May 2023 17:06:55 +0300

Hi,

On 26.5.2023 15.47, David Howells wrote:
Yunsheng Lin <linyunsheng@xxxxxxxxxx> wrote:

Change the page_frag_cache allocator to use multipage folios rather than
groups of pages.  This reduces page_frag_free to just a folio_put() or
put_page().

put_page() is not used in this patch; perhaps remove it to avoid
confusion?

Will do if I need to respin the patches.

Also, is there any significant difference between __free_pages()
and folio_put()? IOW, what does the 'reduces' part mean here?

I meant that the folio code handles page compounding for us and we don't need
to work out how big the page is for ourselves.

If you look at __free_pages(), you can see a PageHead() call.  folio_put()
doesn't need that.

I followed some discussion about folios before, but have not really
understood the real difference between 'multipage folios' and
'groups of pages' yet. Is folio mostly used to avoid the confusion
about whether a page is the 'head page of a compound page', a 'base page' or
a 'tail page of a compound page'? Or is there any obvious benefit of
folios that I missed?

There is a benefit: a folio pointer always points to the head page and so we
never need to do "is this compound? where's the head?" logic to find it.  When
going from a page pointer, we still have to find the head.



But page_frag_free() uses folio_put(virt_to_folio(addr)) and 
virt_to_folio() depends on the compound infrastructure to get the head 
page and folio.
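
To make the point above concrete, here is a minimal userspace sketch of the
lookup that a virt_to_folio()-style helper has to perform. It is a toy model,
not kernel code: every toy_* name and the fixed-size arrays are assumptions
made up for illustration, and the tail-page encoding (bit 0 set, remaining
bits pointing at the head) only mimics the general idea of the compound page
infrastructure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12
#define TOY_NR_PAGES   8

/* Toy stand-in for struct page (hypothetical layout). Bit 0 of
 * compound_head marks a tail page; the other bits hold the address
 * of the head page. Zero means "not a tail page". */
struct toy_page {
	uintptr_t compound_head;
	int refcount;
};

static struct toy_page toy_mem_map[TOY_NR_PAGES];
static unsigned char toy_memory[TOY_NR_PAGES << TOY_PAGE_SHIFT];

/* Link pages [first, first + (1 << order)) into one compound unit. */
static void toy_make_compound(size_t first, unsigned int order)
{
	toy_mem_map[first].compound_head = 0;
	for (size_t i = 1; i < ((size_t)1 << order); i++)
		toy_mem_map[first + i].compound_head =
			(uintptr_t)&toy_mem_map[first] | 1;
}

/* Address -> page descriptor: pure arithmetic on the address. */
static struct toy_page *toy_virt_to_page(const void *addr)
{
	size_t off = (size_t)((const unsigned char *)addr - toy_memory);

	return &toy_mem_map[off >> TOY_PAGE_SHIFT];
}

/* Page descriptor -> head: this step has to read the descriptor. */
static struct toy_page *toy_compound_head(struct toy_page *page)
{
	uintptr_t head = page->compound_head;

	return (head & 1) ? (struct toy_page *)(head - 1) : page;
}

/* Address -> "folio" (the head descriptor): both steps combined. */
static struct toy_page *toy_virt_to_folio(const void *addr)
{
	return toy_compound_head(toy_virt_to_page(addr));
}
```

In this model, freeing by address still goes through toy_compound_head(), so
the "where's the head?" step is not avoided on that path; it is only avoided
when the caller already holds the head pointer.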


Ultimately, the aim is to reduce struct page to a typed pointer to massively
reduce the amount of space consumed by mem_map[].  A page struct will then
point at a folio or a slab struct or one of a number of different types.  But
to get to that point, we have to stop a whole lot of things from using page
structs, but rather use some other type, such as folio.

Eventually, there won't be a need for head pages and tail pages per se - just
memory objects of different sizes.

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 306a3d1a0fa6..d7c52a5979cc 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -420,18 +420,13 @@ static inline void *folio_get_private(struct folio *folio)
  }
  
  struct page_frag_cache {
-	void * va;
-#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE)
-	__u16 offset;
-	__u16 size;
-#else
-	__u32 offset;
-#endif
+	struct folio	*folio;
+	unsigned int	offset;
  	/* we maintain a pagecount bias, so that we dont dirty cache line
  	 * containing page->_refcount every time we allocate a fragment.
  	 */
-	unsigned int		pagecnt_bias;
-	bool pfmemalloc;
+	unsigned int	pagecnt_bias;
+	bool		pfmemalloc;
  };
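
As a rough illustration of the "pagecount bias" idea described in the comment
in the struct above, the following userspace sketch takes a large batch of
references once and then hands them out by decrementing a counter private to
the cache, so the shared refcount is not written on every allocation. All
toy_* names and TOY_BIAS_MAX are hypothetical, and the sketch ignores
atomics, offsets and pfmemalloc entirely.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_BIAS_MAX 1024	/* hypothetical batch size */

struct toy_frag_page {
	int refcount;		/* shared; written by whoever frees a fragment */
};

struct toy_frag_cache {
	struct toy_frag_page *page;
	unsigned int pagecnt_bias;	/* private to the allocating side */
};

/* Attach a fresh page: take the whole batch of references up front. */
static void toy_cache_attach(struct toy_frag_cache *c, struct toy_frag_page *p)
{
	p->refcount = 1 + TOY_BIAS_MAX;
	c->page = p;
	c->pagecnt_bias = TOY_BIAS_MAX + 1;
}

/* Hand out a fragment: only the private bias changes. */
static void toy_frag_alloc(struct toy_frag_cache *c)
{
	c->pagecnt_bias--;
}

/* Free a fragment: drop one shared reference; true if the page is unused. */
static bool toy_frag_free(struct toy_frag_page *p)
{
	return --p->refcount == 0;
}

/* Cache is done with the page: return the references never handed out. */
static bool toy_cache_drain(struct toy_frag_cache *c)
{
	c->page->refcount -= (int)c->pagecnt_bias;
	c->pagecnt_bias = 0;
	return c->page->refcount == 0;
}
```

The invariant in this model is that refcount minus pagecnt_bias equals the
number of fragments still outstanding.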

It seems that, before this patch, the 'va' and 'size' fields are used to avoid
touching 'struct page', and so avoid possible cache bouncing when more frags
can be allocated from the page while other frags are being freed at the same time?

Hmmm... fair point, though va is calculated from the page pointer on most
arches without the need to dereference struct page (only arc, m68k and sparc
define WANT_PAGE_VIRTUAL).
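
The "calculated from the page pointer" remark can be sketched in userspace as
follows. This is an illustrative model only: the sketch_* names and the two
static arrays standing in for mem_map[] and the linear mapping are
assumptions, and a real architecture's conversion involves its own base
addresses and memory model.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SHIFT 12
#define SKETCH_NR_PAGES   4

/* Hypothetical page descriptor; its contents are never read below. */
struct sketch_page {
	int refcount;
};

static struct sketch_page sketch_mem_map[SKETCH_NR_PAGES];
static unsigned char sketch_linear_map[SKETCH_NR_PAGES << SKETCH_PAGE_SHIFT];

/* Page pointer -> virtual address using pointer arithmetic only:
 * the descriptor's index in the map is the page frame number. */
static void *sketch_page_address(const struct sketch_page *page)
{
	size_t pfn = (size_t)(page - sketch_mem_map);

	return sketch_linear_map + (pfn << SKETCH_PAGE_SHIFT);
}
```

Because sketch_page_address() never dereferences its argument, computing the
address does not touch the cache line that holds the descriptor's refcount.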

David


--Mika