Re: [RFC PATCH net-next] page_pool: Track DMA-mapped pages and unmap them when destroying the pool

On 2025/3/10 23:42, Matthew Wilcox wrote:
> On Mon, Mar 10, 2025 at 10:13:32AM +0100, Toke Høiland-Jørgensen wrote:
>> Yunsheng Lin <yunshenglin0825@xxxxxxxxx> writes:
>>> Also, Using the more space in 'struct page' for the page_pool seems to
>>> make page_pool more coupled to the mm subsystem, which seems to not
>>> align with the folios work that is trying to decouple non-mm subsystem
>>> from the mm subsystem by avoid other subsystem using more of the 'struct
>>> page' as metadata from the long term point of view.
>>
>> This seems a bit theoretical; any future changes of struct page would
>> have to shuffle things around so we still have the ID available,
>> obviously :)
> 
> See https://kernelnewbies.org/MatthewWilcox/Memdescs
> and more immediately
> https://kernelnewbies.org/MatthewWilcox/Memdescs/Path
> 
> pagepool is going to be renamed "bump" because it's a bump allocator and
> "pagepool" is a nonsense name.  I haven't looked into it in a lot of
> detail yet, but in the not-too-distant future, struct page will look
> like this (from your point of view):
> 
> struct page {
> 	unsigned long flags;
> 	unsigned long memdesc;

It seems there will be memory behind the above 'memdesc' pointer, with a
different size and layout for each subsystem?

I am not sure I understand the case where the same page might be handled by
two subsystems concurrently, or where a page is allocated in one subsystem
and then passed to another subsystem to be handled there. For example, a
page_pool-owned page is mmap'ed into user space through TCP zero-copy, see
tcp_zerocopy_vm_insert_batch(); it seems the same page is handled by both
the networking/page_pool and the mm subsystem?

And page->mapping seems to have been moved into 'memdesc', as there is no
'mapping' field in the 'struct page' you list here? Do we need a similar
field like 'mapping' in the 'memdesc' for the page_pool subsystem to support
TCP zero-copy?

> 	int _refcount;	// 0 for bump
> 	union {
> 		unsigned long private;
> 		atomic_t _mapcount; // maybe used by bump?  not sure
> 	};
> };
> 
> 'memdesc' will be a pointer to struct bump with the bottom four bits of
> that pointer indicating that it's a struct bump pointer (and not, say, a
> folio or a slab).

The above seems similar to what I was doing; the difference seems to be
that the memory behind the above pointer is managed by page_pool itself,
instead of the mm subsystem allocating the 'memdesc' memory from a slab
cache?
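
If I am reading it correctly, decoding the tagged pointer would look
roughly like below (the tag value and helper name are just my guess for
illustration, not taken from any actual patch):

struct bump;

/* Rough sketch of my understanding of the memdesc encoding; the
 * MEMDESC_TYPE_* values and the helper name below are made up here
 * for illustration only.
 */
#define MEMDESC_TYPE_MASK	0xfUL
#define MEMDESC_TYPE_BUMP	0x3UL	/* hypothetical tag value */

static inline struct bump *memdesc_to_bump(unsigned long memdesc)
{
	if ((memdesc & MEMDESC_TYPE_MASK) != MEMDESC_TYPE_BUMP)
		return NULL;

	return (struct bump *)(memdesc & ~MEMDESC_TYPE_MASK);
}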

> 
> So if you allocate a multi-page bump, you'll get N of these pages,
> and they'll all point to the same struct bump where you'll maintain
> your actual refcount.  And you'll be able to grow struct bump to your
> heart's content.  I don't know exactly what struct bump looks like,
> but the core mm will have no requirements on you.
> 
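If that is the case, I guess 'struct bump' might end up looking something
like below (purely a guess from the description above, the field names are
made up by me):

struct bump {
	atomic_t refcount;	/* the "actual refcount" shared by all N pages */
	unsigned int order;	/* size of the multi-page allocation */
	unsigned long dma_addr;	/* page_pool-private state, e.g. DMA address */
	/* ... grown as page_pool needs, with no core mm requirements */
};

Then page_pool could keep per-chunk state like the DMA address there
instead of in 'struct page'?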



