On 12/19/23 23:35, Mina Almasry wrote:
On Tue, Dec 19, 2023 at 1:04 PM David Wei <dw@xxxxxxxxxxx> wrote:
From: Pavel Begunkov <asml.silence@xxxxxxxxx>
NOT FOR UPSTREAM
The final version will depend on how the ppiov infra looks like
Page pool is tracking how many pages were allocated and returned, which
serves for refcounting the pool, and so every page/frag allocated should
eventually come back to the page pool via appropriate ways, e.g. by
calling page_pool_put_page().
When it comes to normal page pools (i.e. without memory providers
attached), it's fine to return a page when it's still refcounted by
somewhat in the stack, in which case we'll "detach" the page from the
pool and rely on page refcount for it to return back to the kernel.
Memory providers are different, at least ppiov based ones, they need
all their buffers to eventually return back, so apart from custom pp
->release handlers, we'll catch when someone puts down a ppiov and call
its memory provider to handle it, i.e. __page_pool_iov_free().
The first problem is that __page_pool_iov_free() hard coded devmem
handling, and other providers need a flexible way to specify their own
callbacks.
The second problem is that it doesn't go through the generic page pool
paths and so can't do the mentioned pp accounting right. And we can't
even safely rely on page_pool_put_page() to be called somewhere before
to do the pp refcounting, because then the page pool might get destroyed
and ppiov->pp would point to garbage.
The solution is to make the pp ->release callback to be responsible for
properly recycling its buffers, e.g. calling what was
__page_pool_iov_free() before in case of devmem.
page_pool_iov_put_many() will be returning buffers to the page pool.
Hmm this patch is working on top of slightly outdated code. I think> the correct solution here is to transition to using pp_ref_count for
refcounting the ppiovs/niovs. Once we do that, we no longer need
special refcounting for ppiovs, they're refcounted identically to
pages, makes the pp more maintainable, gives us some unified handling
of page pool refcounting, it becomes trivial to support fragmented
pages which require a pp_ref_count, and all the code in this patch can
go away.
I'm unsure if this patch is just because you haven't rebased to my
latest RFC (which is completely fine by me), or if you actually think
using pp_ref_count for refcounting is wrong and want us to go back to
the older model which required some custom handling for ppiov and
disabled frag support. I'm guessing it's the former, but please
correct if I'm wrong.
Right, it's based on older patches, it'd be a fool's work keep
rebasing it while the code is still changing unless there is a
good reason for that.
I haven't taken a look at devmem v5, I definitely going to. IMHO,
this approach is versatile and clear, but if there is a better one,
I'm all for it.
--
Pavel Begunkov