On 10/30/24 5:02 PM, Jason Gunthorpe wrote:
> On Wed, Oct 30, 2024 at 11:34:49AM -0700, John Hubbard wrote:
>> From a very high level design perspective, it's not yet clear to me
>> that there is either a "preferred" or "not recommended" aspect to
>> pinning in batches vs. all at once here, as long as one stays below
>> the type (int, long, unsigned...) limits of the API. Batching seems
>> like what you do if the internal implementation is crippled and
>> unable to meet its API requirements. So the fact that many callers
>> do batching is sort of the tail wagging the dog.
>
> No, everything needs to do batching, because nothing should be
> storing a linear struct page array that is so enormous. That is
> going to create vmemmap pressure that is not desirable.

Are we talking about the same allocation size here? It's not 2GB. It
is enough folio pointers to cover 2GB of memory: 2GB / 4KB per folio
is 512Ki pointers, at 8 bytes each, so 4MB.
That's not really much pressure.

> For instance rdma pins in batches and copies the pins into a
> scatterlist, and never has an allocation over PAGE_SIZE.
>
> iommufd transfers them into a radix tree.
>
> It is not so much that there is a limit, but that good kernel code
> just *shouldn't* be allocating gigantic contiguous memory arrays at
> all.

That high level guidance makes sense, but here we are attempting only
a 4MB physically contiguous allocation; anything larger than that
falls back to vmalloc(), which is merely virtually contiguous.
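
Concretely, the allocation under discussion is just the pointer
array. A minimal sketch, assuming base-page-sized folios (and
alloc_folio_array() is a name I'm inventing here, not something in
the patch): kvcalloc() tries a physically contiguous kmalloc() first
and falls back to vmalloc() if that fails.

#include <linux/mm.h>
#include <linux/slab.h>

/* Enough folio pointers to cover 2GB: 512Ki pointers == 4MB. */
static struct folio **alloc_folio_array(unsigned long nr_folios)
{
	return kvcalloc(nr_folios, sizeof(struct folio *), GFP_KERNEL);
}

static void free_folio_array(struct folio **folios)
{
	kvfree(folios);	/* handles both the kmalloc and vmalloc cases */
}
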
I'm writing this because your adjectives ("enormous", "gigantic")
make me suspect that you are referring to a 2GB allocation. But this
one is orders of magnitude smaller.
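
For reference, my understanding of the batching pattern you describe
is roughly the sketch below. pin_range_batched() and consume_batch()
are made-up names for illustration; I'm using pin_user_pages_fast()
so that no mmap_lock handling needs to be shown, and a real consumer
would of course unpin_user_pages() once it is done with the memory.

#include <linux/errno.h>
#include <linux/mm.h>

/* One page worth of page pointers per batch: 512 on 64-bit with 4KB pages. */
#define PIN_BATCH	(PAGE_SIZE / sizeof(struct page *))

/* Hypothetical consumer: stash the pins in a scatterlist, xarray, etc. */
static void consume_batch(struct page **pages, int npages)
{
}

static int pin_range_batched(unsigned long start, unsigned long nr_pages,
			     struct page **batch)
{
	while (nr_pages) {
		int n = min_t(unsigned long, nr_pages, PIN_BATCH);
		int pinned;

		pinned = pin_user_pages_fast(start, n,
					     FOLL_WRITE | FOLL_LONGTERM,
					     batch);
		if (pinned <= 0)
			return pinned ? pinned : -EFAULT;

		consume_batch(batch, pinned);

		start += (unsigned long)pinned * PAGE_SIZE;
		nr_pages -= pinned;
	}
	return 0;
}
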
thanks,
--
John Hubbard