On 30.10.24 07:50, John Hubbard wrote:
> On 10/29/24 11:18 PM, Alistair Popple wrote:
>> John Hubbard <jhubbard@xxxxxxxxxx> writes:
>>> On 10/29/24 9:42 PM, Christoph Hellwig wrote:
>>>> On Tue, Oct 29, 2024 at 09:39:15PM -0700, John Hubbard wrote:
>>>>> ...
>>>> Because pinning down these amounts of memory is completely insane.
>>>> I don't mind the switch to kvmalloc, but we need to put in an upper
>>>> bound of what can be pinned.
>>> I'm wondering, though, how it is that we decide how much of the user's
>>> system we prevent them from using? :) People with hardware accelerators
>>> do not always have page fault capability, and yet these troublesome
>>> users insist on stacking their system full of DRAM and then pointing
>>> the accelerator to it.
>>> How would we choose a value? Memory sizes keep going up...
>> The obvious answer is you let users decide. I did have a patch series to
>> do that via a cgroup[1]. However, I dropped that series mostly because I
>> couldn't find any users of such a limit to provide feedback on how they
>> would use it or how they wanted it to work.
> Trawling through the discussion there, I see that Jason Gunthorpe mentioned:
> "Things like VFIO & KVM use cases effectively pin 90% of all system memory"

The unusual thing is not the amount of system memory we are pinning but
*how many* pages we try to pin in a single call.
If you stare at vfio_pin_pages_remote, we seem to be batching it.
long req_pages = min_t(long, npage, batch->capacity);
Which is
#define VFIO_BATCH_MAX_CAPACITY (PAGE_SIZE / sizeof(struct page *))
So you can fix this in your driver ;)
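
For illustration only, here is a minimal sketch of that driver-side batching
(not code from this thread): one page worth of struct page pointers is reused
for each chunk, so no huge temporary array is ever allocated. The helper name
my_pin_and_use_range() and the use() callback are made up, and this assumes
the recent pin_user_pages_remote() signature with the caller-side
mmap_read_lock() taken here (locked == NULL):

	#include <linux/mm.h>
	#include <linux/gfp.h>

	/* Same cap VFIO uses: one page worth of struct page pointers. */
	#define MY_BATCH_MAX_CAPACITY	(PAGE_SIZE / sizeof(struct page *))

	/*
	 * Hypothetical helper: pin a large user range in small batches and
	 * hand each batch to use(), which is expected to unpin the pages
	 * once it is done with them.
	 */
	static long my_pin_and_use_range(struct mm_struct *mm,
					 unsigned long start, long npage,
					 unsigned int gup_flags,
					 void (*use)(struct page **pages,
						     long n))
	{
		struct page **pages;
		long done = 0, ret = 0;

		pages = (struct page **)__get_free_page(GFP_KERNEL);
		if (!pages)
			return -ENOMEM;

		mmap_read_lock(mm);
		while (done < npage) {
			long req_pages = min_t(long, npage - done,
					       MY_BATCH_MAX_CAPACITY);

			/* May legitimately pin fewer pages than requested. */
			ret = pin_user_pages_remote(mm,
						    start + done * PAGE_SIZE,
						    req_pages, gup_flags,
						    pages, NULL);
			if (ret <= 0)
				break;

			use(pages, ret);
			done += ret;
		}
		mmap_read_unlock(mm);

		free_page((unsigned long)pages);
		if (done)
			return done;
		return ret < 0 ? ret : -EFAULT;
	}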
We should maybe try a similar limit internally: if you call
pin_user_pages_remote() with a large number, we'll cap it at some magic
value (similar to above). The caller will simply realize that not all
pages were pinned and will retry.
See get_user_pages_remote(): "Returns either number of pages pinned
(which may be less than the number requested), or an error. Details
about the return value:"
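
Purely as a sketch of that documented contract (again, made-up code, not an
existing kernel helper): a caller with a full-sized pages array that tolerates
short returns, assuming it already holds mmap_read_lock(mm), could loop like
this hypothetical my_pin_all():

	static long my_pin_all(struct mm_struct *mm, unsigned long start,
			       long npage, unsigned int gup_flags,
			       struct page **pages)
	{
		long pinned = 0;

		while (pinned < npage) {
			long ret;

			/* A short return here is not an error, just retry. */
			ret = pin_user_pages_remote(mm,
						    start + pinned * PAGE_SIZE,
						    npage - pinned, gup_flags,
						    pages + pinned, NULL);
			if (ret <= 0) {
				/* Undo partial progress before failing. */
				unpin_user_pages(pages, pinned);
				return ret < 0 ? ret : -EFAULT;
			}
			pinned += ret;
		}
		return pinned;
	}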
Alternatively, I recall there was a way to avoid the temporary
allocation ... let me hack up a prototype real quick.
--
Cheers,
David / dhildenb