On 10/30/24 05:39, John Hubbard wrote:
> On 10/29/24 9:33 PM, Christoph Hellwig wrote:
>> On Tue, Oct 29, 2024 at 09:30:41PM -0700, John Hubbard wrote:
>>> I do, yes. And what happens is that when you use GPUs, drivers like
>>> to pin system memory, and then point the GPU page tables to that
>>> memory. For older GPUs that don't support replayable page faults,
>>> that's required.
>>>
>>> So this behavior has been around forever.
>>>
>>> The customer was qualifying their software and noticed that before
>>> Linux 6.10, they could allocate >2GB, and with 6.11, they could
>>> not.
>>>
>>> Whether it is "wise" for user space to allocate that much at once
>>> is a reasonable question, but at least one place is (or was!) doing
>>> it.
>>
>> Still missing a callchain, which make me suspect that it is your weird
>> out of tree driver, in which case this simply does not matter.
>>
>
> I expect I could piece together something with Nouveau, given enough
> time and help from Ben Skeggs and Danillo and all...
>
> Yes, this originated with the out of tree driver. But it never occurred
> to me that upstream be uninterested in an obvious fix to an obvious
> regression.

It might be a regression even if you don't try to pin over 2GB.
High-order (above costly order) allocations can fail and/or cause
disruptive reclaim/compaction cycles even below MAX_PAGE_ORDER, so it's
better to use kvmalloc() if physical contiguity is not needed: it will
attempt the physically contiguous kmalloc() allocation with
__GFP_NORETRY (little disruption) and quickly fall back to vmalloc().
Of course, if there's a way to avoid the allocation completely, even
better. A minimal sketch of such a conversion is below.

> thanks,
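
For illustration, a minimal sketch of what such a conversion could look
like; the names (nr_pages, pages) are hypothetical and not taken from
any particular driver. kvmalloc_array() tries the physically contiguous
kmalloc() path first and falls back to vmalloc(), and kvfree() frees
either kind of allocation:

	/*
	 * Hypothetical page-pointer array used for pinning; it does not
	 * need to be physically contiguous, so the vmalloc() fallback of
	 * kvmalloc_array() is fine here.
	 */
	struct page **pages;

	pages = kvmalloc_array(nr_pages, sizeof(*pages), GFP_KERNEL);
	if (!pages)
		return -ENOMEM;

	/* ... pin_user_pages(), set up GPU page tables, etc. ... */

	kvfree(pages);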