On 10/29/24 9:33 PM, Christoph Hellwig wrote:
> On Tue, Oct 29, 2024 at 09:30:41PM -0700, John Hubbard wrote:
>> I do, yes. And what happens is that when you use GPUs, drivers like
>> to pin system memory, and then point the GPU page tables to that
>> memory. For older GPUs that don't support replayable page faults,
>> that's required.
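
To make that concrete, the pinning pattern looks roughly like the
sketch below. This is only an illustration, not the actual driver
code; gpu_map_pages() is a made-up placeholder for the driver-specific
page-table setup.

#include <linux/mm.h>
#include <linux/slab.h>

/* Hypothetical helper, assumed here only for illustration */
void gpu_map_pages(struct page **pages, long nr_pages);

static int sketch_pin_user_buffer(unsigned long user_addr, unsigned long size)
{
	long nr_pages = size >> PAGE_SHIFT;
	struct page **pages;
	long pinned;

	pages = kvmalloc_array(nr_pages, sizeof(*pages), GFP_KERNEL);
	if (!pages)
		return -ENOMEM;

	/* FOLL_LONGTERM: the device may hold the pin indefinitely */
	pinned = pin_user_pages_fast(user_addr, nr_pages,
				     FOLL_WRITE | FOLL_LONGTERM, pages);
	if (pinned != nr_pages) {
		/* real code must cope with a short or failed pin */
		if (pinned > 0)
			unpin_user_pages(pages, pinned);
		kvfree(pages);
		return pinned < 0 ? pinned : -EFAULT;
	}

	/* point the GPU page tables at the pinned pages */
	gpu_map_pages(pages, nr_pages);

	/* teardown later: unpin_user_pages(pages, nr_pages); kvfree(pages); */
	return 0;
}

FOLL_LONGTERM is what marks the pin as one the device may hold onto
indefinitely; teardown is the mirror image, unpin then free the array.
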
>> So this behavior has been around forever.
>>
>> The customer was qualifying their software and noticed that before
>> Linux 6.10, they could allocate >2GB, and with 6.11, they could
>> not.
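
For a sense of scale, back-of-the-envelope only and assuming 4 KiB
pages and 8-byte pointers, a single 2 GB request looks like this:

	unsigned long size     = 2UL << 30;                        /* 2 GiB */
	unsigned long nr_pages = size >> 12;                       /* 524288 pages of 4 KiB */
	unsigned long array_sz = nr_pages * sizeof(struct page *); /* 4 MiB of page pointers */

In other words, one request has to track more than half a million
pages, and anything allocated per page grows accordingly.
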
>> Whether it is "wise" for user space to allocate that much at once
>> is a reasonable question, but at least one place is (or was!) doing
>> it.
>
> Still missing a callchain, which makes me suspect that it is your weird
> out of tree driver, in which case this simply does not matter.
I expect I could piece together something with Nouveau, given enough
time and help from Ben Skeggs and Danillo and all...
Yes, this originated with the out of tree driver. But it never occurred
to me that upstream would be uninterested in an obvious fix to an
obvious regression.
thanks,
--
John Hubbard