On 10/30/24 05:39, John Hubbard wrote:
> On 10/29/24 9:33 PM, Christoph Hellwig wrote:
>> On Tue, Oct 29, 2024 at 09:30:41PM -0700, John Hubbard wrote:
>>> I do, yes. And what happens is that when you use GPUs, drivers like
>>> to pin system memory, and then point the GPU page tables to that
>>> memory. For older GPUs that don't support replayable page faults,
>>> that's required.
>>>
>>> So this behavior has been around forever.
>>>
>>> The customer was qualifying their software and noticed that before
>>> Linux 6.10, they could allocate >2GB, and with 6.11, they could
>>> not.
>>>
>>> Whether it is "wise" for user space to allocate that much at once
>>> is a reasonable question, but at least one place is (or was!) doing
>>> it.
>>
>> Still missing a callchain, which make me suspect that it is your weird
>> out of tree driver, in which case this simply does not matter.
>>
>
> I expect I could piece together something with Nouveau, given enough
> time and help from Ben Skeggs and Danillo and all...
>
> Yes, this originated with the out of tree driver. But it never occurred
> to me that upstream be uninterested in an obvious fix to an obvious
> regression.

It might be a regression even if you don't try to pin over 2GB.
High-order (above costly order) allocations can fail and/or cause
disruptive reclaim/compaction cycles even below MAX_PAGE_ORDER, so it's
better to use kvmalloc() if physical contiguity is not needed: it will
attempt the physically contiguous kmalloc() allocation with
__GFP_NORETRY (little disruption) and quickly fall back to vmalloc().
Of course, if there's a way to avoid the allocation completely, even
better. A minimal sketch of such a conversion is below.

> thanks,
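
For illustration, a minimal sketch of what such a conversion could look
like; the names (nr_pages, pages) are hypothetical and not taken from
any particular driver. kvmalloc_array() tries the physically contiguous
kmalloc() path first and falls back to vmalloc(), and kvfree() frees
either kind of allocation:

	/*
	 * Hypothetical page-pointer array used for pinning; it does not
	 * need to be physically contiguous, so the vmalloc() fallback of
	 * kvmalloc_array() is fine here.
	 */
	struct page **pages;

	pages = kvmalloc_array(nr_pages, sizeof(*pages), GFP_KERNEL);
	if (!pages)
		return -ENOMEM;

	/* ... pin_user_pages(), set up GPU page tables, etc. ... */

	kvfree(pages);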