On Thu, 2025-01-09 at 12:22 +0100, David Hildenbrand wrote:
> On 07.01.25 19:07, Shakeel Butt wrote:
> > On Tue, Jan 07, 2025 at 09:34:49AM +0100, David Hildenbrand wrote:
> > > On 06.01.25 19:17, Shakeel Butt wrote:
> > > > On Mon, Jan 06, 2025 at 11:19:42AM +0100, Miklos Szeredi wrote:
> > > > > On Fri, 3 Jan 2025 at 21:31, David Hildenbrand <david@xxxxxxxxxx> wrote:
> > > > > > In any case, having movable pages be turned unmovable due to persistent
> > > > > > writeback is something that must be fixed, not worked around. Likely a
> > > > > > good topic for LSF/MM.
> > > > >
> > > > > Yes, this seems a good cross fs-mm topic.
> > > > >
> > > > > So the issue discussed here is that movable pages used for fuse
> > > > > page-cache cause problems when memory needs to be compacted. The
> > > > > problem is either that
> > > > >
> > > > >  - the page is skipped, leaving the physical memory block unmovable
> > > > >
> > > > >  - the compaction is blocked for an unbounded time
> > > > >
> > > > > While the new AS_WRITEBACK_INDETERMINATE could potentially make things
> > > > > worse, the same thing happens on readahead, since the new page can be
> > > > > locked for an indeterminate amount of time, which can also block
> > > > > compaction, right?
> > >
> > > Yes, as memory hotplug + virtio-mem maintainer my bigger concern is these
> > > pages residing in ZONE_MOVABLE / MIGRATE_CMA areas where there *must not be
> > > unmovable pages ever*. Not triggered by an untrusted source, not triggered
> > > by a trusted source.
> > >
> > > It's a violation of core-mm principles.
> >
> > The "must not be unmovable pages ever" is a very strong statement and we
> > are violating it today and will keep violating it in future. Any
> > page/folio under lock or writeback or have reference taken or have been
> > isolated from their LRU is unmovable (most of the time for small period
> > of time).
>
> ^ this: "small period of time" is what I meant.
>
> Most of these things are known to not be problematic: retrying a couple
> of times makes it work, that's why migration keeps retrying.
>
> Again, as an example, we allow short-term O_DIRECT but disallow
> long-term page pinning. I think there were concerns at some point if
> O_DIRECT might also be problematic (I/O might take a while), but so far
> it was not a problem in practice that would make CMA allocations easily
> fail.
>
> vmsplice() is a known problem, because it behaves like O_DIRECT but
> actually triggers long-term pinning; IIRC David Howells has this on his
> todo list to fix. [I recall that seccomp disallows vmsplice by default
> right now]
> > These operations are being done all over the place in kernel.
> > Miklos gave an example of readahead.
>
> I assume you mean "unmovable for a short time", correct, or can you
> point me at that specific example; I think I missed that.
>
> > The per-CPU LRU caches are another
> > case where folios can get stuck for long period of time.
>
> Which is why memory offlining disables the lru cache. See
> lru_cache_disable(). Other users that care about that drain the LRU on
> all cpus.
>
> > Reclaim and
> > compaction can isolate a lot of folios that they need to have
> > too_many_isolated() checks. So, "must not be unmovable pages ever" is
> > impractical.
>
> "must only be short-term unmovable", better?
>

Still a little ambiguous. How short is "short-term"? Are we talking
milliseconds or minutes?

Imposing a hard timeout on writeback requests to unprivileged FUSE
servers might give us a better guarantee of forward-progress, but it
would probably have to be on the order of at least a minute or so to be
workable.

> >
> > The point is that, yes we should aim to improve things but in iterations
> > and "must not be unmovable pages ever" is not something we can achieve
> > in one step.
>
> I agree with the "improve things in iterations", but as
> AS_WRITEBACK_INDETERMINATE has the FOLL_LONGTERM smell to it, I think we
> are making things worse.
>
> And as this discussion has been going on for too long, to summarize my
> point: there exist conditions where pages are short-term unmovable, and
> possibly some to be fixed that turn pages long-term unmovable (e.g.,
> vmsplice); that does not mean that we can freely add new conditions that
> turn movable pages unmovable long-term or even forever.
>
> Again, this might be a good LSF/MM topic. If I would have the capacity I
> would suggest a topic around which things are known to cause pages to be
> short-term or long-term unmovable/unsplittable, and which can be
> handled, which not. Maybe I'll find the time to propose that as a topic.
>

This does sound like great LSF/MM fodder! I predict that this session
will run long! ;)

-- 
Jeff Layton <jlayton@xxxxxxxxxx>