On Fri 20-12-24 19:52:16, Yafang Shao wrote:
> On Fri, Dec 20, 2024 at 6:23 PM Michal Hocko <mhocko@xxxxxxxx> wrote:
> >
> > On Sun 15-12-24 15:34:13, Yafang Shao wrote:
> > > Implementation Options
> > > ----------------------
> > >
> > > - Solution A: Allow file caches on the unevictable list to become
> > >   reclaimable.
> > >   This approach would require significant refactoring of the page reclaim
> > >   logic.
> > >
> > > - Solution B: Prevent file caches from being moved to the unevictable list
> > >   during mlock and ignore the VM_LOCKED flag during page reclaim.
> > >   This is a more straightforward solution and is the one we have chosen.
> > >   If the file caches are reclaimed from the download-proxy's memcg and
> > >   subsequently accessed by tasks in the application’s memcg, a filemap
> > >   fault will occur. A new file cache will be faulted in, charged to the
> > >   application’s memcg, and locked there.
> >
> > Both options are silently breaking userspace because a non failing mlock
> > doesn't give guarantees it is supposed to AFAICS.
>
> It does not bypass the mlock mechanism; rather, it defers the actual
> locking operation to the page fault path. Could you clarify what you
> mean by "a non-failing mlock"? From what I can see, mlock can indeed
> fail if there isn’t sufficient memory available. With this change, we
> are simply shifting the potential failure point to the page fault path
> instead.

Your change will cause mlocked pages (as mlock syscall returns success)
to be reclaimable later on. That breaks the basic mlock contract.

-- 
Michal Hocko
SUSE Labs