On Sat, 2024-10-26 at 23:46 -0700, Yosry Ahmed wrote:
> On Sat, Oct 26, 2024 at 8:14 PM Nhat Pham <nphamcs@xxxxxxxxx> wrote:
> >
> > On Sat, Oct 26, 2024 at 5:29 PM Konstantin Kharlamov
> > <Hi-Angel@xxxxxxxxx> wrote:
> > >
> > > That was a good idea! The
> > > `/sys/fs/cgroup/system.slice/memory.swap.current` seems to have
> > > the missing half of the SWAP memory. From my understanding of the
> > > `systemctl status` graph `system.slice` and `user.slice` groups do
> > > not intersect, and by adding up `system.slice/…` + `user.slice/…`
> > > I get around 8G.
> > >
> > > However, I'm still unclear what does this memory belong to.
> > > `system.slice/memory.swap.current` is 4.4G currently, that's a
> > > lot and I'm not seeing anything that could take so much memory.
> > I am not very familiar with what usually runs in system.slice.
> >
> > I assume you do not have any proactive memory reclaimer? :) I
> > believe the top utility can display swap usage by process. Have you
> > tried that?
> >
> > There are a couple of edge cases - for instance, if you disable
> > zswap writeback and zswap at the same time. We will allocate slots
> > on swapfile, and store it at the page table entry, but we cannot
> > store the page's content in zswap or the swapfile, so the page
> > remains in memory. You're occupying swap space, but are not really
> > saving any memory usage.
> >
> > IIRC, there is also an edge case where a page is faulted back into
> > memory from swap, but the associated swap space cannot be
> > immediately released. This should be temporary though - memory
> > reclaimer will attempt to release these pages later on, or they can
> > be released when we scan the swapfile for slots during swap out.
>
> I don't think this is an edge case. I think when we swapin a page we
> generally leave it in the swapcache if there is no pressure on swap
> space.
> In that case the memory is not really swapped out, but because
> it remains in the swapcache it is still reserving a swap slot, so it
> shows up as swap usage.
>
> Konstantin, could you check the amount of swapcache you have, whether
> through /proc/vmstat or memory.stat on both user and system slices?

Sure

    λ grep cache /sys/fs/cgroup/*/memory.stat
    …
    /sys/fs/cgroup/system.slice/memory.stat:swapcached 434917376
    /sys/fs/cgroup/user.slice/memory.stat:swapcached 15478784

`434917376` is about 0.4G, not much. In comparison,
`system.slice/memory.swap.current` is currently `4764139520 = 4.4G`.
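
For reference, here is the same arithmetic as a small Python sketch. It is only an illustration: the helper names (`parse_memory_stat`, `to_gib`) are made up for this message, and the inputs are the two system.slice values quoted above rather than a live read of the files under /sys/fs/cgroup. It also assumes that `swapcached` pages are counted inside `memory.swap.current`, so the difference is the swap that the swapcache does not explain.

```python
GIB = 1024 ** 3  # bytes per GiB


def parse_memory_stat(text):
    """Parse the 'key value' lines of a cgroup v2 memory.stat file into a dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Keep only well-formed 'name integer' lines; skip anything else.
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def to_gib(num_bytes):
    """Convert a byte count to GiB as a float."""
    return num_bytes / GIB


# Values for system.slice as quoted in this message (not read from disk).
system_stat = parse_memory_stat("swapcached 434917376\n")
swap_current = 4764139520

swapcached = system_stat["swapcached"]
# Assumption: swapcache pages still hold a swap slot, so they are part of
# memory.swap.current; the remainder is swap not explained by the swapcache.
not_in_swapcache = swap_current - swapcached

print(f"swapcached:        {to_gib(swapcached):.2f} GiB")
print(f"swap.current:      {to_gib(swap_current):.2f} GiB")
print(f"not in swapcache:  {to_gib(not_in_swapcache):.2f} GiB")
```

With these inputs the remainder is 4329222144 bytes, so roughly 4.0G of the system.slice swap usage is still unaccounted for after subtracting the swapcache.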