On Fri, 2024-10-25 at 00:50 -0700, Yosry Ahmed wrote:
> On Thu, Oct 24, 2024 at 11:41 PM Konstantin Kharlamov
> <Hi-Angel@xxxxxxxxx> wrote:
> >
> > On Thu, 2024-10-24 at 13:47 -0700, Yosry Ahmed wrote:
> > > On Thu, Oct 24, 2024 at 6:02 AM Konstantin Kharlamov
> > > <Hi-Angel@xxxxxxxxx> wrote:
> > > >
> > > > When ZSWAP is disabled, the `Zswap` and `Zswapped` in meminfo are
> > > > still non-zero.
> > > > IOW, ZSWAP doesn't free memory upon being disabled.
> > > >
> > > > Stumbled upon this while trying to figure out where did ≈4G of my
> > > > SWAP memory disappear. Been seeing some unknown memory in SWAP for
> > > > years, now I suspect ZSWAP might be the culprit. But no way to know
> > > > for sure because of this bug.
> > > >
> > > > # Steps to reproduce
> > > >
> > > > 1. Enable ZSWAP
> > > > 2. Wait for `grep Zswap /proc/meminfo` to become non-zero
> > > > 3. Disable ZSWAP via `sudo sh -c "echo 0 > /sys/module/zswap/parameters/enabled"`
> > > > 4. Look at `grep Zswap /proc/meminfo`
> > > >
> > > > ## Expected
> > > >
> > > > The rows are zero because ZSWAP is disabled.
> > >
> > > Not really, the expected behavior is that further swapouts will not
> > > go to zswap, but pages that are already compressed in zswap will not
> > > be written out to the backing swapfile or swapped back to memory. A
> > > swapoff would be required for the latter.
> > >
> > > This is documented in:
> > > https://docs.kernel.org/admin-guide/mm/zswap.html#overview.
> >
> > Oh, I see, thank you, sorry for the noise.
> >
> > Then, I'm curious, is it correct to assume that this `Zswap`-prefixed
> > memory mentioned in meminfo is never the one that is in SWAP? I mean,
> > Zswap being a buffer before data goes to swap kind of implies that
> > yes, the data *either* in zswap or in swap. But just wanted to hear
> > that explicitly.
>
> I know this makes sense, but unfortunately no.
> Zswap is currently transparent to the rest of the system. For all
> intents and purposes, pages in zswap are considered in swap. You
> cannot even use zswap without an actual swapfile. So the zswap stats
> should be a subset of the swap stats.
>
> FWIW, Nhat is working on restructuring this to have zswap be its own
> entity, separate from any swapfiles.
>
> >
> > The background to my question is that I'm trying to find the culprit
> > of some "phantom memory" eventually filling up my SWAP. This memory
> > is not one accounted to apps (as calculated via `smem`), nor to
> > tmpfs. So my next suspect was something related to ZSwap.
> >
>
> As I mentioned, zswap should be transparent to the rest of the system,
> so it shouldn't make a difference in this case whether the pages are
> in zswap or in the swapfile.
>
> You can use the memory.swap.current counter to find out which memory
> cgroup currently has swapped out pages (in zswap or in the swapfile).
> This should help find the application that has memory in swap. If you
> want to find the exact type of memory (e.g. anon vs tmpfs), that would
> be more tricky. Perhaps you can swapoff and see what counters increase
> in memory.stat of the relevant memory cgroup?

Thank you. So, I've waited until my SWAP got almost full again
(apparently my new workflow triggers that a lot). It is 7.5G out of 8G
in total. 437M is taken by tmpfs'es; let's subtract that for
simplicity, so I have 7G taken by something else.

Now I'm looking at `/sys/fs/cgroup/user.slice/memory.swap.current` and
it's 4422422528 = 4.1G. That's a lot less than 7G.

I'm certain this "phantom swap memory" is hidden in `user.slice`,
because if I wait until the OOM-killer gets triggered and kills some
app, my user systemd crashes for some reason, taking down the entire
user session, and afterwards SWAP is almost free.

I think this memory.swap.current isn't much different from just asking
`smem` for the SWAP taken by individual apps.
As of writing these words, that's 4.6G for the entire system, as
calculated by:

    sudo smem -c "name user pid vss pss rss swap" | awk '{total+=$7} END {print "Swap memory: " total "K"}'

So 7 - 4.6 = 2.4G of some "phantom" memory.
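
In case it helps anyone redo the arithmetic above, here is a rough Python sketch of the same accounting. It is only an illustration under a few assumptions: cgroup v2 is mounted at /sys/fs/cgroup, the memory controller is enabled for the top-level cgroups, and memory.swap.current is hierarchical (so summing the top-level cgroups should not double count). The helper names (`parse_meminfo`, `swap_by_cgroup`, `unaccounted`, `gib`, `report`) are made up for this example, and the tmpfs share still has to be passed in by hand because, as far as I can tell, none of these files report it directly.

```python
from pathlib import Path

CGROUP_ROOT = Path("/sys/fs/cgroup")  # assumption: cgroup v2 is mounted here


def parse_meminfo(text):
    """Return {field: value} from /proc/meminfo-style text, with kB rows converted to bytes."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if not parts:
            continue
        value = int(parts[0])
        # Most rows end in "kB"; a few (e.g. HugePages_Total) are plain counts.
        fields[name.strip()] = value * 1024 if parts[-1] == "kB" else value
    return fields


def swap_by_cgroup(root=CGROUP_ROOT):
    """Map each top-level cgroup name to its memory.swap.current value in bytes."""
    usage = {}
    for counter in sorted(Path(root).glob("*/memory.swap.current")):
        usage[counter.parent.name] = int(counter.read_text())
    return usage


def unaccounted(swap_used, tmpfs_in_swap, accounted):
    """Swap in use that neither tmpfs nor the given counters explain (all in bytes)."""
    return swap_used - tmpfs_in_swap - accounted


def gib(n):
    """Format a byte count the way the figures in this mail are written."""
    return f"{n / 2**30:.1f}G"


def report(tmpfs_in_swap=0):
    """Print swap usage, zswap figures, per-cgroup swap, and the unexplained remainder."""
    mem = parse_meminfo(Path("/proc/meminfo").read_text())
    used = mem["SwapTotal"] - mem["SwapFree"]
    print("swap used:", gib(used))
    # Zswapped should be the uncompressed size of what zswap holds, Zswap the compressed size.
    print("Zswapped:", gib(mem.get("Zswapped", 0)), "Zswap:", gib(mem.get("Zswap", 0)))
    per_cgroup = swap_by_cgroup()
    for name, value in per_cgroup.items():
        print(f"{name}: {gib(value)}")
    print("unaccounted:", gib(unaccounted(used, tmpfs_in_swap, sum(per_cgroup.values()))))
```

Calling `report(tmpfs_in_swap=437 * 2**20)` would plug in the tmpfs figure from above. Swap charged to processes that sit directly in the root cgroup would end up in the "unaccounted" number, so a non-zero remainder is not by itself proof of a leak.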