Re: Explanation for difference between memcg swap accounting and smaps_rollup

Hi Benjamin,

On Fri, Feb 25, 2022 at 05:10:05PM +0100, Benjamin Berg wrote:
> Hi,
> 
> I am seeing memory.swap.current usages for the gnome-shell cgroup that
> seem high if I compare them to smaps_rollup for the contained
> processes. As I don't have an explanation, I thought I would ask here
> (shared memory?).
> 
> What I am seeing is (see below, after a tail /dev/zero):
> 
> memory.swap.current:
>   686MiB
> "Swap" lines from /proc/$pid/smaps_rollup added up:
>   435MiB
> 
> We should be moving launched applications out of the shell cgroup
> before doing execve(), so I think we can rule out that as a possible
> explanation.
>
> I am mostly curious as we currently do swap based kills using systemd-
> oomd. So if swap accounting for GNOME Shell is high, then it makes it a
> more likely target unfortunately.

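Shared memory is one option: for example, tmpfs files that you access
with open(), read(), write() and close() instead of mmap(). Such pages
are not mapped into any process, so they never show up in smaps, but
once swapped out they still count toward memory.swap.current.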

Another option is swapcache. When swap space is plentiful, the kernel
holds on to the swap copies of pages even after they've been swapped
back in. This way, the next time they need to get "swapped out", no
IO is required; the kernel can just drop the in-memory copy. From an
smaps POV, swapped-in pages are Rss, not Swap. But their swap copies
still contribute to memory.swap.current, hence the discrepancy.
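
To put a number on that discrepancy, a minimal Python sketch along
these lines could be used. It is not part of the original report: the
function names are made up for illustration, and it assumes you have
already read memory.swap.current (bytes) and the smaps_rollup contents
of every process in the cgroup into strings.

```python
import re

def sum_smaps_swap_kib(smaps_rollup_texts):
    """Sum the 'Swap:' fields (in kB) over several smaps_rollup dumps."""
    total = 0
    for text in smaps_rollup_texts:
        for line in text.splitlines():
            # Matches "Swap:" only, not "SwapPss:".
            m = re.match(r"Swap:\s+(\d+)\s+kB", line)
            if m:
                total += int(m.group(1))
    return total

def unexplained_swap_mib(swap_current_bytes, smaps_rollup_texts):
    """Swap charged to the cgroup but not visible as Swap in smaps, in MiB."""
    smaps_bytes = sum_smaps_swap_kib(smaps_rollup_texts) * 1024
    return (swap_current_bytes - smaps_bytes) / (1024 * 1024)
```

The result lumps together everything smaps cannot see, so it covers
both unmapped shared memory and cache-only swap copies; it cannot tell
the two apart.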

In terms of OOM killing, the kernel will stop keeping swap copies
around when more than half of swap space is used. That should give
plenty of headroom toward the OOM killing thresholds.
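
A rough way to see which side of that halfway point a machine is on is
to compare SwapFree against SwapTotal from /proc/meminfo. The sketch
below is only an approximation from userspace, under the assumption
that the kernel's own test amounts to "free swap is less than half of
total swap"; the helper names are hypothetical.

```python
def parse_meminfo_kib(text):
    """Parse /proc/meminfo-style text into a {field: value} dict."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields

def swap_more_than_half_used(meminfo_text):
    """True if less than half of the total swap space is still free."""
    info = parse_meminfo_kib(meminfo_text)
    total, free = info["SwapTotal"], info["SwapFree"]
    return free * 2 < total
```

For example, swap_more_than_half_used(open("/proc/meminfo").read())
gives the answer for the running system.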

If you want to poke around on your machine, here is a drgn script that
tallies up the cache-only swap entries:

---
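#!/usr/bin/drgn

# Number of swap_info[] slots to scan; the real array size depends on
# the kernel configuration, so adjust if needed.
MAX_SWAPFILES = 25
# swap_map flag: the slot has a page in the swap cache.
SWAP_HAS_CACHE = 0x40
# Assumes 4K pages.
PAGE_SIZE_KB = 4

swapcache = 0
for i in range(MAX_SWAPFILES):
    si = prog['swap_info'][i]
    if si:  # skip unused slots (NULL pointers)
        for offset in range(si.max.value_()):
            # Exactly SWAP_HAS_CACHE: no other references to the
            # slot, only the swap cache is holding on to it.
            if si.swap_map[offset].value_() == SWAP_HAS_CACHE:
                swapcache += 1
print("Cache-only swap space: %.2fM" % (swapcache * PAGE_SIZE_KB / 1024.0))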


