Hi Stefan,

Stefan Priebe - Profihost AG writes:
While using kernel 4.19.55 and cgroupv2, I set a MemoryHigh value for a varnish service. It happens that the varnish.service cgroup reaches its MemoryHigh value and stops working due to throttling.
In that kernel version, the only throttling we have is reclaim-based throttling (I also have a patch out to do schedule-based throttling, but it's not in mainline yet). If the application is slowing down, it likely means that we are struggling to reclaim pages.
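One way to confirm that the cgroup really is being pushed into reclaim by its high boundary is to watch the "high" counter in the cgroup's memory.events file, which counts how often tasks were throttled and sent into reclaim for exceeding memory.high. The following is only a rough sketch: the cgroup path and the helper names are assumptions for illustration, and the real path depends on where systemd places the unit.

```python
import time
from pathlib import Path

# Assumed location of the unit's cgroup; adjust for the actual hierarchy.
EVENTS_PATH = Path("/sys/fs/cgroup/system.slice/varnish.service/memory.events")


def parse_flat_keyed(text):
    """Parse 'key value' lines (the cgroup v2 flat-keyed format) into a dict."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            counters[parts[0]] = int(parts[1])
    return counters


def high_events_delta(before_text, after_text):
    """Return how many 'high' events occurred between two snapshots."""
    before = parse_flat_keyed(before_text).get("high", 0)
    after = parse_flat_keyed(after_text).get("high", 0)
    return after - before


def sample_high_events(path=EVENTS_PATH, interval=10):
    """Read memory.events twice, 'interval' seconds apart (hypothetical helper)."""
    first = path.read_text()
    time.sleep(interval)
    second = path.read_text()
    return high_events_delta(first, second)
```

Calling sample_high_events() while the service is misbehaving shows whether the counter is still climbing; a steadily growing value would suggest the cgroup keeps hitting its high boundary.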
What I don't understand is that the process itself only consumes 40% of its cgroup usage. So the other 60% is dirty dentries and inode cache. If I issue an echo 3 > /proc/sys/vm/drop_caches, the varnish cgroup memory usage drops to the 50% of the pure process.
As a caching server, doesn't Varnish have a lot of hot inodes/dentries in memory? If they are hot, it's possible it's hard for us to evict them.
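To see how the cgroup's charge splits between process memory and caches, the cgroup's memory.stat file can be broken down by category. Dentries and inodes are slab objects, so they would be expected to show up under slab_reclaimable. This is a minimal sketch; the function names are made up for illustration, and only a handful of memory.stat keys are considered, so the shares are relative to those keys rather than to memory.current.

```python
# Keys from the cgroup v2 memory.stat file that this sketch looks at.
KEYS = ("anon", "file", "slab_reclaimable", "slab_unreclaimable")


def parse_memory_stat(text):
    """Parse 'key value' lines from memory.stat into a dict of ints."""
    stat = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stat[parts[0]] = int(parts[1])
    return stat


def breakdown(stat, keys=KEYS):
    """Return each key's share of the sum of the selected keys."""
    total = sum(stat.get(key, 0) for key in keys)
    if total == 0:
        return {key: 0.0 for key in keys}
    return {key: stat.get(key, 0) / total for key in keys}


def format_breakdown(shares):
    """Render the shares as one 'key: percent' line per category."""
    return "\n".join(f"{key}: {share:.1%}" for key, share in shares.items())
```

Feeding it the contents of the varnish cgroup's memory.stat, for example via print(format_breakdown(breakdown(parse_memory_stat(text)))), would show whether the non-process share is mostly page cache (file) or mostly slab.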
I thought that the kernel would trigger automatic memory reclaim to drop caches if a cgroup reaches its memory high value.
It does, that's the throttling you're seeing :-) I think more information is needed to work out what's going on here. For example: what do your kswapd counters look like? What does "stops working due to throttling" mean -- are you stuck in reclaim?
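For the kswapd counters, something like the sketch below could be used to compare two snapshots of /proc/vmstat and report how much was scanned versus actually reclaimed. Note that these counters are system-wide, so they show overall reclaim pressure rather than this cgroup alone, and the function names here are only illustrative.

```python
# Reclaim counters from /proc/vmstat that this sketch compares.
COUNTERS = ("pgscan_kswapd", "pgsteal_kswapd", "pgscan_direct", "pgsteal_direct")


def parse_vmstat(text):
    """Parse 'name value' lines from /proc/vmstat into a dict of ints."""
    values = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            values[parts[0]] = int(parts[1])
    return values


def reclaim_summary(before_text, after_text):
    """Return counter deltas plus steal/scan ratios between two snapshots."""
    before = parse_vmstat(before_text)
    after = parse_vmstat(after_text)
    delta = {name: after.get(name, 0) - before.get(name, 0) for name in COUNTERS}
    summary = dict(delta)
    for kind in ("kswapd", "direct"):
        scanned = delta[f"pgscan_{kind}"]
        stolen = delta[f"pgsteal_{kind}"]
        # A ratio is only meaningful if pages were scanned in the interval.
        summary[f"{kind}_efficiency"] = stolen / scanned if scanned > 0 else None
    return summary
```

Taking one snapshot with open("/proc/vmstat").read(), waiting a few seconds, and taking another gives the two inputs. A lot of scanning with little stolen would be consistent with reclaim struggling to free pages.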
Thanks, Chris