On Sat, Jun 8, 2024 at 8:53 AM Takero Funaki <flintglass@xxxxxxxxx> wrote: > > This series addresses two issues and introduces a minor improvement in > zswap global shrinker: > > 1. Fix the memcg iteration logic that breaks iteration on offline memcgs. > 2. Fix the error path that aborts on expected error codes. > 3. Add proactive shrinking at 91% full, for 90% accept threshold. > Taking a step back from the correctness conversation, could you include in the changelog of the patches and cover letter a realistic scenario, along with user space-visible metrics that show (ideally all 4, but at least some of the following): 1. A user problem (that affects performance, or usability, etc.) is happening. 2. The root cause is what we are trying to fix (for e.g in patch 1, we are skipping over memcgs unnecessarily in the global shrinker loop). 3. The fix alleviates the root cause in b) 4. The userspace-visible problem goes away or is less serious. I have already hinted in a previous response, but global shrinker is rarely triggered in production. There are lots of factors that would prevent this from triggering: 1. Max zswap pool size 20% of memory by default, which is a lot. 2. Swapfile size also limits the size of the amount of data storable in the zswap pool. 3. Other cgroup constraints (memory.max, memory.zswap.max, memory.swap.max) also limit a cgroup's zswap usage. I do agree that patch 1 at least is fixing a problem, and probably patch 2 too but please justify why we are investing in the extra complexity to fix this problem in the first place.