Hi Michal, Chris,

Well, tar made me unhappy: it only collected the list of files, not
their content, from /sys/fs/cgroup/...

But if I set memory.max = memory.high, reclaim seems to work and memory
pressure remains zero for the cg.
If I set memory.max = $((memory.high + 128M)), memory pressure rises
immediately (when memory.current ~= memory.high).

Returning to memory.max = memory.high gets things running again, and
memory pressure starts dropping immediately.

Could it be that the wrong limit of high/max is being used for reclaim?

Bruno

On Fri, 10 Apr 2020 09:15:25 +0200 Bruno Prémont <bonbons@xxxxxxxxxxxxxxxxx> wrote:
> Hi Michal,
>
> On Thu, 9 Apr 2020 17:25:40 Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> > Your earlier stat snapshot doesn't indicate a big problem with the
> > reclaim though:
> >
> > memory.stat:pgscan 47519855
> > memory.stat:pgsteal 44933838
> >
> > This tells the overall reclaim effectiveness was 94%. Could you try to
> > gather snapshots with a 1s granularity starting before your run your
> > backup to see how those numbers evolve? Ideally with timestamps to
> > compare with the actual stall information.
>
> Attached is a long collection of
>   date memory.current memory.stat[pgscan] memory.stat[pgsteal]
>
> It started while backup was running +/- smoothly with its memory.high
> set to 4294967296 (4G instead of 2G) until backup finished around 20:22.
>
> From system memory pressure RRD-graph I see pressure (around 60)
> between about 19:50 to 20:10 while very small the rest of the time
> (below 1).
>
> I started a new backup run this morning grabbing full info snapshots of
> backup cgroup at 1s interval in order to get a better/more complete
> picture and CG's memory.high back to 2G limit.
>
> I have the impression as if reclaim was somehow triggered not enough or
> not strongly enough compared to the IO performed within the CG
> (complete backup covers 130G of data, data being read in blocks of
> 128kB at a smooth-running rate of ~7MiB/s).
>
> > Another option would be to enable vmscan tracepoints but let's try with
> > stats first.
>
> Bruno

-- 
Bruno Prémont <bruno.premont@xxxxxxxxxx>
Ingénieur système et développements

Fondation RESTENA
2, avenue de l'Université
L-4365 Esch/Alzette

Tél: (+352) 424409
Fax: (+352) 422473

https://www.restena.lu https://www.dns.lu
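
PS: For anyone wanting to reproduce the 1s snapshots discussed above, here
is a minimal sketch in Python. It is not the script that produced the
attached collection. It assumes a cgroup v2 directory containing
memory.current and memory.stat; the function names and the timestamp format
are made up for illustration. It also computes the pgsteal/pgscan ratio
that was quoted as reclaim effectiveness.

```python
import time
from pathlib import Path


def parse_memory_stat(text):
    """Parse the 'key value' lines of a cgroup v2 memory.stat file into a dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def reclaim_effectiveness(stats):
    """Return pgsteal / pgscan, or None if nothing has been scanned yet."""
    scanned = stats.get("pgscan", 0)
    if scanned == 0:
        return None
    return stats.get("pgsteal", 0) / scanned


def snapshot_line(cgroup_dir, now=None):
    """Format one sample: timestamp, memory.current, pgscan, pgsteal."""
    cg = Path(cgroup_dir)
    current = int((cg / "memory.current").read_text())
    stats = parse_memory_stat((cg / "memory.stat").read_text())
    # Timestamp format is an arbitrary choice for this sketch.
    ts = time.strftime("%Y-%m-%dT%H:%M:%S", time.localtime(now))
    return f"{ts} {current} {stats.get('pgscan', 0)} {stats.get('pgsteal', 0)}"


def sample(cgroup_dir, count, interval=1.0):
    """Print `count` snapshot lines, sleeping `interval` seconds between them."""
    for i in range(count):
        print(snapshot_line(cgroup_dir), flush=True)
        if i + 1 < count:
            time.sleep(interval)
```

Calling sample() with the path of the backup cgroup and a large count,
started before the backup, yields one line per second that can be compared
against the stall timestamps. With the two counters quoted above,
reclaim_effectiveness() gives a ratio between 0.94 and 0.95.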