On Wed 29-08-18 17:13:59, Marinko Catovic wrote:
> >
> > trace data which starts _before_ the cache dropdown starts and while it
> > is decreasing should be the first step. Ideally along with /proc/vmstat
> > gathered at the same time. I am pretty sure you have some high order
> > memory consumer which forces the reclaim and we over reclaim. Last data
> > was not really conclusive as it didn't really capture the dropdown
> > IIRC.
> >
>
> with before you mean in a totally healthy state?

yep

> as I can not tell when decreasing starts this would mean collecting data
> over days perhaps. however, I have no issue with that.

yeah, you can pipe the trace buffer to gzip and reduce the output
considerably.

> As I do not want to miss anything that might help you, could you please
> provide the commands for all the data you require?

Use the same set of commands for tracing I have provided earlier + add
the compression

	cat /debug/trace/trace_pipe | gzip > file.gz

+ the loop to gather vmstat

	while true
	do
		cp /proc/vmstat vmstat.$(date +%s)
		sleep 5s
	done

> one host is at a healthy state right now, I'd run that over there immediately.

Let's see what we can get from here.
-- 
Michal Hocko
SUSE Labs
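
For a run that may last days, the two collection steps above could be combined into one script. This is only a minimal sketch: it assumes tracing is already set up and the trace pipe is readable at /debug/trace/trace_pipe as in the command above. The variable names (TRACE_PIPE, VMSTAT, OUTDIR, INTERVAL), the function names and the "run" argument are illustrative and not part of the original instructions.

```shell
#!/bin/sh
# Sketch: compress the trace pipe in the background and snapshot
# /proc/vmstat at a fixed interval. All names below are illustrative.
TRACE_PIPE=${TRACE_PIPE:-/debug/trace/trace_pipe}
VMSTAT=${VMSTAT:-/proc/vmstat}
OUTDIR=${OUTDIR:-.}
INTERVAL=${INTERVAL:-5}

snapshot_vmstat() {
    # Copy the current counters to a file named after the epoch timestamp.
    cp "$VMSTAT" "$OUTDIR/vmstat.$(date +%s)"
}

collect() {
    # Stream the trace buffer through gzip in the background.
    gzip < "$TRACE_PIPE" > "$OUTDIR/trace.gz" &
    trace_pid=$!
    # Stop the background reader when the script is interrupted.
    trap 'kill "$trace_pid" 2>/dev/null; exit 0' INT TERM
    while true
    do
        snapshot_vmstat
        sleep "$INTERVAL"
    done
}

# Start collecting only when asked to, so the functions can also be
# sourced and tried out one at a time.
if [ "${1:-}" = "run" ]; then
    collect
fi
```

Saved under a hypothetical name such as collect.sh, it would be started with `sh collect.sh run` from the directory that should receive the output, and stopped with Ctrl-C once the dropdown has been captured.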