On 8/21/18 8:49 AM, Michal Hocko wrote:
> On Tue 21-08-18 02:36:05, Marinko Catovic wrote:
> [...]
>>>> Well, there are some drivers (mostly out-of-tree) which are high order
>>>> hungry. You can try to trace all allocations with order > 0 and
>>>> see who that might be.
>>>> # mount -t tracefs none /debug/trace/
>>>> # echo stacktrace > /debug/trace/trace_options
>>>> # echo "order>0" > /debug/trace/events/kmem/mm_page_alloc/filter
>>>> # echo 1 > /debug/trace/events/kmem/mm_page_alloc/enable
>>>> # cat /debug/trace/trace_pipe
>>>>
>>>> And later, this to disable tracing:
>>>> # echo 0 > /debug/trace/events/kmem/mm_page_alloc/enable
>>>
>>> I just had a major cache-useless situation, with only ~100M/8G usage
>>> and horrible performance. There you go:
>>>
>>> https://nofile.io/f/mmwVedaTFsd
>
> $ grep mm_page_alloc: trace_pipe | sed 's@.*order=\([0-9]*\) .*gfp_flags=\(.*\)@\1 \2@' | sort | uniq -c
>     428 1 __GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_THISNODE
>      10 1 __GFP_HIGH|__GFP_ATOMIC|__GFP_NOWARN|__GFP_COMP|__GFP_THISNODE
>       6 1 __GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_COMP|__GFP_THISNODE
>    3061 1 GFP_KERNEL_ACCOUNT|__GFP_ZERO
>    8672 1 GFP_NOWAIT|__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_ACCOUNT
>    2547 1 __GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_THISNODE
>       4 2 __GFP_HIGH|__GFP_ATOMIC|__GFP_NOWARN|__GFP_COMP|__GFP_THISNODE
>       5 2 __GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_COMP|__GFP_THISNODE
>   20030 2 GFP_NOWAIT|__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_ACCOUNT
>    1528 3 GFP_ATOMIC|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC
>    2476 3 GFP_NOWAIT|__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP
>    6512 3 GFP_NOWAIT|__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_ACCOUNT
>     277 9 GFP_TRANSHUGE|__GFP_THISNODE
>
> This only covers ~90s of the allocator activity. Most of those requests
> are not triggering any reclaim (GFP_NOWAIT/ATOMIC). Vlastimil will
> know better but this might mean that we are not invoking kcompactd
> enough.

Earlier vmstat data showed that it is invoked, but responsible for less
than 1% of compaction activity.

> But considering that we have suspected that an overly eager
> reclaim triggers the page cache reduction I am not really sure the
> above matches that theory.

Yeah, the GFP_NOWAIT/GFP_ATOMIC allocations above shouldn't be
responsible for such overreclaim?

> Btw. I was probably not specific enough. This data should be collected
> _during_ the time when the page cache is disappearing. I suspect you
> have started collecting after the fact.

It might also be interesting to do the following in the problematic
state, instead of dropping caches:
- save a snapshot of /proc/vmstat and /proc/pagetypeinfo
- echo 1 > /proc/sys/vm/compact_memory
- save a new snapshot of /proc/vmstat and /proc/pagetypeinfo

That would show whether compaction is able to help at all; a rough
command sketch is at the end of this mail.

> Btw. vast majority of order-3 requests come from the network layer. Are
> you using a large MTU (jumbo packets)?
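
For completeness, the snapshot procedure above in command form, run as
root while the machine is in the bad state (the output file names here
are just examples):

# cat /proc/vmstat > vmstat.before
# cat /proc/pagetypeinfo > pagetypeinfo.before
# echo 1 > /proc/sys/vm/compact_memory
# cat /proc/vmstat > vmstat.after
# cat /proc/pagetypeinfo > pagetypeinfo.after

Comparing the before/after snapshots (the compact_* counters in
/proc/vmstat and the per-order free page counts in /proc/pagetypeinfo)
should tell us whether manual compaction can still produce high-order
free pages in that state.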