On Thu 02-11-17 16:01:20, Michal Hocko wrote:
> On Thu 02-11-17 22:15:06, Tetsuo Handa wrote:
> > I was waiting for Michal's comment, but it seems that he is too busy now.
> > Thus, I post a non-authoritative comment here. (I'm not a tracepoints user.)
> 
> yes, that is the case. Thanks for looking into this, Tetsuo.
> 
> > Two stalls were found in dmesg, but only the PID = 2798 part was recorded in the trace logs.
> > 
> > [ 6109.502115] chrome: page allocation stalls for 10321ms, order:0, mode:0x14000d2(GFP_TEMPORARY|__GFP_HIGHMEM), nodemask=(null)
> > [ 6109.502179] chrome cpuset=/ mems_allowed=0
> > [ 6109.502570] CPU: 0 PID: 2798 Comm: chrome Not tainted 4.13.9-300.fc27.x86_64+debug #1
> > 
> > I have only glanced through the trace data.
> 
> So, trying to analyze this one.
> 
> > Since a 10-second gap was found between mm_shrink_slab_start and
> > mm_shrink_slab_end, this alone can cause stall warning messages.
> > 
> > # tracer: nop
> > #
> > #                              _-----=> irqs-off
> > #                             / _----=> need-resched
> > #                            | / _---=> hardirq/softirq
> > #                            || / _--=> preempt-depth
> > #                            ||| /     delay
> > #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
> > #              | |       |   ||||       |         |
> >           chrome-2798 [000] .N.1  6099.188540: mm_shrink_slab_start: super_cache_scan+0x0/0x1b0 ffff8eefa4651830: nid: 0 objects to shrink 5895 gfp_flags GFP_TEMPORARY|__GFP_HIGHMEM pgs_scanned 90 lru_pgs 6992959 cache items 5049 delta 0 total_scan 2524
> >           chrome-2798 [000] ...1  6109.494205: mm_shrink_slab_end: super_cache_scan+0x0/0x1b0 ffff8eefa4651830: nid: 0 unused scan count 5895 new scan count 941785 total_scan 476 last shrinker return val 1959
> 
> Yeah, the direct reclaim has started
>           chrome-2798 [000] ...1  6099.187991: mm_vmscan_direct_reclaim_begin: order=0 may_writepage=1 gfp_flags=GFP_TEMPORARY|__GFP_HIGHMEM classzone_idx=2
> and finished
>           chrome-2798 [000] ...1  6109.509445: mm_vmscan_direct_reclaim_end: nr_reclaimed=51
> 
> The only notable hole in the logging is the one Tetsuo pointed out.
> There is a lot of activity on that CPU during that time wrt. reclaim:
> $ grep -v '\-2798' trace.txt | grep '\[000\]' | awk '{val=$4+0; if (val > 6099 && val < 6109) print}' | wc -l
> 744
> 
> And there were more processes involved:
> $ grep -v '\-2798' trace.txt | grep '\[000\]' | awk '{val=$4+0; if (val > 6099 && val < 6109) print $1}' | sort | uniq -c
>      74 <...>-10654
>      43 <...>-13862
>      82 <...>-17624
>       2 <...>-27318
>       1 <...>-3518
>      37 <...>-5331
>     180 <...>-6602
>      38 chrome-3482
>      40 Chrome_IOThread-2773
>       3 DedicatedWorker-19604
>       1 gmain-10668
>     139 qemu-system-x86-13763
>     104 TaskSchedulerBa-6011

So I agree that it looks like your system is overloaded. That being
said, it makes sense to double check whether we cond_resched
sufficiently, because being stuck for 10s while only a handful of
processes are reclaiming still sounds like too much. Maybe there is
more than just the reclaim going on. This would require a deeper
inspection of the trace data and maybe even gathering other
tracepoints. I am sorry, but I cannot promise I will find time to look
at this anytime soon.
-- 
Michal Hocko
SUSE Labs
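As a rough starting point for the deeper inspection suggested above, the
same trace.txt can be post-processed to pair mm_shrink_slab_start /
mm_shrink_slab_end events per task and print the slow shrinkers directly.
The awk sketch below is only an illustration, not something from the
thread: it assumes the field layout visible in the quoted trace (TASK-PID
in field 1, the timestamp ending in ':' in field 4, the tracepoint name
in field 5, the shrinker callback in field 6) and an arbitrary 1-second
reporting threshold.

$ awk '
    # remember when this task entered a shrinker, and which one
    $5 == "mm_shrink_slab_start:" { start[$1] = $4 + 0; fn[$1] = $6 }
    # on the matching end event, report anything that took longer than 1s
    $5 == "mm_shrink_slab_end:" && ($1 in start) {
        d = ($4 + 0) - start[$1]
        if (d > 1.0)
            printf "%-16s %-28s %7.3fs\n", $1, fn[$1], d
        delete start[$1]
    }' trace.txt

Keying on the TASK-PID field alone should be enough because a given task
runs shrinkers one after another, so its start/end events should not
interleave. For the two events quoted above this would report
chrome-2798 super_cache_scan+0x0/0x1b0 10.306s, i.e. the hole Tetsuo
pointed out.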