On Tue 01-03-16 19:14:08, Vlastimil Babka wrote:
> On 03/01/2016 02:38 PM, Michal Hocko wrote:
[...]
> >that means that compaction is even not tried in half cases! This
> >doesn't sounds right to me, especially when we are talking about
> ><= PAGE_ALLOC_COSTLY_ORDER requests which are implicitly nofail, because
> >then we simply rely on the order-0 reclaim to automagically form higher
> >blocks. This might indeed work when we retry many times but I guess this
> >is not a good approach. It leads to a excessive reclaim and the stall
> >for allocation can be really large.
> >
> >One of the suspicious places is __compaction_suitable which does order-0
> >watermark check (increased by 2<<order). I have put another trace_printk
> >there and it clearly pointed out this was the case.
>
> Yes, compaction is historically quite careful to avoid making low memory
> conditions worse, and to prevent work if it doesn't look like it can
> ultimately succeed the allocation (so having not enough base pages means
> that compacting them is considered pointless).

The compaction is running in PF_MEMALLOC context so it shouldn't fail
the allocation. Moreover, the additional memory is only needed temporarily,
until the migration finishes. Or am I missing something?

> This aspect of preventing non-zero-order OOMs is somewhat unexpected
> :)

I hope we can do something about it then...

[...]
> >this is worse because we have scanned more pages for migration but the
> >overall success rate was much smaller and the direct reclaim was invoked
> >more. I do not have a good theory for that and will play with this some
> >more. Maybe other changes are needed deeper in the compaction code.
>
> I was under impression that similar checks to compaction_suitable() were
> done also in compact_finished(), to stop compacting if memory got low due to
> parallel activity. But I guess it was a patch from Joonsoo that didn't get
> merged.
>
> My only other theory so far is that watermark checks fail in
> __isolate_free_page() when we want to grab page(s) as migration targets.

Yes, this certainly contributes to the problem, and it triggered a lot in
my case:
$ grep __isolate_free_page trace.log | wc -l
181
$ grep __alloc_pages_direct_compact: trace.log | wc -l
7

> I would suggest enabling all compaction tracepoint and the migration
> tracepoint. Looking at the trace could hopefully help faster than
> going one trace_printk() per attempt.

OK, here we go with both watermark checks removed and hopefully all the
compaction-related tracepoints enabled:
echo 1 > /debug/tracing/events/compaction/enable
echo 1 > /debug/tracing/events/migrate/mm_migrate_pages/enable

This was without the hugetlb handicap. See the attached trace log and the
vmstat taken after the run.

Thanks
-- 
Michal Hocko
SUSE Labs
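The two grep | wc -l counts above can also be gathered in a single pass
over the trace. The following is only a minimal sketch: count_markers is a
hypothetical helper name, and it assumes (as grep does) that a plain
substring match per line is enough to attribute a line to a function.

```python
from collections import Counter


def count_markers(lines, markers):
    """Count, per marker, how many lines contain it (like grep MARKER | wc -l)."""
    counts = Counter({marker: 0 for marker in markers})
    for line in lines:
        for marker in markers:
            if marker in line:
                counts[marker] += 1
    return counts


# Possible usage against the attached trace (file name taken from the mail):
#   with open("trace.log") as trace:
#       print(count_markers(trace, ["__isolate_free_page",
#                                   "__alloc_pages_direct_compact:"]))
```

Each line is counted at most once per marker, so the numbers should match
what the grep pipelines report.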
nr_free_pages 151306
nr_alloc_batch 123
nr_inactive_anon 12815
nr_active_anon 44507
nr_inactive_file 1160
nr_active_file 5910
nr_unevictable 0
nr_mlock 0
nr_anon_pages 232
nr_mapped 1025
nr_file_pages 64246
nr_dirty 2
nr_writeback 0
nr_slab_reclaimable 12344
nr_slab_unreclaimable 21129
nr_page_table_pages 260
nr_kernel_stack 90
nr_unstable 0
nr_bounce 0
nr_vmscan_write 362270
nr_vmscan_immediate_reclaim 43
nr_writeback_temp 0
nr_isolated_anon 0
nr_isolated_file 0
nr_shmem 54592
nr_dirtied 5363
nr_written 364001
nr_pages_scanned 0
workingset_refault 16574
workingset_activate 9062
workingset_nodereclaim 640
nr_anon_transparent_hugepages 0
nr_free_cma 0
nr_dirty_threshold 31188
nr_dirty_background_threshold 15594
pgpgin 564127
pgpgout 1457932
pswpin 85569
pswpout 362180
pgalloc_dma 226916
pgalloc_dma32 21472873
pgalloc_normal 0
pgalloc_movable 0
pgfree 22057596
pgactivate 174766
pgdeactivate 919764
pgfault 23950701
pgmajfault 31819
pglazyfreed 0
pgrefill_dma 15589
pgrefill_dma32 999305
pgrefill_normal 0
pgrefill_movable 0
pgsteal_kswapd_dma 5339
pgsteal_kswapd_dma32 322951
pgsteal_kswapd_normal 0
pgsteal_kswapd_movable 0
pgsteal_direct_dma 334
pgsteal_direct_dma32 71877
pgsteal_direct_normal 0
pgsteal_direct_movable 0
pgscan_kswapd_dma 11213
pgscan_kswapd_dma32 653096
pgscan_kswapd_normal 0
pgscan_kswapd_movable 0
pgscan_direct_dma 670
pgscan_direct_dma32 137488
pgscan_direct_normal 0
pgscan_direct_movable 0
pgscan_direct_throttle 0
pginodesteal 0
slabs_scanned 1920
kswapd_inodesteal 0
kswapd_low_wmark_hit_quickly 351
kswapd_high_wmark_hit_quickly 13
pageoutrun 458
allocstall 1376
pgrotated 360480
drop_pagecache 0
drop_slab 0
pgmigrate_success 204875
pgmigrate_fail 169
compact_migrate_scanned 343087
compact_free_scanned 3597902
compact_isolated 412234
compact_stall 163
compact_fail 99
compact_success 64
compact_kcompatd_wake 2
htlb_buddy_alloc_success 0
htlb_buddy_alloc_fail 0
unevictable_pgs_culled 1089
unevictable_pgs_scanned 0
unevictable_pgs_rescued 1561
unevictable_pgs_mlocked 1561
unevictable_pgs_munlocked 1561
unevictable_pgs_cleared 0
unevictable_pgs_stranded 0
thp_fault_alloc 152
thp_fault_fallback 39
thp_collapse_alloc 69
thp_collapse_alloc_failed 11
thp_split_page 1
thp_split_page_failed 0
thp_deferred_split_page 212
thp_split_pmd 10
thp_zero_page_alloc 2
thp_zero_page_alloc_failed 1
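
To make the compaction counters in a dump like the one above easier to
compare between runs, a few ratios can be derived from them. This is a
rough sketch only: parse_vmstat and compaction_summary are hypothetical
helper names, and the choice of ratios is an assumption made for
illustration, not something proposed in the thread.

```python
def parse_vmstat(text):
    """Parse whitespace-separated 'name value' pairs, as found in /proc/vmstat."""
    tokens = text.split()
    return {name: int(value) for name, value in zip(tokens[0::2], tokens[1::2])}


def compaction_summary(stats):
    """Derive a few illustrative ratios; a missing counter is treated as 0."""
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else 0.0

    migrated = stats.get("pgmigrate_success", 0)
    failed = stats.get("pgmigrate_fail", 0)
    return {
        # Share of direct compaction stalls that ended in success.
        "compact_success_ratio": ratio(stats.get("compact_success", 0),
                                       stats.get("compact_stall", 0)),
        # Share of page migration attempts that succeeded.
        "migrate_success_ratio": ratio(migrated, migrated + failed),
        # Free-scanner work relative to migrate-scanner work.
        "free_to_migrate_scan_ratio": ratio(stats.get("compact_free_scanned", 0),
                                            stats.get("compact_migrate_scanned", 0)),
    }


if __name__ == "__main__":
    # Counters copied from the vmstat dump in this mail.
    sample = """
    pgmigrate_success 204875 pgmigrate_fail 169
    compact_migrate_scanned 343087 compact_free_scanned 3597902
    compact_stall 163 compact_fail 99 compact_success 64
    """
    for name, value in compaction_summary(parse_vmstat(sample)).items():
        print(f"{name}: {value:.3f}")
```

The parser accepts both the one-pair-per-line layout of /proc/vmstat and a
flattened, space-separated copy, since it only relies on token order.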
Attachment: trace.log.gz (application/gzip)