Re: [PATCH 0/3] OOM detection rework v4

Michal Hocko <mhocko@xxxxxxxxxx> · Wed, 2 Mar 2016 13:24:11 +0100

On Tue 01-03-16 19:14:08, Vlastimil Babka wrote:
> On 03/01/2016 02:38 PM, Michal Hocko wrote:
[...]
> >that means that compaction is not even tried in half of the cases! This
> >doesn't sound right to me, especially when we are talking about
> ><= PAGE_ALLOC_COSTLY_ORDER requests which are implicitly nofail, because
> >then we simply rely on the order-0 reclaim to automagically form higher
> >order blocks. This might indeed work when we retry many times but I guess
> >this is not a good approach. It leads to excessive reclaim and the
> >allocation stall can be really large.
> >
> >One of the suspicious places is __compaction_suitable, which does an order-0
> >watermark check (increased by 2<<order). I have put another trace_printk
> >there and it clearly pointed out this was the case.
> 
> Yes, compaction is historically quite careful to avoid making low memory
> conditions worse, and to prevent work if it doesn't look like it can
> ultimately succeed the allocation (so having not enough base pages means
> that compacting them is considered pointless).

Compaction runs in PF_MEMALLOC context, so it shouldn't fail the
allocation. Moreover, the additional memory is only needed temporarily,
until the migration finishes. Or am I missing something?
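
For reference, the check under discussion boils down to roughly the
following. This is only a sketch in shell arithmetic, assuming compaction
is skipped unless the free pages exceed the low watermark plus 2<<order.
The function name and its arguments are made up for illustration, and the
real zone_watermark_ok() also takes lowmem reserves and alloc flags into
account.

```shell
#!/bin/sh
# Rough model of the order-0 watermark check in __compaction_suitable().
# Assumption: compaction only proceeds when
#   free_pages > low_wmark + (2 << order)
# Lowmem reserves and alloc flags are deliberately ignored here.
compaction_suitable_sketch() {
    free_pages=$1
    low_wmark=$2
    order=$3
    # extra head room requested on top of the low watermark
    needed=$(( low_wmark + (2 << order) ))
    if [ "$free_pages" -gt "$needed" ]; then
        echo continue
    else
        echo skipped
    fi
}
```

With hypothetical numbers, `compaction_suitable_sketch 905 900 2` reports
"skipped" because an order-2 request needs more than 900 + 8 free pages.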

> This aspect of preventing non-zero-order OOMs is somewhat unexpected
> :)

I hope we can do something about it then...

[...]
> >this is worse because we have scanned more pages for migration but the
> >overall success rate was much smaller and the direct reclaim was invoked
> >more. I do not have a good theory for that and will play with this some
> >more. Maybe other changes are needed deeper in the compaction code.
> 
> I was under the impression that similar checks to compaction_suitable() were
> done also in compact_finished(), to stop compacting if memory got low due to
> parallel activity. But I guess it was a patch from Joonsoo that didn't get
> merged.
> 
> My only other theory so far is that watermark checks fail in
> __isolate_free_page() when we want to grab page(s) as migration targets.

Yes, this certainly contributes to the problem, and it triggered a lot in
my case:
$ grep __isolate_free_page trace.log | wc -l
181
$ grep __alloc_pages_direct_compact: trace.log | wc -l
7
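
The two counts above can also be collected in one go. A small sketch,
assuming trace.log is a plain-text trace dump in which every trace_printk
line carries the name of the function it was emitted from; count_hits is
just a helper name made up here.

```shell
#!/bin/sh
# Print "<pattern> <number of matching lines>" for each pattern given.
# $1 is the log file, the remaining arguments are fixed strings to look for.
count_hits() {
    log=$1
    shift
    for pattern in "$@"; do
        # grep -c prints the count of matching lines (0 when nothing matches),
        # -F treats the pattern as a fixed string rather than a regex
        n=$(grep -cF -- "$pattern" "$log")
        printf '%s %s\n' "$pattern" "$n"
    done
}
```

Usage would be along the lines of
`count_hits trace.log __isolate_free_page __alloc_pages_direct_compact:`.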

> I would suggest enabling all compaction tracepoints and the migration
> tracepoint. Looking at the trace could hopefully help faster than
> going one trace_printk() per attempt.

OK, here we go with both watermark checks removed and hopefully all the
compaction-related tracepoints enabled:
echo 1 > /debug/tracing/events/compaction/enable
echo 1 > /debug/tracing/events/migrate/mm_migrate_pages/enable
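
A slightly more defensive variant of the two commands above, as a sketch.
It assumes the tracing directory is reachable under one of the usual mount
points (/debug/tracing, /sys/kernel/debug/tracing or /sys/kernel/tracing);
the function name and the optional root argument are made up for
illustration.

```shell
#!/bin/sh
# Enable the compaction event group and the mm_migrate_pages event.
# $1 (optional): tracing root; if omitted, probe a few common mount points.
enable_compaction_tracing() {
    root=$1
    if [ -z "$root" ]; then
        for candidate in /debug/tracing /sys/kernel/debug/tracing /sys/kernel/tracing; do
            if [ -d "$candidate/events" ]; then
                root=$candidate
                break
            fi
        done
    fi
    if [ -z "$root" ] || [ ! -d "$root/events" ]; then
        echo "tracing directory not found" >&2
        return 1
    fi
    # same two enable files as written by hand above
    for ev in compaction migrate/mm_migrate_pages; do
        echo 1 > "$root/events/$ev/enable" || return 1
    done
}
```

Calling `enable_compaction_tracing` without arguments (as root) probes the
mount points listed above; passing a path uses that directory instead.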

This run was without the hugetlb handicap. See the attached trace log and
the vmstat output taken after the run.
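
For a quick read of the compaction counters in that vmstat output,
something like the following can be used. A sketch only, assuming the input
is in the "name value" format of /proc/vmstat; compact_summary is a made-up
helper name.

```shell
#!/bin/sh
# Summarize direct compaction attempts from a vmstat-style file.
# $1: file with one "name value" pair per line (e.g. /proc/vmstat).
compact_summary() {
    LC_ALL=C awk '
        $1 == "compact_stall"   { stall = $2 }
        $1 == "compact_fail"    { fail = $2 }
        $1 == "compact_success" { success = $2 }
        END {
            if (stall > 0)
                printf "stall=%d fail=%d success=%d success_rate=%.1f%%\n",
                       stall, fail, success, 100 * success / stall
            else
                print "no compaction stalls recorded"
        }
    ' "$1"
}
```

For the counters in this run (compact_stall 163, compact_fail 99,
compact_success 64) that works out to a success rate of 39.3%.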

Thanks
-- 
Michal Hocko
SUSE Labs
nr_free_pages 151306
nr_alloc_batch 123
nr_inactive_anon 12815
nr_active_anon 44507
nr_inactive_file 1160
nr_active_file 5910
nr_unevictable 0
nr_mlock 0
nr_anon_pages 232
nr_mapped 1025
nr_file_pages 64246
nr_dirty 2
nr_writeback 0
nr_slab_reclaimable 12344
nr_slab_unreclaimable 21129
nr_page_table_pages 260
nr_kernel_stack 90
nr_unstable 0
nr_bounce 0
nr_vmscan_write 362270
nr_vmscan_immediate_reclaim 43
nr_writeback_temp 0
nr_isolated_anon 0
nr_isolated_file 0
nr_shmem 54592
nr_dirtied 5363
nr_written 364001
nr_pages_scanned 0
workingset_refault 16574
workingset_activate 9062
workingset_nodereclaim 640
nr_anon_transparent_hugepages 0
nr_free_cma 0
nr_dirty_threshold 31188
nr_dirty_background_threshold 15594
pgpgin 564127
pgpgout 1457932
pswpin 85569
pswpout 362180
pgalloc_dma 226916
pgalloc_dma32 21472873
pgalloc_normal 0
pgalloc_movable 0
pgfree 22057596
pgactivate 174766
pgdeactivate 919764
pgfault 23950701
pgmajfault 31819
pglazyfreed 0
pgrefill_dma 15589
pgrefill_dma32 999305
pgrefill_normal 0
pgrefill_movable 0
pgsteal_kswapd_dma 5339
pgsteal_kswapd_dma32 322951
pgsteal_kswapd_normal 0
pgsteal_kswapd_movable 0
pgsteal_direct_dma 334
pgsteal_direct_dma32 71877
pgsteal_direct_normal 0
pgsteal_direct_movable 0
pgscan_kswapd_dma 11213
pgscan_kswapd_dma32 653096
pgscan_kswapd_normal 0
pgscan_kswapd_movable 0
pgscan_direct_dma 670
pgscan_direct_dma32 137488
pgscan_direct_normal 0
pgscan_direct_movable 0
pgscan_direct_throttle 0
pginodesteal 0
slabs_scanned 1920
kswapd_inodesteal 0
kswapd_low_wmark_hit_quickly 351
kswapd_high_wmark_hit_quickly 13
pageoutrun 458
allocstall 1376
pgrotated 360480
drop_pagecache 0
drop_slab 0
pgmigrate_success 204875
pgmigrate_fail 169
compact_migrate_scanned 343087
compact_free_scanned 3597902
compact_isolated 412234
compact_stall 163
compact_fail 99
compact_success 64
compact_kcompatd_wake 2
htlb_buddy_alloc_success 0
htlb_buddy_alloc_fail 0
unevictable_pgs_culled 1089
unevictable_pgs_scanned 0
unevictable_pgs_rescued 1561
unevictable_pgs_mlocked 1561
unevictable_pgs_munlocked 1561
unevictable_pgs_cleared 0
unevictable_pgs_stranded 0
thp_fault_alloc 152
thp_fault_fallback 39
thp_collapse_alloc 69
thp_collapse_alloc_failed 11
thp_split_page 1
thp_split_page_failed 0
thp_deferred_split_page 212
thp_split_pmd 10
thp_zero_page_alloc 2
thp_zero_page_alloc_failed 1
Attachment: trace.log.gz (application/gzip)