Re: [PATCH] mm/page_owner: print largest memory consumer when OOM panic occurs

> On Dec 23, 2019, at 6:33 AM, Miles Chen <miles.chen@xxxxxxxxxxxx> wrote:
> 
> Motivation:
> -----------
> 
> When debugging an OOM kernel panic, it is difficult to determine how
> much memory was allocated by kernel drivers or vmalloc() by checking
> the Mem-Info or Node/Zone info. For example:
> 
>  Mem-Info:
>  active_anon:5144 inactive_anon:16120 isolated_anon:0
>   active_file:0 inactive_file:0 isolated_file:0
>   unevictable:0 dirty:0 writeback:0 unstable:0
>   slab_reclaimable:739 slab_unreclaimable:442469
>   mapped:534 shmem:21050 pagetables:21 bounce:0
>   free:14808 free_pcp:3389 free_cma:8128
> 
>  Node 0 active_anon:20576kB inactive_anon:64480kB active_file:0kB
>  inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB
>  mapped:2136kB dirty:0kB writeback:0kB shmem:84200kB shmem_thp: 0kB
>  shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB
>  all_unreclaimable? yes
> 
>  Node 0 DMA free:14476kB min:21512kB low:26888kB high:32264kB
>  reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB
>  active_file: 0kB inactive_file:0kB unevictable:0kB writepending:0kB
>  present:1048576kB managed:952736kB mlocked:0kB kernel_stack:0kB
>  pagetables:0kB bounce:0kB free_pcp:2716kB local_pcp:0kB free_cma:0kB
> 
> The information above tells us the memory usage of the known memory
> categories, and we can check them for abnormally large numbers. However,
> if a memory leak cannot be observed in the categories above, we need to
> reproduce the issue with CONFIG_PAGE_OWNER.
> 
> It is possible to read the page owner information from coredump files.
> However, coredump files may not always be available, so my approach is
> to print out the largest page consumer when an OOM kernel panic occurs.

Many patches that help debug special cases like this have been shot down in the past, and I don’t see much difference this time. If you are worried about a memory leak, enable kmemleak and then reproduce the issue. Otherwise, we will end up with too many heuristics just for debugging.

> 
> The heuristic approach assumes that the OOM kernel panic is caused by
> a single backtrace. The assumption is not always true, but it held in
> many cases during our tests.
> 
> We have tested this heuristic approach since 2019/5 on Android devices.
> In 38 internal OOM kernel panic reports:
> 
> 31/38: can be analyzed by using existing information
> 7/38: need page owner information; the heuristic approach in this patch
> prints the correct backtraces of the abnormal memory allocations, with
> no need to reproduce the issues.
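
For readers following the thread, the heuristic quoted above (find the single backtrace that owns the most pages) can be illustrated with a minimal userspace sketch. This is not the code from the patch. The types `alloc_rec` and `consumer`, the function `largest_consumer`, and the idea of identifying a backtrace by an integer handle are all assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical record: one allocation tagged with the backtrace that made it. */
struct alloc_rec {
    unsigned int handle; /* assumed integer id of the allocating backtrace */
    unsigned int order;  /* the allocation covers 2^order pages */
};

/* Hypothetical result: a backtrace id and the total pages it owns. */
struct consumer {
    unsigned int handle;
    unsigned long pages;
};

static int cmp_handle(const void *a, const void *b)
{
    const struct alloc_rec *x = a;
    const struct alloc_rec *y = b;

    return (x->handle > y->handle) - (x->handle < y->handle);
}

/*
 * Sort the records by handle (in place), sum the pages of each run of
 * equal handles, and return the handle with the largest total.
 * Returns {0, 0} when there are no records.
 */
struct consumer largest_consumer(struct alloc_rec *recs, size_t n)
{
    struct consumer best = { 0, 0 };
    struct consumer cur;
    size_t i;

    assert(n == 0 || recs != NULL);
    if (n == 0)
        return best;

    qsort(recs, n, sizeof(*recs), cmp_handle);

    cur.handle = recs[0].handle;
    cur.pages = 0;
    for (i = 0; i < n; i++) {
        if (recs[i].handle != cur.handle) {
            /* A run of equal handles ended: keep it if it is the largest. */
            if (cur.pages > best.pages)
                best = cur;
            cur.handle = recs[i].handle;
            cur.pages = 0;
        }
        cur.pages += 1UL << recs[i].order;
    }
    if (cur.pages > best.pages)
        best = cur;

    return best;
}
```

The sketch only reports one winner, which mirrors the single-backtrace assumption stated in the patch description: if a leak is spread across several backtraces, the largest total may not point at the culprit.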




