Re: [PATCH 1/2] mm,oom_reaper: Show trace of unable to reap victim thread.

On Tue 20-03-18 20:57:55, Tetsuo Handa wrote:
> I found that it is not difficult to hit "oom_reaper: unable to reap pid:"
> messages if the victim thread is doing copy_process(). Since I noticed
> that it is likely helpful to show trace of unable to reap victim thread
> for finding locations which should use killable wait, this patch does so.
> 
> [  226.608508] oom_reaper: unable to reap pid:9261 (a.out)
> [  226.611971] a.out           D13056  9261   6927 0x00100084
> [  226.615879] Call Trace:
> [  226.617926]  ? __schedule+0x25f/0x780
> [  226.620559]  schedule+0x2d/0x80
> [  226.623356]  rwsem_down_write_failed+0x2bb/0x440
> [  226.626426]  ? rwsem_down_write_failed+0x55/0x440
> [  226.629458]  ? anon_vma_fork+0x124/0x150
> [  226.632679]  call_rwsem_down_write_failed+0x13/0x20
> [  226.635884]  down_write+0x49/0x60
> [  226.638867]  ? copy_process.part.41+0x12f2/0x1fe0
> [  226.642042]  copy_process.part.41+0x12f2/0x1fe0 /* i_mmap_lock_write() in dup_mmap() */
> [  226.645087]  ? _do_fork+0xe6/0x560
> [  226.647991]  _do_fork+0xe6/0x560
> [  226.650495]  ? syscall_trace_enter+0x1a9/0x240
> [  226.653443]  ? retint_user+0x18/0x18
> [  226.656601]  ? page_fault+0x2f/0x50
> [  226.659159]  ? trace_hardirqs_on_caller+0x11f/0x1b0
> [  226.662399]  do_syscall_64+0x74/0x230
> [  226.664989]  entry_SYSCALL_64_after_hwframe+0x42/0xb7

A single stack trace in the changelog would be sufficient IMHO.
Apart from that, what do you expect users will do about this trace?
Sure, they will see a path which holds mmap_sem and we will get a bug
report, but we can hardly do anything about that. We simply cannot drop
the lock from that path in 99% of situations. So _why_ do we want to add
more information to the log?

[...]

> Signed-off-by: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
> Cc: Michal Hocko <mhocko@xxxxxxxx>
> Cc: David Rientjes <rientjes@xxxxxxxxxx>
> ---
>  mm/oom_kill.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 5336985..900300c 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -41,6 +41,7 @@
>  #include <linux/kthread.h>
>  #include <linux/init.h>
>  #include <linux/mmu_notifier.h>
> +#include <linux/sched/debug.h>
>  
>  #include <asm/tlb.h>
>  #include "internal.h"
> @@ -596,6 +597,7 @@ static void oom_reap_task(struct task_struct *tsk)
>  
>  	pr_info("oom_reaper: unable to reap pid:%d (%s)\n",
>  		task_pid_nr(tsk), tsk->comm);
> +	sched_show_task(tsk);
>  	debug_show_all_locks();
>  
>  done:
> -- 
> 1.8.3.1
> 

-- 
Michal Hocko
SUSE Labs



