On Wed 22-11-23 12:46:44, gaoxu wrote:
> The function queue_oom_reaper tests and sets tsk->signal->oom_mm->flags.
> However, it is necessary to check if 'tsk' is an OOM victim before
> executing 'queue_oom_reaper' because the variable may be NULL.
>
> We encountered such an issue, and the log is as follows:
> [3701:11_see]Out of memory: Killed process 3154 (system_server)
> total-vm:23662044kB, anon-rss:0kB, file-rss:0kB, shmem-rss:0kB,
> UID:1000 pgtables:4056kB oom_score_adj:-900
> [3701:11_see][RB/E]rb_sreason_str_set: sreason_str set null_pointer
> [3701:11_see][RB/E]rb_sreason_str_set: sreason_str set unknown_addr

What are these?

> [3701:11_see]Unable to handle kernel NULL pointer dereference at virtual
> address 0000000000000328
> [3701:11_see]user pgtable: 4k pages, 39-bit VAs, pgdp=00000000821de000
> [3701:11_see][0000000000000328] pgd=0000000000000000,
> p4d=0000000000000000,pud=0000000000000000
> [3701:11_see]tracing off
> [3701:11_see]Internal error: Oops: 96000005 [#1] PREEMPT SMP
> [3701:11_see]Call trace:
> [3701:11_see] queue_oom_reaper+0x30/0x170

Could you resolve this offset into the code line, please?

> [3701:11_see] __oom_kill_process+0x590/0x860
> [3701:11_see] oom_kill_process+0x140/0x274
> [3701:11_see] out_of_memory+0x2f4/0x54c
> [3701:11_see] __alloc_pages_slowpath+0x5d8/0xaac
> [3701:11_see] __alloc_pages+0x774/0x800
> [3701:11_see] wp_page_copy+0xc4/0x116c
> [3701:11_see] do_wp_page+0x4bc/0x6fc
> [3701:11_see] handle_pte_fault+0x98/0x2a8
> [3701:11_see] __handle_mm_fault+0x368/0x700
> [3701:11_see] do_handle_mm_fault+0x160/0x2cc
> [3701:11_see] do_page_fault+0x3e0/0x818
> [3701:11_see] do_mem_abort+0x68/0x17c
> [3701:11_see] el0_da+0x3c/0xa0
> [3701:11_see] el0t_64_sync_handler+0xc4/0xec
> [3701:11_see] el0t_64_sync+0x1b4/0x1b8
> [3701:11_see]tracing off
>
> Signed-off-by: Gao Xu <gaoxu2@xxxxxxxxxxx>
> ---
> mm/oom_kill.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 9e6071fde..3754ab4b6 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -984,7 +984,7 @@ static void __oom_kill_process(struct task_struct *victim, const char *message)
>  	}
>  	rcu_read_unlock();
>
> -	if (can_oom_reap)
> +	if (can_oom_reap && tsk_is_oom_victim(victim))
>  		queue_oom_reaper(victim);

I do not understand. We always send SIGKILL and call
mark_oom_victim(victim) on the victim task before we get here. How can
tsk_is_oom_victim() ever be false?

>
>  	mmdrop(mm);
> --
> 2.17.1
>
>

-- 
Michal Hocko
SUSE Labs
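
For readers following the exchange, the question above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel's code. It assumes that mark_oom_victim() records the victim's mm in signal->oom_mm and that tsk_is_oom_victim() simply tests that pointer, which is consistent with the patch description saying queue_oom_reaper() touches tsk->signal->oom_mm->flags. The structs are cut-down stand-ins, and every model_* name and the bit number are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Cut-down stand-ins for the kernel structures; NOT the real definitions. */
struct mm_struct { unsigned long flags; };
struct signal_struct { struct mm_struct *oom_mm; };
struct task_struct {
	struct signal_struct *signal;
	struct mm_struct *mm;
};

/* Hypothetical bit number standing in for the "reap queued" mm flag. */
#define MODEL_OOM_REAP_QUEUED_BIT 0

/* Assumed behaviour of mark_oom_victim(): remember the victim's mm once. */
void model_mark_oom_victim(struct task_struct *tsk)
{
	if (!tsk->signal->oom_mm)
		tsk->signal->oom_mm = tsk->mm;
}

/* Assumed behaviour of tsk_is_oom_victim(): true once oom_mm is set. */
bool model_tsk_is_oom_victim(const struct task_struct *tsk)
{
	return tsk->signal->oom_mm != NULL;
}

/*
 * Models only the test-and-set described in the patch description.
 * It dereferences signal->oom_mm, so it must not run before marking.
 * Returns true if the task was newly queued, false if already queued.
 */
bool model_queue_oom_reaper(struct task_struct *tsk)
{
	unsigned long bit = 1UL << MODEL_OOM_REAP_QUEUED_BIT;
	unsigned long *flags = &tsk->signal->oom_mm->flags;
	bool was_set = (*flags & bit) != 0;

	*flags |= bit;
	return !was_set;
}
```

In this model, once model_mark_oom_victim() has run, model_tsk_is_oom_victim() cannot return false, so an extra check in front of model_queue_oom_reaper() never changes the outcome. That is the point of the question: if the real oops is a NULL oom_mm, something other than a missing mark would have to explain it, which is why the resolved code line matters.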