Re: mm: Can we bail out p?d_alloc() loops upon SIGKILL?

On 2019/02/27 18:21, Michal Hocko wrote:
> On Wed 27-02-19 12:43:51, Tetsuo Handa wrote:
>> I noticed that when a kdump kernel triggers the OOM killer because a too
>> small value was given to crashkernel= parameter, the OOM reaper tends to
>> fail to reclaim memory from OOM victims because they are in dup_mm() from
>> copy_mm() from copy_process() with mmap_sem held for write.
> 
> I would presume that a page table allocation would fail for the oom
> victim as soon as the oom memory reserves get depleted and then
> copy_page_range would bail out and release the lock. That being
> said, the oom_reaper might bail out before then but does sprinkling
> fatal_signal_pending checks into copy_*_range really help reliably?
> 

Yes, I think so. The OOM victim was just sleeping at might_sleep_if()
rather than continuing to allocate until an ALLOC_OOM allocation failed.
Perhaps the fact that the kdump kernel enables only one CPU somehow
contributed to the OOM reaper giving up before an ALLOC_OOM allocation
failed. But if an OOM victim in a normal kernel had a huge memory mapping
for which p?d_alloc() is called very many times, and the kernel
frequently prevented the OOM victim from continuing its ALLOC_OOM
allocations, hitting this problem might not be rare (I don't have a huge
machine for testing an intensive p?d_alloc() loop).
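
To illustrate what a fatal_signal_pending() check inside the copy loop
would buy us, here is a rough user-space sketch. It is not kernel code:
struct copy_task, its fields, and sim_copy_range() are hypothetical
names invented for this illustration, and the "SIGKILL" is simply
injected at a chosen iteration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a task; NOT the kernel's task_struct. */
struct copy_task {
    bool fatal_signal;  /* models what fatal_signal_pending(current) would report */
    size_t kill_at;     /* iteration at which the simulated SIGKILL arrives */
};

/*
 * Models a copy_*_range()-style loop over nr_entries page-table entries.
 * Returns how many entries were processed before the loop bailed out.
 */
size_t sim_copy_range(struct copy_task *task, size_t nr_entries)
{
    size_t done = 0;

    assert(task != NULL);
    for (size_t i = 0; i < nr_entries; i++) {
        if (i == task->kill_at)
            task->fatal_signal = true;  /* simulated SIGKILL delivery */
        if (task->fatal_signal)
            break;  /* bail out early so the caller can drop the lock */
        done++;     /* stands in for one p?d_alloc() plus the copy */
    }
    return done;
}
```

With kill_at = 3 and nr_entries = 10, the loop stops after three
entries instead of grinding through all ten, which is the window in
which the lock holder would otherwise keep the OOM reaper waiting.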

Technically, it would be possible to add a per-task_struct flag
that lets __alloc_pages_nodemask() check early and bail out:

  down_write(&current->mm->mmap_sem);
  current->no_oom_alloc = 1;
  while (...) {
      p?d_alloc();
  }
  current->no_oom_alloc = 0;
  up_write(&current->mm->mmap_sem);
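
A user-space sketch of how such a flag might behave, under stated
assumptions: struct alloc_task, sim_alloc() and sim_populate() are
hypothetical names, malloc() stands in for the page allocator, and the
bail-out condition (flag set and task already chosen as an OOM victim)
is my guess at what the early check would test, not something the
snippet above specifies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the relevant bits of a task. */
struct alloc_task {
    bool no_oom_alloc;  /* flag name taken from the snippet above */
    bool oom_victim;    /* models "selected by the OOM killer" (assumption) */
};

/* Stand-in allocator: fails immediately instead of dipping into reserves. */
void *sim_alloc(const struct alloc_task *task, size_t size)
{
    assert(task != NULL);
    if (task->no_oom_alloc && task->oom_victim)
        return NULL;  /* early bail-out requested by the caller */
    return malloc(size);
}

/*
 * Models the p?d_alloc() loop run with the flag set.
 * The task becomes an OOM victim at iteration victim_at.
 * Returns the number of allocations that succeeded.
 */
size_t sim_populate(struct alloc_task *task, size_t nr, size_t victim_at)
{
    size_t done = 0;

    task->no_oom_alloc = true;
    for (size_t i = 0; i < nr; i++) {
        if (i == victim_at)
            task->oom_victim = true;
        void *p = sim_alloc(task, 64);
        if (p == NULL)
            break;  /* allocation refused: stop and release the lock */
        free(p);    /* the sketch does not keep the memory */
        done++;
    }
    task->no_oom_alloc = false;
    return done;
}
```

The flag is cleared on every exit path of sim_populate(), mirroring the
set/clear pair around the loop in the snippet above.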