Re: [RFC PATCH 1/1] mm/hugetlb mm/oom_kill: Add support for reclaiming hugepages on OOM events.

On Mon 31-07-17 07:37:35, Matthew Wilcox wrote:
> On Mon, Jul 31, 2017 at 04:08:10PM +0200, Michal Hocko wrote:
> > On Mon 31-07-17 09:56:48, Liam R. Howlett wrote:
[...]
> > > My focus on hugetlb is that it can stop the automatic recovery of the
> > > system.
> > 
> > How?
> 
> Let me try to explain the situation as I understand it.
> 
> The customer has purchased a 128TB machine in order to run a database.
> They reserve 124TB of memory for use by the database cache.  Everything
> works great.  Then a 4TB memory module goes bad.  The machine reboots
> itself in order to return to operation, now having only 124TB of memory
> and having 124TB of memory reserved.  It OOMs during boot.  The current
> output from our OOM machinery doesn't point the sysadmin at the kernel
> command line parameter as now being the problem.  So they file a priority
> 1 problem ticket ...

Well, I would argue that the OOM report makes it quite clear that
hugetlb memory has consumed a large part, if not all, of the usable
memory, and that should give a clue...

Nevertheless, I can see some merit here, but I am arguing that there
is simply no good way to handle this without admin involvement,
unless we want to risk other and much more subtle breakage where the
application really expects that it can consume the preallocated hugetlb
pool completely. And I would even argue that the latter is more probable
than an unintended memory-failure reboot cycle. If somebody can tune the
hugetlb pool dynamically, I would recommend doing so from an init script.
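
To illustrate the init script idea, here is a minimal sketch that sizes
the pool from the memory actually present at boot rather than from a
fixed command line value. It assumes the default huge page size and the
standard /proc/meminfo and /proc/sys/vm/nr_hugepages interfaces. The
RESERVE_KB value and the function names are hypothetical and would need
tuning for a real system.

```python
#!/usr/bin/env python3
"""Sketch: size the hugetlb pool at boot from the memory actually present."""
import re

MEMINFO = "/proc/meminfo"
NR_HUGEPAGES = "/proc/sys/vm/nr_hugepages"

# Hypothetical policy knob: memory (in kB) to leave for everything that is
# not the hugetlb pool. 4 TiB here, mirroring the example in this thread.
RESERVE_KB = 4 * 1024**3


def parse_meminfo(text):
    """Return /proc/meminfo fields as a dict of name -> integer value."""
    fields = {}
    for line in text.splitlines():
        m = re.match(r"(\w+):\s+(\d+)", line)
        if m:
            fields[m.group(1)] = int(m.group(2))
    return fields


def pool_pages(mem_total_kb, hugepage_kb, reserve_kb):
    """Number of huge pages that fit once reserve_kb is set aside."""
    usable_kb = max(mem_total_kb - reserve_kb, 0)
    return usable_kb // hugepage_kb


def resize_pool():
    """Read the current memory size and request a matching pool (needs root)."""
    with open(MEMINFO) as f:
        info = parse_meminfo(f.read())
    pages = pool_pages(info["MemTotal"], info["Hugepagesize"], RESERVE_KB)
    with open(NR_HUGEPAGES, "w") as f:
        f.write(str(pages))
    # The kernel may allocate fewer pages than requested, so read the
    # value back and let the caller decide whether the result is acceptable.
    with open(NR_HUGEPAGES) as f:
        return pages, int(f.read())
```

Calling resize_pool() as root from an early boot script would shrink
the request automatically after a memory module is lost. The application
still has to cope with a pool smaller than it expected, which is exactly
the admin decision discussed above.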
-- 
Michal Hocko
SUSE Labs
