Re: [RFC PATCH 1/1] mm/hugetlb mm/oom_kill: Add support for reclaiming hugepages on OOM events.

"Liam R. Howlett" <Liam.Howlett@xxxxxxxxxx> · Tue, 1 Aug 2017 10:41:56 -0400

* Michal Hocko <mhocko@xxxxxxxxxx> [170801 04:30]:
> On Mon 31-07-17 21:11:25, Liam R. Howlett wrote:
> > * Michal Hocko <mhocko@xxxxxxxxxx> [170731 10:08]:
> > > On Mon 31-07-17 09:56:48, Liam R. Howlett wrote:
> [...]
> > > > No, I'm talking about failed memory for whatever reason.  The system
> > > > reboots by hardware means (I believe the memory controller) and
> > > > removes the memory on that failed module from the pool.  Now you
> > > > effectively have a system with less memory than before, which
> > > > invalidates your configuration.  Is it worthwhile to have Linux
> > > > successfully boot when the system attempts to recover from a failure?
> > > 
> > > Certainly yes, but if you boot with much less memory and you want to use
> > > hugetlb pages then you have to reconsider and maybe even reconfigure
> > > your workload to reflect the new conditions. So I am not really sure this
> > > can be fully automated.
> > > 
> > 
> > I agree.  A reconfiguration or repair is required to have optimum
> > performance.  Would you agree that having a functioning system is better
> > than a reboot loop or a hang on a panic?  It's also easier to reconfigure
> > a system that's booting.
> 
> Absolutely. The thing is that I am not even sure that the hugetlb
> problem is real. Using hugetlb reservation from the boot command line
> parameter is easily fixable (just update the boot command line from the
> boot loader). From my experience the init time hugetlb initialization
> is usually trying to be portable and as such configures a certain
> percentage of the available memory for hugetlb (some of them even on a per
> NUMA node basis). Even if somebody uses hard coded values then this is
> something that is fixable during recovery.

This was my thought too when I was first assigned the bug that led to my
last patch, which added a log message for hugetlb allocation failures.
During our discussion, though, I was assigned two more near-identical
bugs.  From what I can tell, people following a setup guide either do not
know how to edit the grub command line once they are in a boot loop, or do
not have a decent enough console setup to do so.  Worse yet, all three
bugs were filed as kernel bugs because people didn't even realise it was a
setup issue.  I think the sysctl way of setting the hugetlb pool is the
safest, but since we provide a kernel command line way of setting it, it
seems reasonable to make the user error as transparent as possible.  This
RFC was an extension of looking at how people arrive at an OOM error on
boot when using hugetlb.
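
To illustrate the percentage-based sizing mentioned above, combined with
the sysctl route rather than the command line, here is a minimal sketch.
The helper names (parse_meminfo, hugepages_for_percent) and the 50% figure
are made up for illustration; only the MemTotal and Hugepagesize fields of
/proc/meminfo are assumed to exist.  It is not taken from any setup guide
or from the RFC.

```python
import re


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: integer value}.

    Sizes are reported in kB; plain counters such as HugePages_Total
    carry no unit and are returned as-is.
    """
    fields = {}
    for line in text.splitlines():
        m = re.match(r"(\w+(?:\([^)]*\))?):\s+(\d+)", line)
        if m:
            fields[m.group(1)] = int(m.group(2))
    return fields


def hugepages_for_percent(meminfo_text, percent):
    """Number of default-size huge pages that fit in `percent` of MemTotal."""
    info = parse_meminfo(meminfo_text)
    budget_kb = info["MemTotal"] * percent // 100
    return budget_kb // info["Hugepagesize"]


if __name__ == "__main__":
    # Hypothetical policy: reserve half of whatever memory actually came up.
    with open("/proc/meminfo") as f:
        print(hugepages_for_percent(f.read(), 50))
```

The printed value could then be written to /proc/sys/vm/nr_hugepages (as
root) from an init script.  Because the count is derived from the memory
present at boot, a machine that comes back with a failed module would ask
for a proportionally smaller pool instead of a fixed one.  The kernel may
still allocate fewer pages than requested, so the result should be checked
afterwards.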

> 
> That being said I am not sure you are focusing on a real problem while
> the solution you are proposing might break an existing userspace. Please
> try to play with your memory recovery feature some more with real
> hugetlb usecases (Oracle DB is a heavy user AFAIR) and see what real
> life problems might happen, and we can revisit potential solutions with
> more data in hand.

Okay, thank you.  I will re-examine the issue and see about a different
approach.  I appreciate the time you have taken to look at my RFC.

Thanks,
Liam
