On Tue, Apr 30, 2019 at 01:41:16PM -0700, Vaibhav Rustagi wrote:
> On Wed, Apr 24, 2019 at 11:53 AM Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> >
> > A: Because it messes up the order in which people normally read text.
> > Q: Why is top-posting such a bad thing?
> > A: Top-posting.
> > Q: What is the most annoying thing in e-mail?
> >
> > A: No.
> > Q: Should I include quotations after my reply?
> >
> > http://daringfireball.net/2007/07/on_top
> >
> > On Wed, Apr 24, 2019 at 10:35:51AM -0700, Vaibhav Rustagi wrote:
> > > Apologies for sending a non-plain-text e-mail previously.
> > >
> > > This issue is encountered in the actual production environment by our
> > > customers, where they are constantly creating containers and tearing
> > > them down (using Kubernetes for the workload). Kubernetes constantly
> > > reads the memory.stat file for memory accounting, and over time
> > > (around a week) the memcgs accumulate, the response time for reading
> > > memory.stat increases, and customer applications get affected.
> >
> > Please define "affected".  Their apps still run properly, so all should
> > be fine; it would be Kubernetes that sees the slowdowns, not the
> > application.  How exactly does this show up to an end-user?
> >
> Over time, as the zombie cgroups accumulate, kubelet (the process doing
> the frequent memory.stat reads) becomes more CPU intensive and all other
> user containers running on the same machine are starved for CPU. It
> affects the user containers in at least two ways that we know of:
> (1) Users experience liveness probe failures because their applications
> do not complete in the expected amount of time.

"expected amount of time" is interesting to claim in a shared
environment :)

> (2) New user jobs cannot be scheduled.

Really?  This slows down starting new processes?  Or is this just
slowing down your system overall?

> There is certainly a possibility of reducing the adverse effect at the
> Kubernetes level as well, and we are investigating that too. But the
> requested kernel patches help to not exacerbate the problem.

I understand this is a kernel issue, but if you see this happen, just
updating to a modern kernel should be fine.

> > > The repro steps mentioned previously were just used for testing the
> > > patches locally.
> > >
> > > Yes, we are moving to 4.19 but are also supporting 4.14 till Jan 2020
> > > (so the production environment will still contain the 4.14 kernel).
> >
> > If you are already moving to 4.19, this seems like as good a reason as
> > any (hint, I can give you more) to move off of 4.14 at this point in
> > time.  There's no real need to keep 4.14 around, given that you don't
> > have any out-of-tree code in your kernels, so it should be simple to
> > just update on the next reboot, right?
> >
> Based on past experience, a major kernel upgrade sometimes introduces
> new regressions as well. So while we are working to roll out kernel
> 4.19, it may not be a practical solution for all the users.

If you are not doing the exact same testing scenario for a new 4.14.y
kernel release as you are doing for a move to 4.19.y, then your "roll
out" process is broken.

Given that 4.19.y is now 6 months old, I would have expected any "new
regressions" to have already been reported.  Please just use a new
kernel, and if you have regressions, we will work to address them.

thanks,

greg k-h
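
For anyone who wants to see the memory.stat slowdown described above
without a week of production churn, a rough sketch along the following
lines should show the effect on a cgroup v1 kernel.  To be clear, this is
not the repro referenced in the thread: the mount point, churn counts, and
helper names below are assumptions, it needs root, and it only illustrates
the mechanism (zombie memcgs kept alive by page cache charged to deleted
cgroups).

    #!/usr/bin/env python3
    # Sketch: churn memory cgroups and time reads of the root memory.stat.
    # Assumptions (not from the thread): cgroup v1 memory controller is
    # mounted at /sys/fs/cgroup/memory, run as root, counts are arbitrary.
    import os
    import time

    MEMCG_ROOT = "/sys/fs/cgroup/memory"   # assumed v1 mount point
    CHURN = 2000                            # cgroups created/removed per round
    ROUNDS = 5

    def churn_cgroups(n):
        """Create n child memcgs, charge a little page cache to each, remove them.

        Page cache charged to a cgroup can keep it around as an offline
        ("zombie") memcg after rmdir, which is what makes later hierarchical
        memory.stat reads more expensive on affected kernels.
        """
        for i in range(n):
            cg = os.path.join(MEMCG_ROOT, "churn-%d" % i)
            os.mkdir(cg)
            # Move ourselves into the new cgroup so our allocations are
            # charged to it.
            with open(os.path.join(cg, "cgroup.procs"), "w") as f:
                f.write(str(os.getpid()))
            # Create some page cache while charged to this cgroup.
            with open("/tmp/churn-%d" % i, "w") as f:
                f.write("x" * 4096)
            # Move back to the root cgroup and remove the now-empty child.
            with open(os.path.join(MEMCG_ROOT, "cgroup.procs"), "w") as f:
                f.write(str(os.getpid()))
            os.rmdir(cg)

    def time_stat_read():
        """Time one read of the root memory.stat, like kubelet does repeatedly."""
        start = time.monotonic()
        with open(os.path.join(MEMCG_ROOT, "memory.stat")) as f:
            f.read()
        return time.monotonic() - start

    if __name__ == "__main__":
        for r in range(ROUNDS):
            churn_cgroups(CHURN)
            print("round %d: memory.stat read took %.2f ms"
                  % (r, time_stat_read() * 1e3))

On an affected kernel the reported read time should creep upward round
after round as offline memcgs pile up; on a fixed kernel it should stay
roughly flat.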