On Tue, Apr 30, 2019 at 01:41:16PM -0700, Vaibhav Rustagi wrote:
> On Wed, Apr 24, 2019 at 11:53 AM Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> >
> > A: Because it messes up the order in which people normally read text.
> > Q: Why is top-posting such a bad thing?
> > A: Top-posting.
> > Q: What is the most annoying thing in e-mail?
> >
> > A: No.
> > Q: Should I include quotations after my reply?
> >
> > http://daringfireball.net/2007/07/on_top
> >
> > On Wed, Apr 24, 2019 at 10:35:51AM -0700, Vaibhav Rustagi wrote:
> > > Apologies for sending a non-plain-text e-mail previously.
> > >
> > > This issue is encountered in the actual production environment by our
> > > customers, where they are constantly creating containers and tearing
> > > them down (using Kubernetes for the workload). Kubernetes constantly
> > > reads the memory.stat file for memory accounting, and over time
> > > (around a week) the memcgs accumulate, the response time for reading
> > > memory.stat increases, and customer applications get affected.
> >
> > Please define "affected".  Their apps still run properly, so all should
> > be fine; it would be Kubernetes that sees the slowdowns, not the
> > application.  How exactly does this show up to an end-user?
> >
> Over time, as the zombie cgroups accumulate, kubelet (the process doing
> the frequent memory.stat reads) becomes more CPU intensive and all other
> user containers running on the same machine are starved for CPU. It
> affects the user containers in at least two ways that we know of:
> (1) Users experience liveness probe failures because their applications
> do not complete in the expected amount of time.

"expected amount of time" is interesting to claim in a shared
environment :)

> (2) New user jobs cannot be scheduled.

Really?  This slows down starting new processes?  Or is this just
slowing down your system overall?

> There is certainly a possibility of reducing the adverse effect at the
> Kubernetes level as well, and we are investigating that too. But the
> requested kernel patches help to not exacerbate the problem.

I understand this is a kernel issue, but if you see this happen, just
updating to a modern kernel should be fine.

> > > The repro steps mentioned previously were just used for testing the
> > > patches locally.
> > >
> > > Yes, we are moving to 4.19 but are also supporting 4.14 till Jan 2020
> > > (so the production environment will still contain the 4.14 kernel).
> >
> > If you are already moving to 4.19, this seems like as good a reason as
> > any (hint, I can give you more) to move off of 4.14 at this point in
> > time.  There's no real need to keep 4.14 around, given that you don't
> > have any out-of-tree code in your kernels, so it should be simple to
> > just update on the next reboot, right?
> >
> Based on past experience, a major kernel upgrade sometimes introduces
> new regressions as well. So while we are working to roll out kernel
> 4.19, it may not be a practical solution for all the users.

If you are not doing the exact same testing scenario for a new 4.14.y
kernel release as you are doing for a move to 4.19.y, then your "roll
out" process is broken.

Given that 4.19.y is now 6 months old, I would have expected any "new
regressions" to have already been reported.  Please just use a new
kernel, and if you have regressions, we will work to address them.

thanks,

greg k-h
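
For anyone who wants to see the memory.stat slowdown described above
without a week of production churn, a rough sketch along the following
lines should show the effect on a cgroup v1 kernel.  To be clear, this is
not the repro referenced in the thread: the mount point, churn counts, and
helper names below are assumptions, it needs root, and it only illustrates
the mechanism (zombie memcgs kept alive by page cache charged to deleted
cgroups).

    #!/usr/bin/env python3
    # Sketch: churn memory cgroups and time reads of the root memory.stat.
    # Assumptions (not from the thread): cgroup v1 memory controller is
    # mounted at /sys/fs/cgroup/memory, run as root, counts are arbitrary.
    import os
    import time

    MEMCG_ROOT = "/sys/fs/cgroup/memory"   # assumed v1 mount point
    CHURN = 2000                            # cgroups created/removed per round
    ROUNDS = 5

    def churn_cgroups(n):
        """Create n child memcgs, charge a little page cache to each, remove them.

        Page cache charged to a cgroup can keep it around as an offline
        ("zombie") memcg after rmdir, which is what makes later hierarchical
        memory.stat reads more expensive on affected kernels.
        """
        for i in range(n):
            cg = os.path.join(MEMCG_ROOT, "churn-%d" % i)
            os.mkdir(cg)
            # Move ourselves into the new cgroup so our allocations are
            # charged to it.
            with open(os.path.join(cg, "cgroup.procs"), "w") as f:
                f.write(str(os.getpid()))
            # Create some page cache while charged to this cgroup.
            with open("/tmp/churn-%d" % i, "w") as f:
                f.write("x" * 4096)
            # Move back to the root cgroup and remove the now-empty child.
            with open(os.path.join(MEMCG_ROOT, "cgroup.procs"), "w") as f:
                f.write(str(os.getpid()))
            os.rmdir(cg)

    def time_stat_read():
        """Time one read of the root memory.stat, like kubelet does repeatedly."""
        start = time.monotonic()
        with open(os.path.join(MEMCG_ROOT, "memory.stat")) as f:
            f.read()
        return time.monotonic() - start

    if __name__ == "__main__":
        for r in range(ROUNDS):
            churn_cgroups(CHURN)
            print("round %d: memory.stat read took %.2f ms"
                  % (r, time_stat_read() * 1e3))

On an affected kernel the reported read time should creep upward round
after round as offline memcgs pile up; on a fixed kernel it should stay
roughly flat.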