On Thu, Apr 11, 2019 at 10:22:22AM -0400, Waiman Long wrote:
> On 04/10/2019 05:38 PM, Chris Down wrote:
> > Hi Waiman,
> >
> > Waiman Long writes:
> >> The current control mechanism for memory cgroup v2 lumps all the memory
> >> together irrespective of the type of memory objects. However, there
> >> are cases where users may have more concern about one type of memory
> >> usage than the others.
> >
> > I have concerns about this implementation, and the overall idea in
> > general. We had per-class memory limiting in the cgroup v1 API, and it
> > ended up really poorly, and resulted in a situation where it's really
> > hard to compose a usable system out of it any more.
> >
> > A major part of the restructure in cgroup v2 has been to simplify
> > things so that it's more easy to understand for service owners and
> > sysadmins. This was intentional, because otherwise the system overall
> > is hard to make into something that does what users *really* want, and
> > users end up with a lot of confusion, misconfiguration, and generally
> > an inability to produce a coherent system, because we've made things
> > too hard to piece together.
> >
> > In general, for purposes of resource control, I'm not convinced that
> > it makes sense to limit only one kind of memory based on prior
> > experience with v1. Can you give a production use case where this
> > would be a clear benefit, traded off against the increase in
> > complexity to the API?
> >
>
> As I said in my previous email on this thread, the customer considered
> pages cache as common goods not fully representing the "real" memory
> footprint used by an application. Depending on actual mix of
> applications running on a system, there are certainly cases where their
> view is correct. In fact, what the customer is asking for is not even
> provided by the v1 API even with that many classes of memory that you
> can choose from.

Hello Waiman!
If I understand the case correctly, the customer wants to get signaled when
anon memory consumption reaches a certain point, right? I doubt that the idea
is to keep only a certain amount of anon memory resident and swap out
everything else. So, probably, the reaction will be to kill the application.

If so, do we really need a control? Maybe polling memory.stat::anon will be
enough? If not, I can imagine some sort of threshold notification mechanism
on top of memory.stat, similar to what is built on top of psi.

Tracking the size of anon memory is definitely useful for spotting userspace
leaks and spikes, so an ability to set up thresholds and get events sounds
appealing to me.

Thanks!

Roman
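
P.S. To illustrate the polling idea, here is a rough userspace sketch, not a
proposal for any kernel interface. It assumes the cgroup v2 memory.stat file
with its flat "key value" lines and an "anon" entry in bytes. The function
names, the polling interval, and the callback are all made up for the example.

```python
import time
from pathlib import Path


def parse_memory_stat(text):
    """Parse the flat "key value" lines of a cgroup v2 memory.stat file."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip anything that is not a simple "name integer" pair.
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def anon_over_threshold(text, threshold_bytes):
    """Return (anon_bytes, exceeded) for one memory.stat snapshot."""
    anon = parse_memory_stat(text).get("anon", 0)
    return anon, anon >= threshold_bytes


def watch_anon(stat_path, threshold_bytes, interval_s=1.0, on_exceed=print):
    """Poll stat_path until anon reaches threshold_bytes, then call on_exceed.

    stat_path is expected to point at a cgroup's memory.stat file; the
    caller decides what on_exceed does (log, notify, kill the workload).
    """
    path = Path(stat_path)
    while True:
        anon, exceeded = anon_over_threshold(path.read_text(), threshold_bytes)
        if exceeded:
            on_exceed(anon)
            return anon
        time.sleep(interval_s)
```

Something like watch_anon(path_to_memory_stat, 2 << 30) would block until the
group's anon usage reaches 2 GiB. Polling can miss short spikes between
samples, which is where an event-based threshold mechanism would do better.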