Hello,

Lemme elaborate just a bit more.

On Tue, Jul 24, 2018 at 07:28:20AM -0700, Tejun Heo wrote:
> Hello,
>
> On Tue, Jul 24, 2018 at 04:25:54PM +0200, Michal Hocko wrote:
> > I am sorry but I do not follow. Besides that modeling the behavior on
> > panic_on_oom doesn't really sound very appealing to me. The knob is a
> > crude hack mostly motivated by debugging (at least its non-global
> > variants).
>
> Hmm... we actually do use that quite a bit in production (moving away
> from it gradually).

So, the reason panic_on_oom is used is very similar to the reason one
would want group oom kill - workload integrity after an oom kill.
panic_on_oom is an expensive way of avoiding partial kills and the
resulting possibly inconsistent state. Group oom can scope that down
so that we can maintain integrity per-application or domain rather
than at system level, making it way cheaper.

> > So can we get back to workloads and shape the semantic on top of that
> > please?
>
> I didn't realize we were that off track. Don't both map to what we
> were discussing almost perfectly?

I guess the reason why panic_on_oom developed the two behaviors is
likely that the initial behavior - panicking on any oom - was too
inflexible. We're scoping it down, so whatever problems we used to
have with panic_on_oom are less pronounced with group oom. So, I don't
think this matters all that much in terms of practical usefulness.
Both always killing and factoring in oom origin seem fine to me.
Let's just pick one.

Thanks.

--
tejun