Michal Hocko wrote:
> > In this case, the oom reaper has ignored the next victim and doesn't do
> > anything; the simple race has prevented it from zapping memory and does
> > not reduce the livelock probability.
> >
> > This can be solved either by queueing mm's to reap or involving the oom
> > reaper into the oom killer synchronization itself.
>
> as we have already discussed previously oom reaper is really tricky to
> be called from the direct OOM context. I will go with queuing.
>

OK. But it is not easy to build a reliable OOM-reap queuing chain. I think
that a dedicated kernel thread which does both the OOM-kill operation and
the OOM-reap operation will be needed. That will also handle the "sleeping
for too long with oom_lock held after sending SIGKILL" problem.

> > I'm baffled by any reference to "memcg oom heavy loads", I don't
> > understand this paragraph, sorry. If a memcg is oom, we shouldn't be
> > disrupting the global runqueue by running oom_reaper at a high priority.
> > The disruption itself is not only in first wakeup but also in how long the
> > reaper can run and when it is rescheduled: for a lot of memory this is
> > potentially long. The reaper is best-effort, as the changelog indicates,
> > and we shouldn't have a reliance on this high priority: oom kill exiting
> > can't possibly be expected to be immediate. This high priority should be
> > removed so memcg oom conditions are isolated and don't affect other loads.
>
> If this is a concern then I would be tempted to simply disable oom
> reaper for memcg oom altogether. For me it is much more important that
> the reaper, even though a best effort, is guaranteed to schedule if
> something goes terribly wrong on the machine.

I think that if something goes terribly wrong on the machine, a guarantee
for scheduling the reaper will not help unless we build a reliable queuing
chain. Building a reliable queuing chain will break some of the assumptions
provided by the current behavior.
For me, a guarantee for scheduling the next OOM-kill operation (with
globally opening some or all of the memory reserves) before building a
reliable queuing chain is much more important.

> But ohh well... I will queue up a patch to do this
> on top. I plan to repost the full patchset shortly.

Maybe we all agree with introducing the OOM reaper without queuing, but I
do want to see a guarantee for scheduling the next OOM-kill operation
before trying to build a reliable queuing chain.