On Tue 26-07-16 06:40:54, Tetsuo Handa wrote:
> Michal Hocko wrote:
> > On Mon 25-07-16 23:02:35, Tetsuo Handa wrote:
> > > Michal Hocko wrote:
> > > > On Mon 25-07-16 20:47:03, Tetsuo Handa wrote:
> > > > > Michal Hocko wrote:
> > > > > > On Mon 25-07-16 20:07:11, Tetsuo Handa wrote:
[...]
> > > > > > > Then, what are the advantages of allowing only OOM victims access
> > > > > > > to memory reserves after they have left exit_mm()?
> > > > > >
> > > > > > Because they might need it in order to move on... Say you want to
> > > > > > close all the files which might release a considerable amount of
> > > > > > memory or any other post exit_mm() resources.
> > > > > >
> > > > > OOM victims might need memory reserves in order to move on, but non
> > > > > OOM victims might also need memory reserves in order to move on. And
> > > > > non OOM victims might be blocking OOM victims via locks.
> > > > >
> > > > Yes, that might be true, but OOM situations are rare events and quite
> > > > reduced in scope. Considering all exiting tasks is more dangerous
> > > > because they might deplete those memory reserves easily.
> > > >
> > > Why do you assume that we grant all of memory reserves?
> >
> > I've said deplete "those memory reserves". It would be just too easy to
> > exit many tasks at once and use up that memory.
> But that will not be a problem unless an OOM event occurs.

And then it might make the problem just worse. I do not want to
speculate about adversary workloads, but this just sounds like a bad
idea in general...

> Even if some portion of memory reserves is granted, killed/exiting
> tasks are unlikely to access memory reserves. If killed/exiting tasks
> need to deplete that portion of memory reserves, it is reasonable to
> select an OOM victim.
> >
> > > I'm suggesting that we grant a portion of memory reserves.
> >
> > Which doesn't solve anything because it will always be a finite
> > resource which can get depleted. This is basically the same as the oom
> > victim (ab)using reserves, except that OOM is much less likely and it
> > is under control of the kernel which task gets killed.
>
> Given that an OOM event is much less likely, maybe we do not even need
> to use task_struct->oom_reaper_list and can instead use a global
> variable
>
>	static struct mm_struct *current_oom_mm;
>
> and wait for current_oom_mm to become NULL regardless of the domain in
> which an OOM event occurred (just as we changed to use the global
> oom_lock for preventing concurrent OOM killer invocations)?

Heh, this is very similar to what I used to have there in the beginning,
and you have pushed to make it a list.

> Then, we can determine OOM_SCAN_ABORT by inspecting that variable. This
> change may defer invocation of the OOM killer in different domains, but
> concurrent OOM events in different domains will also be much less
> likely?

Considering that there may be hundreds of memory cgroups configured, I
expect we will be pushed towards more parallelism in the future.

Anyway, I think we went largely off topic.
-- 
Michal Hocko
SUSE Labs
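
For illustration, here is a minimal sketch of what the current_oom_mm
approach discussed above could look like. Only current_oom_mm and the
OOM_SCAN_ABORT / OOM_SCAN_CONTINUE values come from the thread itself;
the wait queue and the helper names (set_current_oom_mm() and friends)
are assumptions made up for this sketch, not actual kernel code.

#include <linux/atomic.h>
#include <linux/compiler.h>
#include <linux/mm_types.h>
#include <linux/oom.h>
#include <linux/types.h>
#include <linux/wait.h>

/* Single global marker: "an OOM victim's mm is still being torn down". */
static struct mm_struct *current_oom_mm;
static DECLARE_WAIT_QUEUE_HEAD(current_oom_mm_wait);

/* Mark a freshly selected victim's mm; fails if one is already pending. */
static bool set_current_oom_mm(struct mm_struct *mm)
{
	return cmpxchg(&current_oom_mm, NULL, mm) == NULL;
}

/* Called once the victim's mm has been reaped or released. */
static void clear_current_oom_mm(void)
{
	WRITE_ONCE(current_oom_mm, NULL);
	wake_up_all(&current_oom_mm_wait);
}

/*
 * The victim scan can abort while a victim is still pending, regardless
 * of which oom domain (memcg or global) the current OOM event is in.
 */
static enum oom_scan_t oom_scan_check_pending(void)
{
	return READ_ONCE(current_oom_mm) ? OOM_SCAN_ABORT : OOM_SCAN_CONTINUE;
}

/* Allocation paths wait for the pending victim instead of picking another. */
static void wait_for_current_oom_mm(void)
{
	wait_event(current_oom_mm_wait, !READ_ONCE(current_oom_mm));
}

A single global like this serializes OOM handling across all oom
domains, which is exactly the trade-off noted above for setups with
hundreds of memory cgroups.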