On Thu 09-06-22 16:29:46, Christian König wrote:
[...]
> Is that a show stopper? How should we address this?

This is a hard problem to deal with and I am not sure this simple
solution is really a good fit. Not only because of the memcg side of
things. I have my doubts that sparse file handling is ok as well.

I do realize this is a long term problem and there is a demand for some
solution at least. I am not sure how to deal with shared resources
myself. The best approximation I can come up with is to limit the scope
of the damage to a memcg context. One idea I was playing with (but never
convinced myself it is really worth it) is to allow a new mode of oom
victim selection for the global oom event. It would be opt-in and the
victim would be selected from the biggest leaf memcg (or the whole memcg
would be killed if it has group_oom configured). That would address at
least some of the accounting issues because charges are better tracked
than per process memory consumption. It is a crude and ugly hack and it
doesn't solve the underlying problem, as shared resources are not
guaranteed to be freed when processes die, but maybe it would be just
slightly better than the existing scheme, which is clearly lagging
behind existing userspace.
-- 
Michal Hocko
SUSE Labs
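
For illustration only, the opt-in selection mode described above could be modelled in userspace roughly as follows. This is a toy sketch, not kernel code: the Memcg class, its fields, and select_victims() are hypothetical names invented for this example. The message does not say how a single victim is picked inside the chosen leaf, so the sketch assumes the task with the largest per-process usage is taken.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional, Tuple


@dataclass
class Memcg:
    """Toy stand-in for a memory cgroup (hypothetical, not a kernel struct)."""
    name: str
    charged_bytes: int = 0      # memory charged to this group
    group_oom: bool = False     # models the group_oom setting
    pids: List[int] = field(default_factory=list)
    children: List["Memcg"] = field(default_factory=list)


def leaves(root: Memcg) -> Iterator[Memcg]:
    """Yield every leaf memcg (a group with no children) under root."""
    if not root.children:
        yield root
        return
    for child in root.children:
        yield from leaves(child)


def select_victims(
    root: Memcg, usage_by_pid: Dict[int, int]
) -> Tuple[Optional[Memcg], List[int]]:
    """Pick the biggest leaf memcg by charge, then choose what to kill.

    If the leaf has group_oom set, every task in it is returned.
    Otherwise a single task is returned; choosing the one with the
    largest per-process usage is an assumption of this sketch.
    """
    candidates = [m for m in leaves(root) if m.pids]
    if not candidates:
        return None, []
    biggest = max(candidates, key=lambda m: m.charged_bytes)
    if biggest.group_oom:
        return biggest, list(biggest.pids)
    victim = max(biggest.pids, key=lambda pid: usage_by_pid.get(pid, 0))
    return biggest, [victim]
```

Note that the leaf is chosen by its charge, not by summing per-process numbers, which is the point of the idea: charges are tracked better than per-process consumption. The sketch shares the limitation named above: killing the returned tasks does not guarantee that shared resources are freed.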