On Thu 18-01-18 12:01:32, Eric Anholt wrote:
> Michal Hocko <mhocko at kernel.org> writes:
>
> > On Thu 18-01-18 18:00:06, Michal Hocko wrote:
> >> On Thu 18-01-18 11:47:48, Andrey Grodzovsky wrote:
> >> > Hi, this series is a revised version of an RFC sent by Christian König
> >> > a few years ago. The original RFC can be found at
> >> > https://lists.freedesktop.org/archives/dri-devel/2015-September/089778.html
> >> >
> >> > This is the same idea and I've just adressed his concern from the original RFC
> >> > and switched to a callback into file_ops instead of a new member in struct file.
> >>
> >> Please add the full description to the cover letter and do not make
> >> people hunt links.
> >>
> >> Here is the origin cover letter text
> >> : I'm currently working on the issue that when device drivers allocate memory on
> >> : behalf of an application the OOM killer usually doesn't knew about that unless
> >> : the application also get this memory mapped into their address space.
> >> :
> >> : This is especially annoying for graphics drivers where a lot of the VRAM
> >> : usually isn't CPU accessible and so doesn't make sense to map into the
> >> : address space of the process using it.
> >> :
> >> : The problem now is that when an application starts to use a lot of VRAM those
> >> : buffers objects sooner or later get swapped out to system memory, but when we
> >> : now run into an out of memory situation the OOM killer obviously doesn't knew
> >> : anything about that memory and so usually kills the wrong process.
> >
> > OK, but how do you attribute that memory to a particular OOM killable
> > entity? And how do you actually enforce that those resources get freed
> > on the oom killer action?
> >
> >> : The following set of patches tries to address this problem by introducing a per
> >> : file OOM badness score, which device drivers can use to give the OOM killer a
> >> : hint how many resources are bound to a file descriptor so that it can make
> >> : better decisions which process to kill.
> >
> > But files are not killable, they can be shared... In other words this
> > doesn't help the oom killer to make an educated guess at all.
>
> Maybe some more context would help the discussion?
>
> The struct file in patch 3 is the DRM fd. That's effectively "my
> process's interface to talking to the GPU" not "a single GPU resource".
> Once that file is closed, all of the process's private, idle GPU buffers
> will be immediately freed (this will be most of their allocations), and
> some will be freed once the GPU completes some work (this will be most
> of the rest of their allocations).
>
> Some GEM BOs won't be freed just by closing the fd, if they've been
> shared between processes. Those are usually about 8-24MB total in a
> process, rather than the GBs that modern apps use (or that our testcases
> like to allocate and thus trigger oomkilling of the test harness instead
> of the offending testcase...)
>
> Even if we just had the private+idle buffers being accounted in OOM
> badness, that would be a huge step forward in system reliability.

OK, in that case I would propose a different approach. We already have
rss_stat, so why don't we simply add a new counter there, MM_KERNELPAGES,
and consider it in oom_badness? The rule would be that such memory is
bound to the process lifetime. I guess we will find more users for this
later.
-- 
Michal Hocko
SUSE Labs
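
To make the counter idea above concrete, here is a rough userspace sketch, not kernel code. It assumes a simplified per-mm counter array modelled on rss_stat and a badness score that simply sums the counters (the real oom_badness also accounts for page tables and applies oom_score_adj). MM_KERNELPAGES is the name proposed in this mail; struct mm_model and oom_badness_model() are hypothetical names used only for illustration.

```c
/*
 * Userspace model only: mimics the shape of the per-mm rss_stat counters.
 * MM_KERNELPAGES is the proposed (not existing) counter from this mail.
 */
enum {
	MM_FILEPAGES,
	MM_ANONPAGES,
	MM_SWAPENTS,
	MM_SHMEMPAGES,
	MM_KERNELPAGES,		/* hypothetical: driver memory bound to the process lifetime */
	NR_MM_COUNTERS
};

/* Hypothetical stand-in for the counter part of struct mm_struct. */
struct mm_model {
	long count[NR_MM_COUNTERS];
};

/* A driver would call this when it allocates (+) or frees (-) pages on behalf of a process. */
static void add_mm_counter(struct mm_model *mm, int member, long value)
{
	mm->count[member] += value;
}

/* Clamp at zero so a transiently negative counter never lowers the score. */
static long get_mm_counter(const struct mm_model *mm, int member)
{
	long val = mm->count[member];

	return val < 0 ? 0 : val;
}

/* Simplified badness: total pages charged to the process, kernel pages included. */
static long oom_badness_model(const struct mm_model *mm)
{
	long points = 0;
	int i;

	for (i = 0; i < NR_MM_COUNTERS; i++)
		points += get_mm_counter(mm, i);

	return points;
}
```

In this model a driver would charge its private buffers with add_mm_counter(mm, MM_KERNELPAGES, nr_pages) at allocation time and uncharge them when the fd is closed, so a process holding a lot of unmapped GPU memory scores higher without any per-file badness hook.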