On Mon, Jul 16, 2018 at 04:12:49PM -0700, Andrew Morton wrote: > On Mon, 16 Jul 2018 13:50:58 +0200 Michal Hocko <mhocko at kernel.org> wrote: > > > From: Michal Hocko <mhocko at suse.com> > > > > There are several blockable mmu notifiers which might sleep in > > mmu_notifier_invalidate_range_start and that is a problem for the > > oom_reaper because it needs to guarantee a forward progress so it cannot > > depend on any sleepable locks. > > > > Currently we simply back off and mark an oom victim with blockable mmu > > notifiers as done after a short sleep. That can result in selecting a > > new oom victim prematurely because the previous one still hasn't torn > > its memory down yet. > > > > We can do much better though. Even if mmu notifiers use sleepable locks > > there is no reason to automatically assume those locks are held. > > Moreover majority of notifiers only care about a portion of the address > > space and there is absolutely zero reason to fail when we are unmapping an > > unrelated range. Many notifiers do really block and wait for HW which is > > harder to handle and we have to bail out though. > > > > This patch handles the low hanging fruid. __mmu_notifier_invalidate_range_start > > gets a blockable flag and callbacks are not allowed to sleep if the > > flag is set to false. This is achieved by using trylock instead of the > > sleepable lock for most callbacks and continue as long as we do not > > block down the call chain. > > I assume device driver developers are wondering "what does this mean > for me". As I understand it, the only time they will see > blockable==false is when their driver is being called in response to an > out-of-memory condition, yes? So it is a very rare thing. I can't say for everyone, but at least for me (mlx5), it is not rare event. I'm seeing OOM very often while I'm running my tests in low memory VMs. Thanks > > Any suggestions regarding how the driver developers can test this code > path? I don't think we presently have a way to fake an oom-killing > event? Perhaps we should add such a thing, given the problems we're > having with that feature. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 801 bytes Desc: not available URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20180717/6686ee80/attachment-0001.sig>