On Fri, Aug 24, 2018 at 02:33:41PM +0200, Michal Hocko wrote: > On Fri 24-08-18 14:18:44, Christian König wrote: > > Am 24.08.2018 um 14:03 schrieb Michal Hocko: > > > On Fri 24-08-18 13:57:52, Christian König wrote: > > > > Am 24.08.2018 um 13:52 schrieb Michal Hocko: > > > > > On Fri 24-08-18 13:43:16, Christian König wrote: > > > [...] > > > > > > That won't work like this there might be multiple > > > > > > invalidate_range_start()/invalidate_range_end() pairs open at the same time. > > > > > > E.g. the lock might be taken recursively and that is illegal for a > > > > > > rw_semaphore. > > > > > I am not sure I follow. Are you saying that one invalidate_range might > > > > > trigger another one from the same path? > > > > No, but what can happen is: > > > > > > > > invalidate_range_start(A,B); > > > > invalidate_range_start(C,D); > > > > ... > > > > invalidate_range_end(C,D); > > > > invalidate_range_end(A,B); > > > > > > > > Grabbing the read lock twice would be illegal in this case. > > > I am sorry but I still do not follow. What is the context the two are > > > called from? > > > > I don't have the slightest idea. > > > > > Can you give me an example. I simply do not see it in the > > > code, mostly because I am not familiar with it. > > > > I'm neither. > > > > We stumbled over that by pure observation and after discussing the problem > > with Jerome came up with this solution. > > > > No idea where exactly that case comes from, but I can confirm that it indeed > > happens. > > Thiking about it some more, I can imagine that a notifier callback which > performs an allocation might trigger a memory reclaim and that in turn > might trigger a notifier to be invoked and recurse. But notifier > shouldn't really allocate memory. They are called from deep MM code > paths and this would be extremely deadlock prone. Maybe Jerome can come > up some more realistic scenario. If not then I would propose to simplify > the locking here. We have lockdep to catch self deadlocks and it is > always better to handle a specific issue rather than having a code > without a clear indication how it can recurse. Multiple concurrent mmu notifier, for overlapping range or not, is common (each concurrent threads can trigger some). So you might have multiple invalidate_range_start() in flight for same mm and thus might complete in different order (invalidate_range_end()). IIRC this is what this lock was trying to protect against. I can't think of a reason for recursive mmu notifier call right now. I will ponder see if i remember something about it. Cheers, Jérôme