On Wed, 13 Jul 2016, Michal Hocko wrote: > On Wed 13-07-16 10:18:35, Mikulas Patocka wrote: > > > > > > On Wed, 13 Jul 2016, Michal Hocko wrote: > > > > > [CC David] > > > > > > > > It is caused by the commit f9054c70d28bc214b2857cf8db8269f4f45a5e23. > > > > > Prior to this commit, mempool allocations set __GFP_NOMEMALLOC, so > > > > > they never exhausted reserved memory. With this commit, mempool > > > > > allocations drop __GFP_NOMEMALLOC, so they can dig deeper (if the > > > > > process has PF_MEMALLOC, they can bypass all limits). > > > > > > > > I wonder whether commit f9054c70d28bc214 ("mm, mempool: only set > > > > __GFP_NOMEMALLOC if there are free elements") is doing correct thing. > > > > It says > > > > > > > > If an oom killed thread calls mempool_alloc(), it is possible that > > > > it'll > > > > loop forever if there are no elements on the freelist since > > > > __GFP_NOMEMALLOC prevents it from accessing needed memory reserves in > > > > oom conditions. > > > > > > I haven't studied the patch very deeply so I might be missing something > > > but from a quick look the patch does exactly what the above says. > > > > > > mempool_alloc used to inhibit ALLOC_NO_WATERMARKS by default. David has > > > only changed that to allow ALLOC_NO_WATERMARKS if there are no objects > > > in the pool and so we have no fallback for the default __GFP_NORETRY > > > request. > > > > The swapper core sets the flag PF_MEMALLOC and calls generic_make_request > > to submit the swapping bio to the block driver. The device mapper driver > > uses mempools for all its I/O processing. > > OK, this is the part I have missed. I didn't realize that the swapout > path, which is indeed PF_MEMALLOC, can get down to blk code which uses > mempools. A quick code travers shows that at least > make_request_fn = blk_queue_bio > blk_queue_bio > get_request > __get_request > > might do that. And in that case I agree that the above mentioned patch > has unintentional side effects and should be re-evaluated. David, what > do you think? An obvious fixup would be considering TIF_MEMDIE in > mempool_alloc explicitly. What are the real problems that f9054c70d28bc214b2857cf8db8269f4f45a5e23 tries to fix? Do you have a stacktrace where it deadlocked, or was just a theoretical consideration? Mempool users generally (except for some flawed cases like fs_bio_set) do not require memory to proceed. So if you just loop in mempool_alloc, the processes that exhasted the mempool reserve will eventually return objects to the mempool and you should proceed. If you can't proceed, it is a bug in the code that uses the mempool. Mikulas -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>