On Wed 13-07-16 16:53:28, David Rientjes wrote: > On Wed, 13 Jul 2016, Mikulas Patocka wrote: > > > What are the real problems that f9054c70d28bc214b2857cf8db8269f4f45a5e23 > > tries to fix? > > > > It prevents the whole system from livelocking due to an oom killed process > stalling forever waiting for mempool_alloc() to return. No other threads > may be oom killed while waiting for it to exit. But it is true that the patch has unintended side effect for any mempool allocation from the reclaim path (aka PF_MEMALLOC context). So do you think we should rework your additional patch to be explicit about TIF_MEMDIE? Something like the following (not even compile tested for illustration). Tetsuo has properly pointed out that this doesn't work for multithreaded processes reliable but put that aside for now as that needs a fix on a different layer. I believe we can fix that quite easily after recent/planned changes. --- diff --git a/mm/mempool.c b/mm/mempool.c index 8f65464da5de..ea26d75c8adf 100644 --- a/mm/mempool.c +++ b/mm/mempool.c @@ -322,20 +322,20 @@ void *mempool_alloc(mempool_t *pool, gfp_t gfp_mask) might_sleep_if(gfp_mask & __GFP_DIRECT_RECLAIM); + gfp_mask |= __GFP_NOMEMALLOC; /* don't allocate emergency reserves */ gfp_mask |= __GFP_NORETRY; /* don't loop in __alloc_pages */ gfp_mask |= __GFP_NOWARN; /* failures are OK */ gfp_temp = gfp_mask & ~(__GFP_DIRECT_RECLAIM|__GFP_IO); repeat_alloc: - if (likely(pool->curr_nr)) { - /* - * Don't allocate from emergency reserves if there are - * elements available. This check is racy, but it will - * be rechecked each loop. - */ - gfp_temp |= __GFP_NOMEMALLOC; - } + /* + * Make sure that the OOM victim will get access to memory reserves + * properly if there are no objects in the pool to prevent from + * livelocks. + */ + if (!likely(pool->curr_nr) && test_thread_flag(TIF_MEMDIE)) + gfp_temp &= ~__GFP_NOMEMALLOC; element = pool->alloc(gfp_temp, pool->pool_data); if (likely(element != NULL)) @@ -359,7 +359,7 @@ void *mempool_alloc(mempool_t *pool, gfp_t gfp_mask) * We use gfp mask w/o direct reclaim or IO for the first round. If * alloc failed with that and @pool was empty, retry immediately. */ - if ((gfp_temp & ~__GFP_NOMEMALLOC) != gfp_mask) { + if ((gfp_temp & __GFP_DIRECT_RECLAIM) != (gfp_mask & __GFP_DIRECT_RECLAIM)) { spin_unlock_irqrestore(&pool->lock, flags); gfp_temp = gfp_mask; goto repeat_alloc; -- Michal Hocko SUSE Labs -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>