On 06/04/2018 08:27 AM, Michal Hocko wrote:
> On Fri 01-06-18 15:05:26, Qing Huang wrote:
>>
>>
>> On 6/1/2018 12:31 AM, Michal Hocko wrote:
>>> On Thu 31-05-18 19:04:46, Qing Huang wrote:
>>>>
>>>> On 5/31/2018 2:10 AM, Michal Hocko wrote:
>>>>> On Thu 31-05-18 10:55:32, Michal Hocko wrote:
>>>>>> On Thu 31-05-18 04:35:31, Eric Dumazet wrote:
>>>>> [...]
>>>>>>> I merely copied/pasted from alloc_skb_with_frags() :/
>>>>>> I will have a look at it. Thanks!
>>>>> OK, so this is an example of an incremental development ;).
>>>>>
>>>>> __GFP_NORETRY was added by ed98df3361f0 ("net: use __GFP_NORETRY for
>>>>> high order allocations") to prevent from OOM killer. Yet this was
>>>>> not enough because fb05e7a89f50 ("net: don't wait for order-3 page
>>>>> allocation") didn't want an excessive reclaim for non-costly orders
>>>>> so it made it completely NOWAIT while it preserved __GFP_NORETRY in
>>>>> place which is now redundant. Should I send a patch?
>>>>>
>>>> Just curious, how about GFP_ATOMIC flag? Would it work in a similar
>>>> fashion? We experimented with it a bit in the past but it seemed to
>>>> cause other issue in our tests. :-)
>>> GFP_ATOMIC is a non-sleeping (aka no reclaim) context with an access to
>>> memory reserves. So the risk is that you deplete those reserves and
>>> cause issues to other subsystems which need them as well.
>>>
>>>> By the way, we didn't encounter any OOM killer events. It seemed that
>>>> the mlx4_alloc_icm() triggered slowpath.
>>>> We still had about 2GB free memory while it was highly fragmented.
>>> The compaction was able to make a reasonable forward progress for you.
>>> But considering mlx4_alloc_icm is called with GFP_KERNEL resp. GFP_HIGHUSER
>>> then the OOM killer is clearly possible as long as the order is lower
>>> than 4.
>>
>> The allocation was 256KB so the order was much higher than 4. The
>> compaction seemed to be the root cause for our problem. It took too
>> long to finish its work while putting mlx4_alloc_icm to sleep in a
>> heavily fragmented memory situation. Will NORETRY flag avoid the
>> compaction ops and fail the 256KB allocation immediately so
>> mlx4_alloc_icm can enter adjustable lower order allocation code path
>> quickly?
>
> Costly orders should only perform a light compaction attempt unless
> __GFP_RETRY_MAY_FAIL is used IIRC. CCing Vlastimil. So __GFP_NORETRY
> shouldn't make any difference.

It's a bit more complicated. Costly allocations will try the light
compaction attempt first, even before reclaim. This is followed by
reclaim and a more costly compaction attempt. With __GFP_NORETRY, the
second compaction attempt is also only the light one, so the flag does
make a difference here.
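
To make the sequence above concrete, here is a toy userspace model of it.
This is only a sketch of what I described, not kernel code: the enum, both
function names and the 4KB page size used in the usage note are made up
for illustration, and the real slowpath has more exits and retries than
this shows.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical labels for the three phases; not kernel identifiers. */
enum step {
	STEP_LIGHT_COMPACTION,
	STEP_RECLAIM,
	STEP_COSTLY_COMPACTION,
};

/* Smallest order such that (page_size << order) covers 'bytes'. */
unsigned int order_for_size(size_t bytes, size_t page_size)
{
	unsigned int order = 0;

	assert(page_size > 0);
	while ((page_size << order) < bytes)
		order++;
	return order;
}

/*
 * Fill 'out' with the phases a costly-order allocation goes through in
 * this model and return how many there are. The only thing 'noretry'
 * changes is the strength of the second compaction attempt.
 */
size_t costly_slowpath_steps(bool noretry, enum step out[3])
{
	out[0] = STEP_LIGHT_COMPACTION;	/* tried first, before reclaim */
	out[1] = STEP_RECLAIM;
	out[2] = noretry ? STEP_LIGHT_COMPACTION : STEP_COSTLY_COMPACTION;
	return 3;
}
```

Assuming 4KB pages, order_for_size(256 * 1024, 4096) gives order 6 for the
256KB allocation discussed above, which is a costly order. For that case
the model differs between the two flag settings only in the last step.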