On Thu 18-06-15 17:22:40, Vlastimil Babka wrote:
> On 06/18/2015 04:43 PM, Michal Hocko wrote:
> >On Thu 18-06-15 07:35:53, Eric Dumazet wrote:
> >>On Thu, Jun 18, 2015 at 7:30 AM, Michal Hocko <mhocko@xxxxxxx> wrote:
> >>
> >>>Abusing __GFP_NO_KSWAPD is a wrong way to go IMHO. It is true that the
> >>>_current_ implementation of the allocator has this nasty and very subtle
> >>>side effect but that doesn't mean it should be abused outside of the mm
> >>>proper. Why shouldn't this path wake the kswapd and let it compact
> >>>memory on the background to increase the success rate for the later
> >>>high order allocations?
> >>
> >>I kind of agree.
> >>
> >>If kswapd is a problem (is it ???) we should fix it, instead of adding
> >>yet another flag to some random locations attempting
> >>memory allocations.
> >
> >No, kswapd is not a problem. The problem is ~__GFP_WAIT allocation can
> >access some portion of the memory reserves (see gfp_to_alloc_flags resp.
> >__zone_watermark_ok and ALLOC_HARDER). __GFP_NO_KSWAPD is just a dirty
> >hack to not give that access which was introduced for THP AFAIR.
> >
> >The implicit access to memory reserves for non sleeping allocation has
> >been there for ages and it might be not suitable for this particular
> >path but that doesn't mean another gfp flag with a different side effect
> >should be hijacked. We should either stop doing that implicit access to
> >memory reserves and give __GFP_RESERVE or add the __GFP_NORESERVE. But
> >that is a problem to be solved in the mm proper. Spreading subtle
> >dependencies outside of mm will just make situation worse.
>
> So you are not proposing to use these __GFP_RESERVE/NORESERVE flag outside
> of mm, right? (besides, we distinguish several kinds of reserves, so what
> exactly would the flag do?)

That is to be discussed. Most allocations already express their interest
in memory reserves by __GFP_HIGH directly or by GFP_ATOMIC indirectly.
So maybe we do not need any additional flag here.
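
For illustration only, the side effect described in the quoted text can
be sketched as a small user-space model. This is not the kernel code:
the MODEL_* names and bit values are made up for this example, and the
real gfp_to_alloc_flags handles more cases than shown here. The sketch
only assumes what is stated above, namely that an allocation without
__GFP_WAIT implicitly gets ALLOC_HARDER unless __GFP_NO_KSWAPD is set.

```c
#include <stdbool.h>

/* Hypothetical bit values for this model only; NOT the kernel's definitions. */
#define MODEL_GFP_WAIT       0x01u  /* caller is allowed to sleep */
#define MODEL_GFP_HIGH       0x02u  /* caller explicitly asks for reserves */
#define MODEL_GFP_NO_KSWAPD  0x04u  /* caller does not want kswapd woken */

#define MODEL_ALLOC_HIGH     0x10u  /* explicit access to reserves */
#define MODEL_ALLOC_HARDER   0x20u  /* implicit, deeper access to reserves */

/*
 * Simplified model of the mapping from gfp flags to allocator flags.
 * Assumption: a request counts as "atomic" when neither the WAIT nor
 * the NO_KSWAPD bit is set, which is why setting NO_KSWAPD withholds
 * the implicit reserve access as a side effect.
 */
unsigned int model_gfp_to_alloc_flags(unsigned int gfp_mask)
{
    unsigned int alloc_flags = 0;
    bool atomic = !(gfp_mask & (MODEL_GFP_WAIT | MODEL_GFP_NO_KSWAPD));

    /* Explicit request for reserves, independent of sleeping. */
    if (gfp_mask & MODEL_GFP_HIGH)
        alloc_flags |= MODEL_ALLOC_HIGH;

    /* Implicit access granted just because the caller cannot sleep. */
    if (atomic)
        alloc_flags |= MODEL_ALLOC_HARDER;

    return alloc_flags;
}
```

In this model an empty mask yields MODEL_ALLOC_HARDER, while adding
either MODEL_GFP_WAIT or MODEL_GFP_NO_KSWAPD clears it. The second case
is the subtle dependency being objected to.
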
There are not that many ~__GFP_WAIT allocations and most of them seem to
require it _only_ because the context doesn't allow for sleeping (e.g.
to prevent deadlocks).

> As that would be also subtle dependency. The
> general problem I think is that we should want the mm users to specify
> higher-level intentions (such as GFP_KERNEL) which would map to specific
> directions (__GFP_*) for the allocator, and currently it's rather a mess of
> both kinds of flags.

I agree. So I think that maybe we should drop that implicit access to
memory reserves for ~__GFP_WAIT allocations and let it do what it is
documented to do.

> Clearly the intention here is "opportunistic allocation that should
> not reclaim/compact, use reserves, wake up kswapd (?) because it's
> better to fall back to smaller pages than wait") and we don't seem to
> have a GFP_OPPORTUNISTIC flag for that. The allocation has to then
> mask out __GFP_WAIT which however looks like an atomic allocation to
> the allocator and give access to reserves, etc...

I think simply dropping __GFP_WAIT is a good way to express that. The
fact that the current implementation gives access to memory reserves
implicitly is just a detail and the user of the allocator shouldn't
care about that.
-- 
Michal Hocko
SUSE Labs