On Sat 17-12-16 19:44:22, Tetsuo Handa wrote:
> On 2016/12/15 23:07, Michal Hocko wrote:
> > GFP_NOFS context is used for the following 5 reasons currently
> > 	- to prevent from deadlocks when the lock held by the allocation
> > 	  context would be needed during the memory reclaim
> > 	- to prevent from stack overflows during the reclaim because
> > 	  the allocation is performed from a deep context already
> > 	- to prevent lockups when the allocation context depends on
> > 	  other reclaimers to make a forward progress indirectly
> > 	- just in case because this would be safe from the fs POV
> > 	- silence lockdep false positives
> >
> > Unfortunately overuse of this allocation context brings some problems
> > to the MM. Memory reclaim is much weaker (especially during heavy FS
> > metadata workloads), OOM killer cannot be invoked because the MM layer
> > doesn't have enough information about how much memory is freeable by the
> > FS layer.
>
> This series is intended for simply applying "& ~__GFP_FS" mask to allocations
> which are using GFP_KERNEL by error for the current thread, isn't it?

Not really. I've tried to cover that in the changelogs, but in short I
would like to achieve a state where this API covers all the
recursion-dangerous places, with documentation of why, and most/all of
the specific allocations do not care about NOFS at all. They will simply
inherit the NOFS scope when necessary.

> > In many cases it is far from clear why the weaker context is even used
> > and so it might be used unnecessarily. We would like to get rid of
> > those as much as possible. One way to do that is to use the flag in
> > scopes rather than isolated cases. Such a scope is declared when really
> > necessary, tracked per task and all the allocation requests from within
> > the context will simply inherit the GFP_NOFS semantic.
> >
> > Not only this is easier to understand and maintain because there are
> > much less problematic contexts than specific allocation requests, this
> > also helps code paths where FS layer interacts with other layers (e.g.
> > crypto, security modules, MM etc...) and there is no easy way to convey
> > the allocation context between the layers.
>
> I haven't heard an answer to "a terrible thing" in
> http://lkml.kernel.org/r/20160427200530.GB22544@xxxxxxxxxxxxxx .
>
> What is your plan for checking whether we need to propagate "& ~__GFP_FS"
> mask to other threads which current thread waits synchronously (e.g.
> wait_for_completion()) from "& ~__GFP_FS" context?

This needs a deeper inspection. First of all we have to find out whether
we have a _relevant_ code which depends on kworkers (without
WQ_MEM_RECLAIM) from the NOFS context. This is not covered in this patch
series, though. I plan to get to it later after we actually finish this
step.

--
Michal Hocko
SUSE Labs
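
For illustration, the scope idea discussed above (declare the scope once,
track it per task, let every allocation inside it inherit NOFS) can be
modelled in a few lines of userspace C. This is only a sketch under stated
assumptions: struct task_model, nofs_scope_enter(), nofs_scope_exit(),
effective_gfp() and the MODEL_* constants are hypothetical stand-ins, not
the interface added by the series, and the flag values are arbitrary.

```c
#include <assert.h>

/* Hypothetical stand-ins; the values are illustrative, not the kernel's. */
typedef unsigned int gfp_t;
#define MODEL_GFP_IO     0x1u
#define MODEL_GFP_FS     0x2u
#define MODEL_GFP_KERNEL (MODEL_GFP_IO | MODEL_GFP_FS)
#define MODEL_PF_NOFS    0x1u

/* Models the per-task state that tracks whether a NOFS scope is active. */
struct task_model {
	unsigned int flags;
};

/* Enter a NOFS scope; return the previous state so that scopes can nest. */
static unsigned int nofs_scope_enter(struct task_model *t)
{
	unsigned int old = t->flags & MODEL_PF_NOFS;

	t->flags |= MODEL_PF_NOFS;
	return old;
}

/* Leave a NOFS scope by restoring the state saved on entry. */
static void nofs_scope_exit(struct task_model *t, unsigned int old)
{
	assert((old & ~MODEL_PF_NOFS) == 0);
	t->flags = (t->flags & ~MODEL_PF_NOFS) | old;
}

/*
 * What an allocator would do with the caller's mask: inside a scope the
 * FS bit is dropped, so the caller can keep passing MODEL_GFP_KERNEL.
 */
static gfp_t effective_gfp(const struct task_model *t, gfp_t gfp)
{
	if (t->flags & MODEL_PF_NOFS)
		gfp &= ~MODEL_GFP_FS;
	return gfp;
}
```

Returning the old state from the enter helper, rather than clearing the
flag unconditionally on exit, is what keeps an inner scope from ending an
outer one early. The model covers only the current task; it says nothing
about the kworker case raised above, where another thread does the
allocation on behalf of the NOFS context.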