On 06/18/2015 04:43 PM, Michal Hocko wrote:
On Thu 18-06-15 07:35:53, Eric Dumazet wrote:
On Thu, Jun 18, 2015 at 7:30 AM, Michal Hocko <mhocko@xxxxxxx> wrote:
Abusing __GFP_NO_KSWAPD is the wrong way to go IMHO. It is true that the
_current_ implementation of the allocator has this nasty and very subtle
side effect, but that doesn't mean it should be abused outside of the mm
proper. Why shouldn't this path wake kswapd and let it compact
memory in the background to increase the success rate of later
high-order allocations?
I kind of agree.
If kswapd is a problem (is it ???) we should fix it, instead of adding
yet another flag to some random locations attempting
memory allocations.
No, kswapd is not a problem. The problem is that a ~__GFP_WAIT allocation
can access some portion of the memory reserves (see gfp_to_alloc_flags resp.
__zone_watermark_ok and ALLOC_HARDER). __GFP_NO_KSWAPD is just a dirty
hack, introduced for THP AFAIR, to withhold that access.
The implicit access to memory reserves for non-sleeping allocations has
been there for ages and it might not be suitable for this particular
path, but that doesn't mean another gfp flag with a different side effect
should be hijacked. We should either stop doing that implicit access to
memory reserves and introduce __GFP_RESERVE, or add __GFP_NORESERVE. But
that is a problem to be solved in the mm proper. Spreading subtle
dependencies outside of mm will just make the situation worse.
So you are not proposing to use these __GFP_RESERVE/NORESERVE flags
outside of mm, right? (Besides, we distinguish several kinds of
reserves, so what exactly would the flag do?) That would also be a
subtle dependency. The general problem, I think, is that we should want
mm users to specify higher-level intentions (such as GFP_KERNEL)
which would map to specific directions (__GFP_*) for the allocator, and
currently it's rather a mess of both kinds of flags. Clearly the
intention here is "opportunistic allocation that should not
reclaim/compact, use reserves, or wake up kswapd (?), because it's better
to fall back to smaller pages than to wait", and we don't seem to have a
GFP_OPPORTUNISTIC flag for that. The allocation then has to mask out
__GFP_WAIT, which however makes it look like an atomic allocation to the
allocator and gives it access to reserves, etc...
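
To make the side effect concrete, here is a rough user-space model of the
check being discussed. It is a sketch, not the kernel source: the MODEL_*
names and bit values are invented for illustration, and it ignores the
other inputs (__GFP_HIGH, rt tasks, cpusets) that the real
gfp_to_alloc_flags() considers. It only assumes that an allocation is
treated as atomic when neither __GFP_WAIT nor __GFP_NO_KSWAPD is set, and
that atomic allocations get ALLOC_HARDER unless __GFP_NOMEMALLOC is given.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only; NOT the kernel's real definitions. */
#define MODEL_GFP_WAIT        0x01u  /* caller may sleep and reclaim */
#define MODEL_GFP_NO_KSWAPD   0x02u  /* do not wake kswapd */
#define MODEL_GFP_NOMEMALLOC  0x04u  /* never dip into reserves */

#define MODEL_ALLOC_HARDER    0x10u  /* may go below the min watermark */

/* Simplified model of the "atomic" test in gfp_to_alloc_flags(). */
static unsigned int model_gfp_to_alloc_flags(unsigned int gfp_mask)
{
	unsigned int alloc_flags = 0;
	/* Assumption: "atomic" means neither WAIT nor NO_KSWAPD is set. */
	bool atomic = !(gfp_mask & (MODEL_GFP_WAIT | MODEL_GFP_NO_KSWAPD));

	if (atomic && !(gfp_mask & MODEL_GFP_NOMEMALLOC))
		alloc_flags |= MODEL_ALLOC_HARDER;

	return alloc_flags;
}

/* Hypothetical helper: does this mask get implicit reserve access? */
static bool model_uses_reserves(unsigned int gfp_mask)
{
	return (model_gfp_to_alloc_flags(gfp_mask) & MODEL_ALLOC_HARDER) != 0;
}

/* Walks through the opportunistic high-order case from the thread. */
static void model_selfcheck(void)
{
	unsigned int sleeping = MODEL_GFP_WAIT;              /* GFP_KERNEL-like */
	unsigned int opportunistic = sleeping & ~MODEL_GFP_WAIT;

	/* A sleeping allocation gets no implicit reserve access. */
	assert(!model_uses_reserves(sleeping));
	/* Masking out WAIT makes it look atomic, so reserves open up. */
	assert(model_uses_reserves(opportunistic));
	/* The NO_KSWAPD side effect closes them again. */
	assert(!model_uses_reserves(opportunistic | MODEL_GFP_NO_KSWAPD));
}
```

In this model the only way for a non-sleeping caller to stay out of the
reserves is to set a flag whose name says something else entirely, which
is the subtle dependency being objected to above.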