On Tue, Nov 30, 2010 at 12:15 AM, Minchan Kim <minchan.kim@xxxxxxxxx> wrote:
> On Tue, Nov 30, 2010 at 4:08 PM, KAMEZAWA Hiroyuki
> <kamezawa.hiroyu@xxxxxxxxxxxxxx> wrote:
>> On Mon, 29 Nov 2010 22:49:42 -0800
>> Ying Han <yinghan@xxxxxxxxxx> wrote:
>>
>>> There is a kswapd kernel thread for each memory node. We add a different kswapd
>>> for each cgroup. The kswapd is sleeping in the wait queue headed at kswapd_wait
>>> field of a kswapd descriptor. The kswapd descriptor stores information of node
>>> or cgroup and it allows the global and per cgroup background reclaim to share
>>> common reclaim algorithms.
>>>
>>> This patch addes the kswapd descriptor and changes per zone kswapd_wait to the
>>> common data structure.
>>>
>>> Signed-off-by: Ying Han <yinghan@xxxxxxxxxx>
>>> ---
>>>  include/linux/mmzone.h |    3 +-
>>>  include/linux/swap.h   |   10 +++++
>>>  mm/memcontrol.c        |    2 +
>>>  mm/mmzone.c            |    2 +-
>>>  mm/page_alloc.c        |    9 +++-
>>>  mm/vmscan.c            |   98 +++++++++++++++++++++++++++++++++--------------
>>>  6 files changed, 90 insertions(+), 34 deletions(-)
>>>
>>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
>>> index 39c24eb..c77dfa2 100644
>>> --- a/include/linux/mmzone.h
>>> +++ b/include/linux/mmzone.h
>>> @@ -642,8 +642,7 @@ typedef struct pglist_data {
>>>  	unsigned long node_spanned_pages; /* total size of physical page
>>>  					     range, including holes */
>>>  	int node_id;
>>> -	wait_queue_head_t kswapd_wait;
>>> -	struct task_struct *kswapd;
>>> +	wait_queue_head_t *kswapd_wait;
>>>  	int kswapd_max_order;
>>>  } pg_data_t;
>>>
>>> diff --git a/include/linux/swap.h b/include/linux/swap.h
>>> index eba53e7..2e6cb58 100644
>>> --- a/include/linux/swap.h
>>> +++ b/include/linux/swap.h
>>> @@ -26,6 +26,16 @@ static inline int current_is_kswapd(void)
>>>  	return current->flags & PF_KSWAPD;
>>>  }
>>>
>>> +struct kswapd {
>>> +	struct task_struct *kswapd_task;
>>> +	wait_queue_head_t kswapd_wait;
>>> +	struct mem_cgroup *kswapd_mem;
>>> +	pg_data_t *kswapd_pgdat;
>>> +};
>>> +
>>> +#define MAX_KSWAPDS MAX_NUMNODES
>>> +extern struct kswapd kswapds[MAX_KSWAPDS];
>>> +int kswapd(void *p);
>>
>> Why this is required ? Can't we allocate this at boot (if necessary) ?
>> Why exsiting kswapd is also controlled under this structure ?
>> At the 1st look, this just seem to increase the size of changes....
>>
>> IMHO, implementing background-reclaim-for-memcg is cleaner than reusing kswapd..
>> kswapd has tons of unnecessary checks.
>
> Ideally, I hope we unify global and memcg of kswapd for easy
> maintainance if it's not a big problem.

I intentionally did not do so in this patchset, since the algorithm and
the reclaim target are different for the global and per-memcg kswapd. I
would prefer that the new changes not affect the existing logic.

> When we make patches about lru pages, we always have to consider what
> I should do for memcg.
> And when we review patches, we also should consider what the patch is
> missing for memcg.

The per-memcg LRU is there, and it needs to be considered differently
from the global one. This patchset doesn't change that part but builds
on it. I don't see how merging the kswapds would help maintenance in
that sense. Any later changes to the per-memcg LRU should automatically
take effect in the per-memcg kswapd.

> It makes maintainance cost big. Of course, if memcg maintainers is
> involved with all patches, it's no problem as it is.
>
> If it is impossible due to current kswapd's spaghetti, we can clean up
> it first. I am not sure whether my suggestion make sense or not.
> Kame can know it much rather than me. But please consider such the voice.

The global kswapd works on a node and the zones on that node. Its target
is to bring all the zones above their high wmarks unless the zones are
"unreclaimable". The logic is different for the per-memcg kswapd, which
scans all the nodes and zones on the system and tries to bring the
per-memcg wmark above the threshold.
Lots of heuristics are not shared at this moment, and I am not sure
whether it is a good idea to merge them.

--Ying

>
>>
>> Regards,
>> -Kame
>>
>
>
>
> --
> Kind regards,
> Minchan Kim
>
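
To make the difference between the two reclaim targets discussed above a bit more concrete, here is a rough userspace sketch, not kernel code and not taken from the patch. Only the field names kswapd_mem and kswapd_pgdat come from the quoted struct kswapd; every *_sketch type, the fixed zone count, and the "headroom" model of the per-memcg wmark are assumptions made purely for illustration. It only models the "is this target balanced?" check, not the scanning itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_NR_ZONES 3	/* hypothetical: zones per node in this sketch */

/* Hypothetical stand-in for a zone: only what the check needs. */
struct zone_sketch {
	unsigned long free_pages;
	unsigned long high_wmark;
	bool unreclaimable;
};

/* Hypothetical stand-in for pg_data_t. */
struct node_sketch {
	struct zone_sketch zones[SKETCH_NR_ZONES];
};

/*
 * Hypothetical stand-in for struct mem_cgroup. Assumption: the
 * per-memcg wmark is modeled as required headroom below the limit.
 */
struct memcg_sketch {
	unsigned long headroom;	/* limit minus current usage */
	unsigned long wmark;	/* headroom that background reclaim aims for */
};

/* Mirrors the two target fields of the quoted descriptor; exactly one is set. */
struct kswapd_sketch {
	struct memcg_sketch *kswapd_mem;
	struct node_sketch *kswapd_pgdat;
};

/* Global target: every zone is above its high wmark or unreclaimable. */
static bool node_balanced(const struct node_sketch *node)
{
	for (size_t i = 0; i < SKETCH_NR_ZONES; i++) {
		const struct zone_sketch *z = &node->zones[i];

		if (!z->unreclaimable && z->free_pages < z->high_wmark)
			return false;
	}
	return true;
}

/* Per-memcg target: a single cgroup-wide threshold, not a per-zone one. */
static bool memcg_balanced(const struct memcg_sketch *memcg)
{
	return memcg->headroom >= memcg->wmark;
}

/* One entry point, two different notions of "balanced". */
static bool kswapd_sketch_can_sleep(const struct kswapd_sketch *k)
{
	/* A descriptor describes either a node or a cgroup, never both. */
	assert((k->kswapd_mem != NULL) != (k->kswapd_pgdat != NULL));

	if (k->kswapd_mem)
		return memcg_balanced(k->kswapd_mem);
	return node_balanced(k->kswapd_pgdat);
}
```

Even in this toy form the two branches share nothing beyond the descriptor, which is roughly the concern about merging the heuristics.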