Re: [PATCH 1/4] Add kswapd descriptor.

On Tue, Nov 30, 2010 at 12:15 AM, Minchan Kim <minchan.kim@xxxxxxxxx> wrote:
> On Tue, Nov 30, 2010 at 4:08 PM, KAMEZAWA Hiroyuki
> <kamezawa.hiroyu@xxxxxxxxxxxxxx> wrote:
>> On Mon, 29 Nov 2010 22:49:42 -0800
>> Ying Han <yinghan@xxxxxxxxxx> wrote:
>>
>>> There is one kswapd kernel thread for each memory node. We add a separate
>>> kswapd for each cgroup. Each kswapd sleeps in the wait queue headed at the
>>> kswapd_wait field of a kswapd descriptor. The descriptor stores the node or
>>> cgroup information and allows the global and per-cgroup background reclaim
>>> to share common reclaim algorithms.
>>>
>>> This patch adds the kswapd descriptor and changes the per-zone kswapd_wait
>>> to the common data structure.
>>>
>>> Signed-off-by: Ying Han <yinghan@xxxxxxxxxx>
>>> ---
>>>  include/linux/mmzone.h |    3 +-
>>>  include/linux/swap.h   |   10 +++++
>>>  mm/memcontrol.c        |    2 +
>>>  mm/mmzone.c            |    2 +-
>>>  mm/page_alloc.c        |    9 +++-
>>>  mm/vmscan.c            |   98 +++++++++++++++++++++++++++++++++--------------
>>>  6 files changed, 90 insertions(+), 34 deletions(-)
>>>
>>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
>>> index 39c24eb..c77dfa2 100644
>>> --- a/include/linux/mmzone.h
>>> +++ b/include/linux/mmzone.h
>>> @@ -642,8 +642,7 @@ typedef struct pglist_data {
>>>       unsigned long node_spanned_pages; /* total size of physical page
>>>                                            range, including holes */
>>>       int node_id;
>>> -     wait_queue_head_t kswapd_wait;
>>> -     struct task_struct *kswapd;
>>> +     wait_queue_head_t *kswapd_wait;
>>>       int kswapd_max_order;
>>>  } pg_data_t;
>>>
>>> diff --git a/include/linux/swap.h b/include/linux/swap.h
>>> index eba53e7..2e6cb58 100644
>>> --- a/include/linux/swap.h
>>> +++ b/include/linux/swap.h
>>> @@ -26,6 +26,16 @@ static inline int current_is_kswapd(void)
>>>       return current->flags & PF_KSWAPD;
>>>  }
>>>
>>> +struct kswapd {
>>> +     struct task_struct *kswapd_task;
>>> +     wait_queue_head_t kswapd_wait;
>>> +     struct mem_cgroup *kswapd_mem;
>>> +     pg_data_t *kswapd_pgdat;
>>> +};
>>> +
>>> +#define MAX_KSWAPDS MAX_NUMNODES
>>> +extern struct kswapd kswapds[MAX_KSWAPDS];
>>> +int kswapd(void *p);
>>
>> Why is this required? Can't we allocate this at boot (if necessary)?
>> Why is the existing kswapd also controlled under this structure?
>> At first look, this just seems to increase the size of the changes...
>>
>> IMHO, implementing background-reclaim-for-memcg is cleaner than reusing kswapd..
>> kswapd has tons of unnecessary checks.
>
> Ideally, I hope we can unify the global and memcg kswapd for easier
> maintenance, if it's not a big problem.

I intentionally avoided doing so in this patchset, since the algorithm and
reclaim target differ between the global and per-memcg kswapd. I would
prefer that the new changes not affect the existing logic.

> When we make patches about lru pages, we always have to consider what
> I should do for memcg.
> And when we review patches, we also should consider what the patch is
> missing for memcg.
The per-memcg LRU already exists and needs to be handled differently from
the global one. This patchset doesn't change that part but builds on it. I
don't see how merging the kswapds would help maintenance in that sense. All
subsequent changes to the per-memcg LRU should automatically take effect
for the per-memcg kswapd later on.

> It makes the maintenance cost high. Of course, if the memcg maintainers
> are involved with all patches, it's no problem as it is.

>
> If it is impossible due to the current kswapd's spaghetti, we can clean it
> up first. I am not sure whether my suggestion makes sense or not.
> Kame knows it much better than me. But please consider such a voice.

The global kswapd works on a node and the zones of that node. Its target is
to bring all the zones above their high watermarks unless a zone is
"unreclaimable". The logic is different for the per-memcg kswapd, which
scans all the nodes and zones in the system and tries to bring the memcg
above its watermark threshold. Lots of heuristics are not shared at this
moment, and I am not sure it is a good idea to merge them.

--Ying
>
>>
>> Regards,
>> -Kame
>>
>
>
>
> --
> Kind regards,
> Minchan Kim
>
