Re: [PATCH 2/5] Add per cgroup reclaim watermarks.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jan 13, 2011 at 4:11 PM, KAMEZAWA Hiroyuki
<kamezawa.hiroyu@xxxxxxxxxxxxxx> wrote:
> On Thu, 13 Jan 2011 14:00:32 -0800
> Ying Han <yinghan@xxxxxxxxxx> wrote:
>
>> The per cgroup kswapd is invoked when the cgroup's free memory (limit - usage)
>> is less than a threshold--low_wmark. Then the kswapd thread starts to reclaim
>> pages in a priority loop similar to global algorithm. The kswapd is done if the
>> free memory is above a threshold--high_wmark.
>>
>> The per cgroup background reclaim is based on the per cgroup LRU and also adds
>> per cgroup watermarks. There are two watermarks including "low_wmark" and
>> "high_wmark", and they are calculated based on the limit_in_bytes(hard_limit)
>> for each cgroup. Each time the hard_limit is changed, the corresponding wmarks
>> are re-calculated. Since memory controller charges only user pages, there is
>> no need for a "min_wmark". The current calculation of wmarks is a function of
>> "memory.min_free_kbytes" which could be adjusted by writing different values
>> into the new api. This is added mainly for debugging purpose.
>>
>> Change log v2...v1:
>> 1. Remove the res_counter_charge on wmark due to performance concern.
>> 2. Move the new APIs min_free_kbytes, reclaim_wmarks into seperate commit.
>> 3. Calculate the min_free_kbytes automatically based on the limit_in_bytes.
>> 4. make the wmark to be consistant with core VM which checks the free pages
>> instead of usage.
>> 5. changed wmark to be boolean
>>
>> Signed-off-by: Ying Han <yinghan@xxxxxxxxxx>
>

Thank you  KAMEZAWA for your comments.

> Hmm, I don't think using the same algorithm as min_free_kbytes is good.
>
> Why it's bad to have 2 interfaces as low_wmark and high_wmark ?


>
> And in this patch, min_free_kbytes can be [256...65536]...I think this
> '256' is not good because it should be able to be set to '0'.
>
> IIUC, in enterprise systems, there are users who want to keep a fixed amount
> of free memory always. This interface will not allow such use case.

>
> I think we should have 2 interfaces as low_wmark and high_wmark. But as default
> value, the same value as to the alogorithm with min_free_kbytes will make sense.

I agree that "min_free_kbytes" concept doesn't apply well since there
is no notion of "reserved pool" in memcg. I borrowed it at the
beginning is to add a tunable to the per-memcg watermarks besides the
hard_limit. I read the
patch posted from Satoru Moriya "Tunable watermarks", and introducing
the per-memcg-per-watermark tunable
sounds good to me. Might consider adding it to the next post.

>
> BTW, please divide res_counter part and memcg part in the next post.

Will do.

>
> Please explain your handling of 'hierarchy' in description.
I haven't thought through the 'hierarchy' handling in this patchset
which I will probably put more thoughts in the following
posts. Do you have recommendations on handing the 'hierarchy' ?

--Ying

>
> Thanks,
> -Kame
>
>
>> ---
>>  include/linux/memcontrol.h  |    1 +
>>  include/linux/res_counter.h |   83 +++++++++++++++++++++++++++++++++++++++++++
>>  kernel/res_counter.c        |    6 +++
>>  mm/memcontrol.c             |   73 +++++++++++++++++++++++++++++++++++++
>>  4 files changed, 163 insertions(+), 0 deletions(-)
>>
>> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
>> index 3433784..80a605f 100644
>> --- a/include/linux/memcontrol.h
>> +++ b/include/linux/memcontrol.h
>> @@ -93,6 +93,7 @@ int task_in_mem_cgroup(struct task_struct *task, const struct mem_cgroup *mem);
>>
>>  extern struct mem_cgroup *try_get_mem_cgroup_from_page(struct page *page);
>>  extern struct mem_cgroup *mem_cgroup_from_task(struct task_struct *p);
>> +extern int mem_cgroup_watermark_ok(struct mem_cgroup *mem, int charge_flags);
>>
>>  static inline
>>  int mm_match_cgroup(const struct mm_struct *mm, const struct mem_cgroup *cgroup)
>> diff --git a/include/linux/res_counter.h b/include/linux/res_counter.h
>> index fcb9884..10b7e59 100644
>> --- a/include/linux/res_counter.h
>> +++ b/include/linux/res_counter.h
>> @@ -39,6 +39,15 @@ struct res_counter {
>>        */
>>       unsigned long long soft_limit;
>>       /*
>> +      * the limit that reclaim triggers. it is the free count
>> +      * (limit - usage)
>> +      */
>> +     unsigned long long low_wmark_limit;
>> +     /*
>> +      * the limit that reclaim stops. it is the free count
>> +      */
>> +     unsigned long long high_wmark_limit;
>> +     /*
>>        * the number of unsuccessful attempts to consume the resource
>>        */
>>       unsigned long long failcnt;
>> @@ -55,6 +64,9 @@ struct res_counter {
>>
>>  #define RESOURCE_MAX (unsigned long long)LLONG_MAX
>>
>> +#define CHARGE_WMARK_LOW     0x02
>> +#define CHARGE_WMARK_HIGH    0x04
>> +
>>  /**
>>   * Helpers to interact with userspace
>>   * res_counter_read_u64() - returns the value of the specified member


[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]