Re: [PATCH] memcg: add interface to specify thresholds of vmpressure

On Sat 22-06-13 16:34:34, Hyunhee Kim wrote:
> Memory pressure is calculated based on scanned/reclaimed ratio.

It is done that way _now_ and there is no guarantee it will stay that way
in the future. There is a reason why the interface is so sparse on details.
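
For reference, the classification in question is the fixed mapping that the
quoted patch removes from mm/vmpressure.c. Below is a minimal userspace sketch
of it, assuming the pressure value is already a percentage. The thresholds 60
and 95 come from the quoted code; the names pressure_level, level_med,
level_critical and classify_pressure are hypothetical and exist only for this
illustration.

```c
#include <assert.h>

/* Levels mirror enum vmpressure_levels in the quoted patch. */
enum pressure_level {
	LEVEL_LOW,
	LEVEL_MEDIUM,
	LEVEL_CRITICAL,
};

/* Fixed thresholds, as in the lines the quoted patch removes. */
static const unsigned int level_med = 60;
static const unsigned int level_critical = 95;

/* Map a pressure percentage to a level, checking the highest level first. */
static enum pressure_level classify_pressure(unsigned long pressure)
{
	/* Assumption of this sketch: pressure is a percentage in 0..100. */
	assert(pressure <= 100);

	if (pressure >= level_critical)
		return LEVEL_CRITICAL;
	if (pressure >= level_med)
		return LEVEL_MEDIUM;
	return LEVEL_LOW;
}
```

Userspace only ever sees the resulting level name, not the percentage or the
inputs it was derived from, which is what keeps the computation replaceable.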

I am sorry to repeat myself but this is a user interface and we will have
to maintain it for _ever_. We cannot export random knobs that work just
now. Future implementations of reclaim might change considerably and
scanned vs. reclaimed might no longer mean the same thing.

So no, again, please do not try to push random things to handle your
current and very specific use case.

Nack to this patch.
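
For context, the existing registration steps described in the quoted
memory.txt text (create an eventfd, open memory.pressure_level, write
"<event_fd> <fd of memory.pressure_level> <level>" to cgroup.event_control)
can be sketched in userspace roughly as follows. This is an illustration under
assumptions, not a reference implementation: cgroup_dir is a hypothetical path
to a memory cgroup directory, and the helper names format_event_control and
register_pressure_event are made up for this example.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/eventfd.h>
#include <unistd.h>

/*
 * Build "<event_fd> <fd of memory.pressure_level> <level>".
 * Returns the string length, or -1 if it does not fit in buf.
 */
static int format_event_control(char *buf, size_t len, int efd, int pfd,
				const char *level)
{
	int n;

	assert(buf && level);
	n = snprintf(buf, len, "%d %d %s", efd, pfd, level);
	return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/*
 * Register for pressure events at the given level ("low", "medium" or
 * "critical"). Returns the eventfd to read notifications from, or -1.
 * The memory.pressure_level fd is handed back through pfd_out so that
 * the caller decides when to close it.
 */
static int register_pressure_event(const char *cgroup_dir, const char *level,
				   int *pfd_out)
{
	char path[4096], line[64];
	int efd, pfd, cfd, n;

	efd = eventfd(0, 0);
	if (efd < 0)
		return -1;

	snprintf(path, sizeof(path), "%s/memory.pressure_level", cgroup_dir);
	pfd = open(path, O_RDONLY);
	if (pfd < 0)
		goto err_efd;

	snprintf(path, sizeof(path), "%s/cgroup.event_control", cgroup_dir);
	cfd = open(path, O_WRONLY);
	if (cfd < 0)
		goto err_pfd;

	n = format_event_control(line, sizeof(line), efd, pfd, level);
	if (n < 0 || write(cfd, line, (size_t)n) != (ssize_t)n) {
		close(cfd);
		goto err_pfd;
	}
	close(cfd);

	*pfd_out = pfd;
	return efd;

err_pfd:
	close(pfd);
err_efd:
	close(efd);
	return -1;
}
```

A caller would then block in read(2) on the returned eventfd, reading an
8-byte counter each time the cgroup reaches the registered level or higher.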

> The higher
> the value, the more number unsuccessful reclaims there were. These thresholds
> can be specified when each event is registered by writing it next to the
> string of level. Default value is 60 for "medium" and 95 for "critical"
> 
> Signed-off-by: Hyunhee Kim <hyunhee.kim@xxxxxxxxxxx>
> Signed-off-by: Kyungmin Park <kyungmin.park@xxxxxxxxxxx>
> ---
>  Documentation/cgroups/memory.txt |    8 +++++-
>  mm/vmpressure.c                  |   54 +++++++++++++++++++++++++++-----------
>  2 files changed, 45 insertions(+), 17 deletions(-)
> 
> diff --git a/Documentation/cgroups/memory.txt b/Documentation/cgroups/memory.txt
> index ddf4f93..bd9cf46 100644
> --- a/Documentation/cgroups/memory.txt
> +++ b/Documentation/cgroups/memory.txt
> @@ -807,13 +807,19 @@ register a notification, an application must:
>  
>  - create an eventfd using eventfd(2);
>  - open memory.pressure_level;
> -- write string like "<event_fd> <fd of memory.pressure_level> <level>"
> +- write string like "<event_fd> <fd of memory.pressure_level> <level> <threshold>"
>    to cgroup.event_control.
>  
>  Application will be notified through eventfd when memory pressure is at
>  the specific level (or higher). Read/write operations to
>  memory.pressure_level are no implemented.
>  
> +We account memory pressure based on scanned/reclaimed ratio. The higher
> +the value, the more number unsuccessful reclaims there were. These thresholds
> +can be specified when each event is registered by writing it next to the
> +string of level. Default value is 60 for "medium" and 95 for "critical".
> +If nothing is input as threshold, default values are used.
> +
>  Test:
>  
>     Here is a small script example that makes a new cgroup, sets up a
> diff --git a/mm/vmpressure.c b/mm/vmpressure.c
> index 736a601..52b266c 100644
> --- a/mm/vmpressure.c
> +++ b/mm/vmpressure.c
> @@ -40,15 +40,6 @@
>  static const unsigned long vmpressure_win = SWAP_CLUSTER_MAX * 16;
>  
>  /*
> - * These thresholds are used when we account memory pressure through
> - * scanned/reclaimed ratio. The current values were chosen empirically. In
> - * essence, they are percents: the higher the value, the more number
> - * unsuccessful reclaims there were.
> - */
> -static const unsigned int vmpressure_level_med = 60;
> -static const unsigned int vmpressure_level_critical = 95;
> -
> -/*
>   * When there are too little pages left to scan, vmpressure() may miss the
>   * critical pressure as number of pages will be less than "window size".
>   * However, in that case the vmscan priority will raise fast as the
> @@ -97,6 +88,19 @@ enum vmpressure_levels {
>  	VMPRESSURE_NUM_LEVELS,
>  };
>  
> +/*
> + * These thresholds are used when we account memory pressure through
> + * scanned/reclaimed ratio. In essence, they are percents: the higher
> + * the value, the more number unsuccessful reclaims there were.
> + * These thresholds can be specified when each event is registered.
> + */
> +
> +static unsigned int vmpressure_threshold_levels[] = {
> +	[VMPRESSURE_LOW] = 0,
> +	[VMPRESSURE_MEDIUM] = 60,
> +	[VMPRESSURE_CRITICAL] = 95,
> +};
> +
>  static const char * const vmpressure_str_levels[] = {
>  	[VMPRESSURE_LOW] = "low",
>  	[VMPRESSURE_MEDIUM] = "medium",
> @@ -105,11 +109,14 @@ static const char * const vmpressure_str_levels[] = {
>  
>  static enum vmpressure_levels vmpressure_level(unsigned long pressure)
>  {
> -	if (pressure >= vmpressure_level_critical)
> -		return VMPRESSURE_CRITICAL;
> -	else if (pressure >= vmpressure_level_med)
> -		return VMPRESSURE_MEDIUM;
> -	return VMPRESSURE_LOW;
> +	int level;
> +
> +	for (level = VMPRESSURE_NUM_LEVELS - 1; level >= 0; level--) {
> +		if (pressure >= vmpressure_threshold_levels[level])
> +			break;
> +	}
> +
> +	return level;
>  }
>  
>  static enum vmpressure_levels vmpressure_calc_level(unsigned long scanned,
> @@ -303,10 +310,21 @@ int vmpressure_register_event(struct cgroup *cg, struct cftype *cft,
>  {
>  	struct vmpressure *vmpr = cg_to_vmpressure(cg);
>  	struct vmpressure_event *ev;
> -	int level;
> +	char *strlevel, *strthres;
> +	int level, thres = -1;
> +
> +	strlevel = args;
> +	strthres = strchr(args, ' ');
> +
> +	if (strthres) {
> +		*strthres = '\0';
> +		strthres++;
> +		if(kstrtoint(strthres, 10, &thres))
> +			return -EINVAL;
> +	}
>  
>  	for (level = 0; level < VMPRESSURE_NUM_LEVELS; level++) {
> -		if (!strcmp(vmpressure_str_levels[level], args))
> +		if (!strcmp(vmpressure_str_levels[level], strlevel))
>  			break;
>  	}
>  
> @@ -320,6 +338,10 @@ int vmpressure_register_event(struct cgroup *cg, struct cftype *cft,
>  	ev->efd = eventfd;
>  	ev->level = level;
>  
> +	/* If user input threshold is not valid value, use default value */
> +	if (thres <= 100 && thres >= 0)
> +		vmpressure_threshold_levels[level] = thres;
> +
>  	mutex_lock(&vmpr->events_lock);
>  	list_add(&ev->node, &vmpr->events);
>  	mutex_unlock(&vmpr->events_lock);
> -- 
> 1.7.9.5
> 
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>

-- 
Michal Hocko
SUSE Labs