Re: [RFC v2] mm/damon/core: use number of passed access sampling as a timer

SeongJae Park <sj@xxxxxxxxxx> · Mon, 11 Sep 2023 02:34:21 +0000

On Tue, 5 Sep 2023 03:52:10 +0000 SeongJae Park <sj@xxxxxxxxxx> wrote:

> Changes from v1
> (https://lore.kernel.org/damon/20230827003727.49369-1-sj@xxxxxxxxxx/)
> - Initalize next_*_sis at the beginning of kdamond_fn()
> - Remove unnecessary remaining intervals compensations in
>   damon_set_attrs()
> 
> DAMON sleeps for sampling interval after each sampling, and check if the
> aggregation interval and the ops update interval have passed using
> ktime_get_coarse_ts64() and baseline timestamps for the intervals.  That
> design is for making the operations occur at deterministic timing
> regardless of the time that spend for each work.  However, it turned out
> it is not that useful, and incur not-that-intuitive results.
> 
> After all, timer functions, and especially sleep functions that DAMON
> uses to wait for specific timing, are not necessarily strictly accurate.
> It is legal design, so no problem.  However, depending on such
> inaccuracies, the nr_accesses can be larger than aggregation interval
> divided by sampling interval.  For example, with the default setting (5
> ms sampling interval and 100 ms aggregation interval) we frequently show
> regions having nr_accesses larger than 20.  Also, if the execution of a
> DAMOS scheme takes a long time, next aggregation could happen before
> enough number of samples are collected.  This is not what usual users
> would intuitively expect.
> 
> Since access check sampling is the smallest unit work of DAMON, using
> the number of passed sampling intervals as the DAMON-internal timer can
> easily avoid these problems.  That is, convert aggregation and ops
> update intervals to numbers of sampling intervals that need to be passed
> before those operations be executed, count the number of passed sampling
> intervals, and invoke the operations as soon as the specific amount of
> sampling intervals passed.  Make the change.
> 
> Note that this could make a behavioral change to settings that using
> intervals that not aligned by the sampling interval.  For example, if
> the sampling interval is 5 ms and the aggregation interval is 12 ms,
> DAMON effectively used 15 ms as its aggregation interval with the old
> design, hence it checks whether the aggregation interval after sleeping
> the sampling interval.  This change will make DAMON to effectively use
> 10 ms as aggregation interval, since it uses 'aggregation interval /
> sampling interval * sampling interval' as the effective aggregation
> interval, and we don't use floating point types.  Usual users would have
> used aligned intervals, so this behavioral change is not expected to
> make any meaningful impact, so just make this change.
> 
> Signed-off-by: SeongJae Park <sj@xxxxxxxxxx>
> ---
>  include/linux/damon.h | 14 ++++++-
>  mm/damon/core.c       | 87 +++++++++++++++++++------------------------
>  2 files changed, 51 insertions(+), 50 deletions(-)
> 
[...]
> diff --git a/mm/damon/core.c b/mm/damon/core.c
> index b895f70acb2d..d64ddaa472ca 100644
> --- a/mm/damon/core.c
> +++ b/mm/damon/core.c
[...]
> @@ -581,6 +583,8 @@ static void damon_update_monitoring_results(struct damon_ctx *ctx,
>   */
>  int damon_set_attrs(struct damon_ctx *ctx, struct damon_attrs *attrs)
>  {
> +	unsigned long sample_interval;
> +
>  	if (attrs->min_nr_regions < 3)
>  		return -EINVAL;
>  	if (attrs->min_nr_regions > attrs->max_nr_regions)
> @@ -588,6 +592,12 @@ int damon_set_attrs(struct damon_ctx *ctx, struct damon_attrs *attrs)
>  	if (attrs->sample_interval > attrs->aggr_interval)
>  		return -EINVAL;
>  
> +	sample_interval = attrs->sample_interval ? attrs->sample_interval : 1;
> +	ctx->next_aggregation_sis = ctx->passed_sample_intervals +
> +		attrs->aggr_interval / sample_interval;
> +	ctx->next_ops_update_sis = ctx->passed_sample_intervals +
> +		attrs->ops_update_interval / sample_interval;
> +

So, damon_set_attrs() can be called from after_wamrks_check() or
after_aggregation() callbacks while kdamond is running.  And, it updates the
next_{aggregation,ops_update}_sis.

>  	damon_update_monitoring_results(ctx, attrs);
>  	ctx->attrs = *attrs;
>  	return 0;
> @@ -761,38 +771,6 @@ int damon_stop(struct damon_ctx **ctxs, int nr_ctxs)
>  	return err;
>  }
>  
[...]
> @@ -1432,6 +1409,8 @@ static int kdamond_fn(void *data)
>  
>  	pr_debug("kdamond (%d) starts\n", current->pid);
>  
> +	kdamond_init_intervals_sis(ctx);
> +
>  	if (ctx->ops.init)
>  		ctx->ops.init(ctx);
>  	if (ctx->callback.before_start && ctx->callback.before_start(ctx))
> @@ -1440,6 +1419,8 @@ static int kdamond_fn(void *data)
>  	sz_limit = damon_region_sz_limit(ctx);
>  
>  	while (!kdamond_need_stop(ctx)) {
> +		unsigned long sample_interval;
> +
>  		if (kdamond_wait_activation(ctx))

The after_wmarks_check() callback is called from kdamond_wait_activation().
Hence, the user might call damon_set_attrs() there, and
next_{aggr_interval,ops_update}_sis can be changed here.

>  			break;
>  
> @@ -1450,11 +1431,17 @@ static int kdamond_fn(void *data)
>  			break;
>  
>  		kdamond_usleep(ctx->attrs.sample_interval);
> +		ctx->passed_sample_intervals++;
>  
>  		if (ctx->ops.check_accesses)
>  			max_nr_accesses = ctx->ops.check_accesses(ctx);
>  
> -		if (kdamond_aggregate_interval_passed(ctx)) {
> +		sample_interval = ctx->attrs.sample_interval ?
> +			ctx->attrs.sample_interval : 1;
> +		if (ctx->passed_sample_intervals ==
> +				ctx->next_aggregation_sis) {

This branch reads next_aggregation_sis, which could be changed from above
after_wmarks_check() call.

This branch also executes after_aggregation() callback, which again can call
damon_set_attrs() and therefore update next_ops_update_sis.

> +			ctx->next_aggregation_sis +=
> +				ctx->attrs.aggr_interval / sample_interval;
>  			kdamond_merge_regions(ctx,
>  					max_nr_accesses / 10,
>  					sz_limit);
> @@ -1469,7 +1456,11 @@ static int kdamond_fn(void *data)
>  				ctx->ops.reset_aggregated(ctx);
>  		}
>  
> -		if (kdamond_need_update_operations(ctx)) {
> +		if (ctx->passed_sample_intervals ==
> +				ctx->next_ops_update_sis) {

This branch reads next_ops_update_sis, which could be changed from above
after_aggregation() or after_wmarks_check() callbacks.

As a result, we can unexpectedly skip aggregation or ops update executions.
This could be problem for later changes including pseudo-moving sum based
nr_accesses[1] and DAMOS apply intervals[2].  Let's read
next_{aggregation,ops_update}_sis at the beginning of the loop and use those.

[1] https://lore.kernel.org/damon/20230909033711.55794-9-sj@xxxxxxxxxx/
[2] https://lore.kernel.org/damon/20230910034048.59191-1-sj@xxxxxxxxxx/

> +			ctx->next_ops_update_sis +=
> +				ctx->attrs.ops_update_interval /
> +				sample_interval;
>  			if (ctx->ops.update)
>  				ctx->ops.update(ctx);
>  			sz_limit = damon_region_sz_limit(ctx);
> -- 
> 2.25.1
> 
>