I have added replies to each of your comments inline below. I agree with all
of your opinions and will apply them in the next spin! :)

On Tue, 10 Mar 2020 08:57:21 +0000 Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx> wrote:

> On Mon, 24 Feb 2020 13:30:35 +0100
> SeongJae Park <sjpark@xxxxxxxxxx> wrote:
>
> > From: SeongJae Park <sjpark@xxxxxxxxx>
> >
> > This commit implements DAMON's basic access check and region based
> > sampling mechanisms. This change would seems make no sense, mainly
> > because it is only a part of the DAMON's logics. Following two commits
> > will make more sense.
> >
> > This commit also exports `lookup_page_ext()` to GPL modules because
> > DAMON uses the function but also supports the module build.
>
> Do that as a separate patch before this one. Makes it easy to spot.

Agreed, will do so.

> > [...]
>
> Various things inline. In particularly can you make use of standard
> kthread_stop infrastructure rather than rolling your own?

Nice suggestion! That will be much better; will use it.

> > ---
> >  mm/damon.c    | 509 ++++++++++++++++++++++++++++++++++++++++++++++++++
> >  mm/page_ext.c |   1 +
> >  2 files changed, 510 insertions(+)
> >
> > diff --git a/mm/damon.c b/mm/damon.c
> > index aafdca35b7b8..6bdeb84d89af 100644
> > --- a/mm/damon.c
> > +++ b/mm/damon.c
> > @@ -9,9 +9,14 @@
> > [...]
> > +/*
> > + * Get the mm_struct of the given task
> > + *
> > + * Callser should put the mm_struct after use, unless it is NULL.
>
> Caller

Good eye! Will fix it.

> > + *
> > + * Returns the mm_struct of the task on success, NULL on failure
> > + */
> > +static struct mm_struct *damon_get_mm(struct damon_task *t)
> > +{
> > +        struct task_struct *task;
> > +        struct mm_struct *mm;
> > +
> > +        task = damon_get_task_struct(t);
> > +        if (!task)
> > +                return NULL;
> > +
> > +        mm = get_task_mm(task);
> > +        put_task_struct(task);
> > +        return mm;
> > +}
> > +
> > +/*
> > + * Size-evenly split a region into 'nr_pieces' small regions
> > + *
> > + * Returns 0 on success, or negative error code otherwise.
> > + */
> > +static int damon_split_region_evenly(struct damon_ctx *ctx,
> > +                struct damon_region *r, unsigned int nr_pieces)
> > +{
> > +        unsigned long sz_orig, sz_piece, orig_end;
> > +        struct damon_region *piece = NULL, *next;
> > +        unsigned long start;
> > +
> > +        if (!r || !nr_pieces)
> > +                return -EINVAL;
> > +
> > +        orig_end = r->vm_end;
> > +        sz_orig = r->vm_end - r->vm_start;
> > +        sz_piece = sz_orig / nr_pieces;
> > +
> > +        if (!sz_piece)
> > +                return -EINVAL;
> > +
> > +        r->vm_end = r->vm_start + sz_piece;
> > +        next = damon_next_region(r);
> > +        for (start = r->vm_end; start + sz_piece <= orig_end;
> > +                        start += sz_piece) {
> > +                piece = damon_new_region(ctx, start, start + sz_piece);
> > +                damon_add_region(piece, r, next);
> > +                r = piece;
> > +        }
>
> I'd add a comment here. I think this next bit is to catch any rounding error
> holes, but I'm not 100% sure.

Yes, will make it clearer. A draft of the comment is below, after my replies
to this hunk.

> > +        if (piece)
> > +                piece->vm_end = orig_end;
>
> blank line here.

Will add.

> > +        return 0;
> > +}

[...]

> > +/*
> > + * Initialize the monitoring target regions for the given task
> > + *
> > + * t        the given target task
> > + *
> > + * Because only a number of small portions of the entire address space
> > + * is acutally mapped to the memory and accessed, monitoring the unmapped
>
> actually

Good eye! Will consider adding these to 'scripts/spelling.txt'.

> [...]
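
By the way, for the rounding-error handling in damon_split_region_evenly()
that you asked a comment for above, I am thinking of something like below
(just a draft; the exact wording will be polished in the next spin):

        /*
         * Because the given region might not be perfectly divisible by
         * 'nr_pieces', the pieces made by the loop above can end before
         * the original end of the region.  Stretch the last piece up to
         * 'orig_end' so that no hole is left at the tail of the region.
         */
        if (piece)
                piece->vm_end = orig_end;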
> > +/*
> > + * Check whether the given region has accessed since the last check
>
> Should also make clear that this sets us up for the next access check at
> a different memory address it the region.
>
> Given the lack of connection between activities perhaps just split this into
> two functions that are always called next to each other.

Will make the description clearer, as suggested. Also, thanks to this
comment, I found that I am not clearing '*pte' and '*pmd' before going to
'mkold'. Will fix that, too.

> > + *
> > + * mm        'mm_struct' for the given virtual address space
> > + * r         the region to be checked
> > + */
> > +static void kdamond_check_access(struct damon_ctx *ctx,
> > +                struct mm_struct *mm, struct damon_region *r)
> > +{
> > +        pte_t *pte = NULL;
> > +        pmd_t *pmd = NULL;
> > +        spinlock_t *ptl;
> > +
> > +        if (follow_pte_pmd(mm, r->sampling_addr, NULL, &pte, &pmd, &ptl))
> > +                goto mkold;
> > +
> > +        /* Read the page table access bit of the page */
> > +        if (pte && pte_young(*pte))
> > +                r->nr_accesses++;
> > +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
>
> Is it worth having this protection? Seems likely to have only a very small
> influence on performance and makes it a little harder to reason about the code.

It was necessary to address 'implicit declaration' build errors for
'pmd_young()' and 'pmd_mkold()' on several architectures that DAMON can be
built for, including User Mode Linux. Will modularize the code for better
readability.

> > +        else if (pmd && pmd_young(*pmd))
> > +                r->nr_accesses++;
> > +#endif        /* CONFIG_TRANSPARENT_HUGEPAGE */
> > +
> > +        spin_unlock(ptl);
> > +
> > +mkold:
> > +        /* mkold next target */
> > +        r->sampling_addr = damon_rand(ctx, r->vm_start, r->vm_end);
> > +
> > +        if (follow_pte_pmd(mm, r->sampling_addr, NULL, &pte, &pmd, &ptl))
> > +                return;
> > +
> > +        if (pte) {
> > +                if (pte_young(*pte)) {
> > +                        clear_page_idle(pte_page(*pte));
> > +                        set_page_young(pte_page(*pte));
> > +                }
> > +                *pte = pte_mkold(*pte);
> > +        }
> > +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> > +        else if (pmd) {
> > +                if (pmd_young(*pmd)) {
> > +                        clear_page_idle(pmd_page(*pmd));
> > +                        set_page_young(pmd_page(*pmd));
> > +                }
> > +                *pmd = pmd_mkold(*pmd);
> > +        }
> > +#endif
> > +
> > +        spin_unlock(ptl);
> > +}
> > +
> > +/*
> > + * Check whether a time interval is elapsed
>
> Another comment block that would be clearer if it was kernel-doc rather
> than nearly kernel-doc

Will apply the kernel-doc syntax.

> > + *
> > + * baseline        the time to check whether the interval has elapsed since
> > + * interval        the time interval (microseconds)
> > + *
> > + * See whether the given time interval has passed since the given baseline
> > + * time. If so, it also updates the baseline to current time for next check.
> > + *
> > + * Returns true if the time interval has passed, or false otherwise.
> > + */
> > +static bool damon_check_reset_time_interval(struct timespec64 *baseline,
> > +                unsigned long interval)
> > +{
> > +        struct timespec64 now;
> > +
> > +        ktime_get_coarse_ts64(&now);
> > +        if ((timespec64_to_ns(&now) - timespec64_to_ns(baseline)) <
> > +                        interval * 1000)
> > +                return false;
> > +        *baseline = now;
> > +        return true;
> > +}
> > +
> > +/*
> > + * Check whether it is time to flush the aggregated information
> > + */
> > +static bool kdamond_aggregate_interval_passed(struct damon_ctx *ctx)
> > +{
> > +        return damon_check_reset_time_interval(&ctx->last_aggregation,
> > +                        ctx->aggr_interval);
> > +}
> > +
> > +/*
> > + * Reset the aggregated monitoring results
> > + */
> > +static void kdamond_flush_aggregated(struct damon_ctx *c)
>
> I wouldn't expect a reset function to be called flush.

It will work as flushing in the next commit, but it makes no sense for now.
Will rename it.

> > +{
> > +        struct damon_task *t;
> > +        struct damon_region *r;
> > +
> > +        damon_for_each_task(c, t) {
> > +                damon_for_each_region(r, t)
> > +                        r->nr_accesses = 0;
> > +        }
> > +}
> > +
> > +/*
> > + * Check whether current monitoring should be stopped
> > + *
> > + * If users asked to stop, need stop. Even though no user has asked to stop,
> > + * need stop if every target task has dead.
> > + *
> > + * Returns true if need to stop current monitoring.
> > + */
> > +static bool kdamond_need_stop(struct damon_ctx *ctx)
> > +{
> > +        struct damon_task *t;
> > +        struct task_struct *task;
> > +        bool stop;
> > +
>
> As below comment asks, can you use kthread_should_stop?

Yes, I will. A rough sketch of the conversion is below, after my replies to
this function.

> > +        spin_lock(&ctx->kdamond_lock);
> > +        stop = ctx->kdamond_stop;
> > +        spin_unlock(&ctx->kdamond_lock);
> > +        if (stop)
> > +                return true;
> > +
> > +        damon_for_each_task(ctx, t) {
> > +                task = damon_get_task_struct(t);
> > +                if (task) {
> > +                        put_task_struct(task);
> > +                        return false;
> > +                }
> > +        }
> > +
> > +        return true;
> > +}
> > +
> > +/*
> > + * The monitoring daemon that runs as a kernel thread
> > + */
> > +static int kdamond_fn(void *data)
> > +{
> > +        struct damon_ctx *ctx = (struct damon_ctx *)data;
>
> Never any need to explicitly cast a void * to some other pointer type.
> (C spec)

Ah, you're right.

>         struct damon_ctx *ctx = data;
>
> > +        struct damon_task *t;
> > +        struct damon_region *r, *next;
> > +        struct mm_struct *mm;
> > +
> > +        pr_info("kdamond (%d) starts\n", ctx->kdamond->pid);
> > +        kdamond_init_regions(ctx);
> > +        while (!kdamond_need_stop(ctx)) {
> > +                damon_for_each_task(ctx, t) {
> > +                        mm = damon_get_mm(t);
> > +                        if (!mm)
> > +                                continue;
> > +                        damon_for_each_region(r, t)
> > +                                kdamond_check_access(ctx, mm, r);
> > +                        mmput(mm);
> > +                }
> > +
> > +                if (kdamond_aggregate_interval_passed(ctx))
> > +                        kdamond_flush_aggregated(ctx);
> > +
> > +                usleep_range(ctx->sample_interval, ctx->sample_interval + 1);
>
> Is there any purpose in using a range for such a narrow window?

Actually, it needs to sleep for only 'ctx->sample_interval', and that is why
I set the range so narrow.

> > +        }
> > +        damon_for_each_task(ctx, t) {
> > +                damon_for_each_region_safe(r, next, t)
> > +                        damon_destroy_region(r);
> > +        }
> > +        pr_info("kdamond (%d) finishes\n", ctx->kdamond->pid);
>
> Feels like noise. I'd drop tis to pr_debug.

Agreed, will remove it.

> > +        spin_lock(&ctx->kdamond_lock);
> > +        ctx->kdamond = NULL;
> > +        spin_unlock(&ctx->kdamond_lock);
>
> blank line.

Yup!
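
For the kthread_stop() / kthread_should_stop() conversion, below is a rough,
untested sketch of the direction I have in mind. 'kdamond_tasks_all_dead()'
is a hypothetical helper that would do the task-liveness check that
'kdamond_need_stop()' currently does, and 'kdamond_reset_aggregated()' is a
tentative new name for 'kdamond_flush_aggregated()'. The case of kdamond
exiting by itself because every target task has died would need extra care,
e.g., keeping a reference to the thread via get_task_struct() so that
kthread_stop() remains safe to call.

static int kdamond_fn(void *data)
{
        struct damon_ctx *ctx = data;
        struct damon_task *t;
        struct damon_region *r, *next;
        struct mm_struct *mm;

        kdamond_init_regions(ctx);
        /* kthread_stop() on this thread makes kthread_should_stop() true */
        while (!kthread_should_stop() && !kdamond_tasks_all_dead(ctx)) {
                damon_for_each_task(ctx, t) {
                        mm = damon_get_mm(t);
                        if (!mm)
                                continue;
                        damon_for_each_region(r, t)
                                kdamond_check_access(ctx, mm, r);
                        mmput(mm);
                }

                if (kdamond_aggregate_interval_passed(ctx))
                        kdamond_reset_aggregated(ctx);

                usleep_range(ctx->sample_interval, ctx->sample_interval + 1);
        }
        damon_for_each_task(ctx, t) {
                damon_for_each_region_safe(r, next, t)
                        damon_destroy_region(r);
        }

        return 0;
}

With this, the stop path would not need the 'kdamond_stop' flag and the
polling loop in damon_turn_kdamond() anymore; it could simply call
kthread_stop(), which waits until kdamond_fn() returns.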
>
> > +        return 0;
> > +}
> > +
> > +/*
> > + * Controller functions
> > + */
> > +
> > +/*
> > + * Start or stop the kdamond
> > + *
> > + * Returns 0 if success, negative error code otherwise.
> > + */
> > +static int damon_turn_kdamond(struct damon_ctx *ctx, bool on)
> > +{
> > +        spin_lock(&ctx->kdamond_lock);
> > +        ctx->kdamond_stop = !on;
>
> Can't use the kthread_stop / kthread_should_stop approach?

Will use it.

> > +        if (!ctx->kdamond && on) {
> > +                ctx->kdamond = kthread_run(kdamond_fn, ctx, "kdamond");
> > +                if (!ctx->kdamond)
> > +                        goto fail;
> > +                goto success;
>
> cleaner as
>         int ret = 0; above then
>
>         if (!ctx->kdamond)
>                 ret = -EINVAL;
>         goto unlock;
>
> with
>
> unlock:
>         spin_unlock(&ctx->dmanond_lock);
>         return ret;

Agreed, will change it as suggested.

> > +        }
> > +        if (ctx->kdamond && !on) {
> > +                spin_unlock(&ctx->kdamond_lock);
> > +                while (true) {
>
> An unbounded loop is probably a bad idea.

Will add a clear exit condition here.

> > +                        spin_lock(&ctx->kdamond_lock);
> > +                        if (!ctx->kdamond)
> > +                                goto success;
> > +                        spin_unlock(&ctx->kdamond_lock);
> > +
> > +                        usleep_range(ctx->sample_interval,
> > +                                        ctx->sample_interval * 2);
> > +                }
> > +        }
> > +
> > +        /* tried to turn on while turned on, or turn off while turned off */
> > +
> > +fail:
> > +        spin_unlock(&ctx->kdamond_lock);
> > +        return -EINVAL;
> > +
> > +success:
> > +        spin_unlock(&ctx->kdamond_lock);
> > +        return 0;
> > +}
> > +
> > +/*
> > + * This function should not be called while the kdamond is running.
> > + */
> > +static int damon_set_pids(struct damon_ctx *ctx,
> > +                unsigned long *pids, ssize_t nr_pids)
> > +{
> > +        ssize_t i;
> > +        struct damon_task *t, *next;
> > +
> > +        damon_for_each_task_safe(ctx, t, next)
> > +                damon_destroy_task(t);
> > +
> > +        for (i = 0; i < nr_pids; i++) {
> > +                t = damon_new_task(pids[i]);
> > +                if (!t) {
> > +                        pr_err("Failed to alloc damon_task\n");
> > +                        return -ENOMEM;
> > +                }
> > +                damon_add_task_tail(ctx, t);
> > +        }
> > +
> > +        return 0;
> > +}
> > +
> > +/*
>
> This is kind of similar to kernel-doc formatting. Might as well just make
> it kernel-doc!

Agreed, will do so! Please find a draft at the end of this mail.

> > + * Set attributes for the monitoring
> > + *
> > + * sample_int        time interval between samplings
> > + * aggr_int          time interval between aggregations
> > + * min_nr_reg        minimal number of regions
> > + *
> > + * This function should not be called while the kdamond is running.
> > + * Every time interval is in micro-seconds.
> > + *
> > + * Returns 0 on success, negative error code otherwise.
> > + */
> > +static int damon_set_attrs(struct damon_ctx *ctx, unsigned long sample_int,
> > +                unsigned long aggr_int, unsigned long min_nr_reg)
> > +{
> > +        if (min_nr_reg < 3) {
> > +                pr_err("min_nr_regions (%lu) should be bigger than 2\n",
> > +                                min_nr_reg);
> > +                return -EINVAL;
> > +        }
> > +
> > +        ctx->sample_interval = sample_int;
> > +        ctx->aggr_interval = aggr_int;
> > +        ctx->min_nr_regions = min_nr_reg;
>
> blank line helps readability a tiny little bit.

Agreed!


Thanks,
SeongJae Park

> > +        return 0;
> > +}
> > +
> >  static int __init damon_init(void)
> >  {
> >          pr_info("init\n");
> > diff --git a/mm/page_ext.c b/mm/page_ext.c
> > index 4ade843ff588..71169b45bba9 100644
> > --- a/mm/page_ext.c
> > +++ b/mm/page_ext.c
> > @@ -131,6 +131,7 @@ struct page_ext *lookup_page_ext(const struct page *page)
> >                                          MAX_ORDER_NR_PAGES);
> >          return get_entry(base, index);
> >  }
> > +EXPORT_SYMBOL_GPL(lookup_page_ext);
> >
> >  static int __init alloc_node_page_ext(int nid)
> >  {
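
P.S. Regarding the kernel-doc conversion, just as an example, I am thinking
of describing damon_set_attrs() like below (a draft; the other comment
blocks will be converted in the same way):

/**
 * damon_set_attrs() - Set attributes for the monitoring.
 * @ctx:               monitoring context
 * @sample_int:        time interval between samplings (microseconds)
 * @aggr_int:          time interval between aggregations (microseconds)
 * @min_nr_reg:        minimal number of regions
 *
 * This function should not be called while the kdamond is running.
 *
 * Return: 0 on success, negative error code otherwise.
 */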