On Tue, Oct 20, 2020 at 2:06 AM SeongJae Park <sjpark@xxxxxxxxxx> wrote: > > From: SeongJae Park <sjpark@xxxxxxxxx> > > DAMON is designed to be used by kernel space code such as the memory > management subsystems, and therefore it provides only kernel space API. > That said, letting the user space control DAMON could provide some > benefits to them. For example, it will allow user space to analyze > their specific workloads and make their own special optimizations. > > For such cases, this commit implements a simple DAMON application kernel > module, namely 'damon-dbgfs', which merely wraps the DAMON api and > exports those to the user space via the debugfs. > > 'damon-dbgfs' exports three files, ``attrs``, ``target_ids``, and > ``monitor_on`` under its debugfs directory, ``<debugfs>/damon/``. > > Attributes > ---------- > > Users can read and write the ``sampling interval``, ``aggregation > interval``, ``regions update interval``, and min/max number of > monitoring target regions by reading from and writing to the ``attrs`` > file. For example, below commands set those values to 5 ms, 100 ms, > 1,000 ms, 10, 1000 and check it again:: > > # cd <debugfs>/damon > # echo 5000 100000 1000000 10 1000 > attrs > # cat attrs > 5000 100000 1000000 10 1000 > > Target IDs > ---------- > > Some types of address spaces supports multiple monitoring target. For > example, the virtual memory address spaces monitoring can have multiple > processes as the monitoring targets. Users can set the targets by > writing relevant id values of the targets to, and get the ids of the > current targets by reading from the ``target_ids`` file. In case of the > virtual address spaces monitoring, the values should be pids of the > monitoring target processes. For example, below commands set processes > having pids 42 and 4242 as the monitoring targets and check it again:: > > # cd <debugfs>/damon > # echo 42 4242 > target_ids > # cat target_ids > 42 4242 > > Note that setting the target ids doesn't start the monitoring. > > Turning On/Off > -------------- > > Setting the files as described above doesn't incur effect unless you > explicitly start the monitoring. You can start, stop, and check the > current status of the monitoring by writing to and reading from the > ``monitor_on`` file. Writing ``on`` to the file starts the monitoring > of the targets with the attributes. Writing ``off`` to the file stops > those. DAMON also stops if every targets are invalidated (in case of > the virtual memory monitoring, target processes are invalidated when > terminated). Below example commands turn on, off, and check the status > of DAMON:: > > # cd <debugfs>/damon > # echo on > monitor_on > # echo off > monitor_on > # cat monitor_on > off > > Please note that you cannot write to the above-mentioned debugfs files > while the monitoring is turned on. If you write to the files while > DAMON is running, an error code such as ``-EBUSY`` will be returned. > > Signed-off-by: SeongJae Park <sjpark@xxxxxxxxx> > Reviewed-by: Leonard Foerster <foersleo@xxxxxxxxx> > --- > include/linux/damon.h | 2 + > mm/damon/Kconfig | 9 + > mm/damon/Makefile | 1 + > mm/damon/core.c | 48 +++++ > mm/damon/dbgfs.c | 428 ++++++++++++++++++++++++++++++++++++++++++ > 5 files changed, 488 insertions(+) > create mode 100644 mm/damon/dbgfs.c > > diff --git a/include/linux/damon.h b/include/linux/damon.h > index 70cc4b54212e..d675ea908a02 100644 > --- a/include/linux/damon.h > +++ b/include/linux/damon.h > @@ -226,6 +226,8 @@ void damon_free_target(struct damon_target *t); > void damon_destroy_target(struct damon_target *t); > unsigned int damon_nr_regions(struct damon_target *t); > > +struct damon_ctx *damon_new_ctx(void); > +void damon_destroy_ctx(struct damon_ctx *ctx); > int damon_set_targets(struct damon_ctx *ctx, > unsigned long *ids, ssize_t nr_ids); > int damon_set_attrs(struct damon_ctx *ctx, unsigned long sample_int, > diff --git a/mm/damon/Kconfig b/mm/damon/Kconfig > index 63b9c905b548..e38f95d28f74 100644 > --- a/mm/damon/Kconfig > +++ b/mm/damon/Kconfig > @@ -22,4 +22,13 @@ config DAMON_PRIMITIVES > The primitives support only virtual address spaces. If this cannot > cover your use case, you can implement and use your own primitives. > > +config DAMON_DBGFS > + bool "DAMON debugfs interface" > + depends on DAMON_PRIMITIVES && DEBUG_FS > + help > + This builds the debugfs interface for DAMON. The user space admins > + can use the interface for arbitrary data access monitoring. > + > + If unsure, say N. > + > endmenu > diff --git a/mm/damon/Makefile b/mm/damon/Makefile > index 2f3235a52e5e..2295deb2fe0e 100644 > --- a/mm/damon/Makefile > +++ b/mm/damon/Makefile > @@ -2,3 +2,4 @@ > > obj-$(CONFIG_DAMON) := core.o > obj-$(CONFIG_DAMON_PRIMITIVES) += primitives.o > +obj-$(CONFIG_DAMON_DBGFS) += dbgfs.o > diff --git a/mm/damon/core.c b/mm/damon/core.c > index d7957d8ff530..47baf859d7d9 100644 > --- a/mm/damon/core.c > +++ b/mm/damon/core.c > @@ -135,6 +135,40 @@ unsigned int damon_nr_regions(struct damon_target *t) > return nr_regions; > } > > +struct damon_ctx *damon_new_ctx(void) > +{ > + struct damon_ctx *ctx; > + > + ctx = kzalloc(sizeof(*ctx), GFP_KERNEL); > + if (!ctx) > + return NULL; > + > + ctx->sample_interval = 5 * 1000; > + ctx->aggr_interval = 100 * 1000; > + ctx->regions_update_interval = 1000 * 1000; > + ctx->min_nr_regions = 10; > + ctx->max_nr_regions = 1000; > + > + ktime_get_coarse_ts64(&ctx->last_aggregation); > + ctx->last_regions_update = ctx->last_aggregation; > + > + mutex_init(&ctx->kdamond_lock); > + > + INIT_LIST_HEAD(&ctx->targets_list); > + > + return ctx; > +} > + > +void damon_destroy_ctx(struct damon_ctx *ctx) > +{ > + struct damon_target *t, *next_t; > + > + damon_for_each_target_safe(t, next_t, ctx) > + damon_destroy_target(t); > + > + kfree(ctx); > +} > + > /** > * damon_set_targets() - Set monitoring targets. > * @ctx: monitoring context > @@ -204,6 +238,20 @@ int damon_set_attrs(struct damon_ctx *ctx, unsigned long sample_int, > return 0; > } > > +/** > + * damon_nr_running_ctxs() - Return number of currently running contexts. > + */ > +int damon_nr_running_ctxs(void) > +{ > + int nr_ctxs; > + > + mutex_lock(&damon_lock); > + nr_ctxs = nr_running_ctxs; > + mutex_unlock(&damon_lock); > + READ_ONCE() instead of mutex? > + return nr_ctxs; > +} > + > /* Returns the size upper limit for each monitoring region */ > static unsigned long damon_region_sz_limit(struct damon_ctx *ctx) > { > diff --git a/mm/damon/dbgfs.c b/mm/damon/dbgfs.c > new file mode 100644 > index 000000000000..6316d4cae2a4 > --- /dev/null > +++ b/mm/damon/dbgfs.c > @@ -0,0 +1,428 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* > + * DAMON Debugfs Interface > + * > + * Author: SeongJae Park <sjpark@xxxxxxxxx> > + */ > + > +#define pr_fmt(fmt) "damon-dbgfs: " fmt > + > +#include <linux/damon.h> > +#include <linux/debugfs.h> > +#include <linux/file.h> > +#include <linux/mm.h> > +#include <linux/module.h> > +#include <linux/page_idle.h> > +#include <linux/slab.h> > + > +static struct damon_ctx **dbgfs_ctxs; > +static int dbgfs_nr_ctxs = 1; > +static int dbgfs_nr_terminated_ctxs; > +static struct dentry **dbgfs_dirs; > +static DEFINE_MUTEX(damon_dbgfs_lock); > + > +/* > + * Returns non-empty string on success, negarive error code otherwise. > + */ > +static char *user_input_str(const char __user *buf, size_t count, loff_t *ppos) > +{ > + char *kbuf; > + ssize_t ret; > + > + /* We do not accept continuous write */ > + if (*ppos) > + return ERR_PTR(-EINVAL); > + > + kbuf = kmalloc(count + 1, GFP_KERNEL); > + if (!kbuf) > + return ERR_PTR(-ENOMEM); > + > + ret = simple_write_to_buffer(kbuf, count + 1, ppos, buf, count); > + if (ret != count) { > + kfree(kbuf); > + return ERR_PTR(-EIO); > + } > + kbuf[ret] = '\0'; > + > + return kbuf; > +} > + > +static ssize_t dbgfs_attrs_read(struct file *file, > + char __user *buf, size_t count, loff_t *ppos) > +{ > + struct damon_ctx *ctx = file->private_data; > + char kbuf[128]; > + int ret; > + > + mutex_lock(&ctx->kdamond_lock); > + ret = scnprintf(kbuf, ARRAY_SIZE(kbuf), "%lu %lu %lu %lu %lu\n", > + ctx->sample_interval, ctx->aggr_interval, > + ctx->regions_update_interval, ctx->min_nr_regions, > + ctx->max_nr_regions); > + mutex_unlock(&ctx->kdamond_lock); > + > + return simple_read_from_buffer(buf, count, ppos, kbuf, ret); > +} > + > +static ssize_t dbgfs_attrs_write(struct file *file, > + const char __user *buf, size_t count, loff_t *ppos) > +{ > + struct damon_ctx *ctx = file->private_data; > + unsigned long s, a, r, minr, maxr; > + char *kbuf; > + ssize_t ret = count; > + int err; > + > + kbuf = user_input_str(buf, count, ppos); > + if (IS_ERR(kbuf)) > + return PTR_ERR(kbuf); > + > + if (sscanf(kbuf, "%lu %lu %lu %lu %lu", > + &s, &a, &r, &minr, &maxr) != 5) { > + ret = -EINVAL; > + goto out; > + } > + > + mutex_lock(&ctx->kdamond_lock); > + if (ctx->kdamond) { > + ret = -EBUSY; > + goto unlock_out; > + } > + > + err = damon_set_attrs(ctx, s, a, r, minr, maxr); > + if (err) > + ret = err; > +unlock_out: > + mutex_unlock(&ctx->kdamond_lock); > +out: > + kfree(kbuf); > + return ret; > +} > + > +#define targetid_is_pid(ctx) \ > + (ctx->primitive.target_valid == damon_va_target_valid) > + > +static ssize_t sprint_target_ids(struct damon_ctx *ctx, char *buf, ssize_t len) > +{ > + struct damon_target *t; > + unsigned long id; > + int written = 0; > + int rc; > + > + damon_for_each_target(t, ctx) { > + id = t->id; > + if (targetid_is_pid(ctx)) > + /* Show pid numbers to debugfs users */ > + id = (unsigned long)pid_vnr((struct pid *)id); > + > + rc = scnprintf(&buf[written], len - written, "%lu ", id); > + if (!rc) > + return -ENOMEM; > + written += rc; > + } > + if (written) > + written -= 1; > + written += scnprintf(&buf[written], len - written, "\n"); > + return written; > +} > + > +static ssize_t dbgfs_target_ids_read(struct file *file, > + char __user *buf, size_t count, loff_t *ppos) > +{ > + struct damon_ctx *ctx = file->private_data; > + ssize_t len; > + char ids_buf[320]; > + > + mutex_lock(&ctx->kdamond_lock); > + len = sprint_target_ids(ctx, ids_buf, 320); > + mutex_unlock(&ctx->kdamond_lock); > + if (len < 0) > + return len; > + > + return simple_read_from_buffer(buf, count, ppos, ids_buf, len); > +} > + > +/* > + * Converts a string into an array of unsigned long integers > + * > + * Returns an array of unsigned long integers if the conversion success, or > + * NULL otherwise. > + */ > +static unsigned long *str_to_target_ids(const char *str, ssize_t len, > + ssize_t *nr_ids) > +{ > + unsigned long *ids; > + const int max_nr_ids = 32; > + unsigned long id; > + int pos = 0, parsed, ret; > + > + *nr_ids = 0; > + ids = kmalloc_array(max_nr_ids, sizeof(id), GFP_KERNEL); > + if (!ids) > + return NULL; > + while (*nr_ids < max_nr_ids && pos < len) { > + ret = sscanf(&str[pos], "%lu%n", &id, &parsed); > + pos += parsed; > + if (ret != 1) > + break; > + ids[*nr_ids] = id; > + *nr_ids += 1; > + } > + > + return ids; > +} > + > +/* Returns pid for the given pidfd if it's valid, or NULL otherwise. */ > +static struct pid *damon_get_pidfd_pid(unsigned int pidfd) > +{ > + struct fd f; > + struct pid *pid; > + > + f = fdget(pidfd); > + if (!f.file) > + return NULL; > + > + pid = pidfd_pid(f.file); > + if (!IS_ERR(pid)) > + get_pid(pid); > + else > + pid = NULL; > + > + fdput(f); > + return pid; > +} > + > +static ssize_t dbgfs_target_ids_write(struct file *file, > + const char __user *buf, size_t count, loff_t *ppos) > +{ > + struct damon_ctx *ctx = file->private_data; > + char *kbuf, *nrs; > + bool received_pidfds = false; > + unsigned long *targets; > + ssize_t nr_targets; > + ssize_t ret = count; > + int i; > + int err; > + > + kbuf = user_input_str(buf, count, ppos); > + if (IS_ERR(kbuf)) > + return PTR_ERR(kbuf); > + > + nrs = kbuf; > + > + if (!strncmp(kbuf, "pidfd ", 6)) { > + received_pidfds = true; I am inclining towards having simple pids instead of pidfds. Basically what cgroup/resctrl does. > + nrs = &kbuf[6]; > + } > + > + targets = str_to_target_ids(nrs, ret, &nr_targets); > + if (!targets) { > + ret = -ENOMEM; > + goto out; > + } > + > + if (received_pidfds) { > + for (i = 0; i < nr_targets; i++) > + targets[i] = (unsigned long)damon_get_pidfd_pid( > + (unsigned int)targets[i]); > + } else if (targetid_is_pid(ctx)) { > + for (i = 0; i < nr_targets; i++) > + targets[i] = (unsigned long)find_get_pid( > + (int)targets[i]); > + } > + > + mutex_lock(&ctx->kdamond_lock); > + if (ctx->kdamond) { > + ret = -EINVAL; > + goto unlock_out; > + } > + > + err = damon_set_targets(ctx, targets, nr_targets); Hmm this is leaking the references to the previous targets. > + if (err) > + ret = err; > +unlock_out: > + mutex_unlock(&ctx->kdamond_lock); > + kfree(targets); > +out: > + kfree(kbuf); > + return ret; > +} > + Still looking.