On Mon, May 09, 2022 at 11:38:20AM -0700, Roman Gushchin wrote:
> Add a scan interface which allows to trigger scanning of a particular
> shrinker and specify memcg and numa node. It's useful for testing,
> debugging and profiling of a specific scan_objects() callback.
> Unlike alternatives (creating a real memory pressure and dropping
> caches via /proc/sys/vm/drop_caches) this interface allows to interact
> with only one shrinker at once. Also, if a shrinker is misreporting
> the number of objects (as some do), it doesn't affect scanning.
>
> Signed-off-by: Roman Gushchin <roman.gushchin@xxxxxxxxx>
> ---
>  .../admin-guide/mm/shrinker_debugfs.rst | 39 +++++++++-
>  mm/shrinker_debug.c                     | 73 +++++++++++++++++++
>  2 files changed, 108 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/admin-guide/mm/shrinker_debugfs.rst b/Documentation/admin-guide/mm/shrinker_debugfs.rst
> index 6783f8190e63..8fecf81d60ee 100644
> --- a/Documentation/admin-guide/mm/shrinker_debugfs.rst
> +++ b/Documentation/admin-guide/mm/shrinker_debugfs.rst
> @@ -5,14 +5,16 @@ Shrinker Debugfs Interface
>  ==========================
>
>  Shrinker debugfs interface provides a visibility into the kernel memory
> -shrinkers subsystem and allows to get information about individual shrinkers.
> +shrinkers subsystem and allows to get information about individual shrinkers
> +and interact with them.
>
>  For each shrinker registered in the system a directory in **<debugfs>/shrinker/**
>  is created. The directory's name is composed from the shrinker's name and an
>  unique id: e.g. *kfree_rcu-0* or *sb-xfs:vda1-36*.
>
> -Each shrinker directory contains the **count** file, which allows to trigger
> -the *count_objects()* callback for each memcg and numa node (if applicable).
> +Each shrinker directory contains **count** and **scan** files, which allow to
> +trigger *count_objects()* and *scan_objects()* callbacks for each memcg and
> +numa node (if applicable).
>
>  Usage:
>  ------
> @@ -43,7 +45,7 @@ Usage:
>
>      $ cd sb-btrfs\:vda2-24/
>      $ ls
> -    count
> +    count scan
>
>  3. *Count objects*
>
> @@ -98,3 +100,32 @@ Usage:
>      2877 84 0
>      293 1 0
>      735 8 0
> +
> +4. *Scan objects*
> +
> +  The expected input format::
> +
> +    <cgroup inode id> <numa id> <number of objects to scan>
> +
> +  For a non-memcg-aware shrinker or on a system with no memory
> +  cgrups **0** should be passed as cgroup id.
> +  ::
> +
> +    $ cd /sys/kernel/debug/shrinker/
> +    $ cd sb-btrfs\:vda2-24/
> +
> +    $ cat count | head -n 5
> +    1 212 0
> +    21 97 0
> +    55 802 5
> +    2367 2 0
> +    225 13 0
> +
> +    $ echo "55 0 200" > scan
> +
> +    $ cat count | head -n 5
> +    1 212 0
> +    21 96 0
> +    55 752 5
> +    2367 2 0
> +    225 13 0
> diff --git a/mm/shrinker_debug.c b/mm/shrinker_debug.c
> index 28b1c1ab60ef..8f67fef5a643 100644
> --- a/mm/shrinker_debug.c
> +++ b/mm/shrinker_debug.c
> @@ -101,6 +101,77 @@ static int shrinker_debugfs_count_show(struct seq_file *m, void *v)
>  }
>  DEFINE_SHOW_ATTRIBUTE(shrinker_debugfs_count);
>
> +static int shrinker_debugfs_scan_open(struct inode *inode, struct file *file)
> +{
> +	file->private_data = inode->i_private;
> +	return nonseekable_open(inode, file);
> +}
> +
> +static ssize_t shrinker_debugfs_scan_write(struct file *file,
> +					   const char __user *buf,
> +					   size_t size, loff_t *pos)
> +{
> +	struct shrinker *shrinker = (struct shrinker *)file->private_data;

It seems we could drop the cast, since ->private_data is of type void *.

> +	unsigned long nr_to_scan = 0, ino;
> +	struct shrink_control sc = {
> +		.gfp_mask = GFP_KERNEL,
> +	};
> +	struct mem_cgroup *memcg = NULL;
> +	int nid;
> +	char kbuf[72];
> +	int read_len = size < (sizeof(kbuf) - 1) ? size : (sizeof(kbuf) - 1);
> +	ssize_t ret;
> +
> +	if (copy_from_user(kbuf, buf, read_len))
> +		return -EFAULT;
> +	kbuf[read_len] = '\0';
> +
> +	if (sscanf(kbuf, "%lu %d %lu", &ino, &nid, &nr_to_scan) < 2)
> +		return -EINVAL;
> +
> +	if (nid < 0 || nid >= nr_node_ids)
> +		return -EINVAL;
> +

Should we return early here if nr_to_scan is zero?

> +	if (shrinker->flags & SHRINKER_MEMCG_AWARE) {
> +		memcg = mem_cgroup_get_from_ino(ino);
> +		if (!memcg || IS_ERR(memcg))

Should we drop the "!memcg" check, since mem_cgroup_get_from_ino()
cannot return NULL?

> +			return -ENOENT;
> +
> +		if (!mem_cgroup_online(memcg)) {
> +			mem_cgroup_put(memcg);
> +			return -ENOENT;
> +		}
> +	} else {
> +		if (ino != 0)
> +			return -EINVAL;
> +		memcg = NULL;

IIUC, memcg is already NULL if we reach here, right? Then the
assignment is not necessary. Or we could remove the initialization
of 'memcg' where it is defined.

> +	}
> +
> +	ret = down_read_killable(&shrinker_rwsem);
> +	if (ret) {
> +		mem_cgroup_put(memcg);
> +		return ret;
> +	}
> +
> +	sc.nid = nid;
> +	sc.memcg = memcg;
> +	sc.nr_to_scan = nr_to_scan;
> +	sc.nr_scanned = nr_to_scan;
> +
> +	shrinker->scan_objects(shrinker, &sc);
> +
> +	up_read(&shrinker_rwsem);
> +	mem_cgroup_put(memcg);
> +
> +	return ret ? ret : size;

It seems "ret" is always 0 here. Should we simplify this to
"return size"?

Thanks.

> +}
> +
> +static const struct file_operations shrinker_debugfs_scan_fops = {
> +	.owner	 = THIS_MODULE,
> +	.open	 = shrinker_debugfs_scan_open,
> +	.write	 = shrinker_debugfs_scan_write,
> +};
> +
>  int shrinker_debugfs_add(struct shrinker *shrinker)
>  {
>  	struct dentry *entry;
> @@ -130,6 +201,8 @@ int shrinker_debugfs_add(struct shrinker *shrinker)
>
>  	debugfs_create_file("count", 0220, entry, shrinker,
>  			    &shrinker_debugfs_count_fops);
> +	debugfs_create_file("scan", 0440, entry, shrinker,
> +			    &shrinker_debugfs_scan_fops);
>  	return 0;
>  }
>
> --
> 2.35.3
>
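
To illustrate the nr_to_scan question: because the sscanf() result is only checked against "< 2", an input such as "55 0" passes parsing and leaves nr_to_scan at its initial value of 0. Below is a minimal user-space sketch of a stricter parsing step. It is not kernel code; struct scan_request and parse_scan_request() are hypothetical names used only for illustration, and rejecting a zero count (rather than treating it as a silent no-op) is an assumption, not something the patch decides.

```c
#include <stdio.h>

/* Hypothetical user-space stand-in for the three input fields. */
struct scan_request {
	unsigned long ino;        /* cgroup inode id */
	int nid;                  /* numa node id */
	unsigned long nr_to_scan; /* number of objects to scan */
};

/*
 * Parse "<cgroup inode id> <numa id> <number of objects to scan>".
 * Returns 0 on success and -1 on malformed or no-op input.
 * nr_node_ids is passed in here; in the kernel it is a global.
 */
static int parse_scan_request(const char *kbuf, int nr_node_ids,
			      struct scan_request *req)
{
	req->ino = 0;
	req->nid = 0;
	req->nr_to_scan = 0;

	/* Require all three fields; the quoted patch accepts two. */
	if (sscanf(kbuf, "%lu %d %lu", &req->ino, &req->nid,
		   &req->nr_to_scan) != 3)
		return -1;

	if (req->nid < 0 || req->nid >= nr_node_ids)
		return -1;

	/* Assumption: a zero count is rejected instead of being ignored. */
	if (req->nr_to_scan == 0)
		return -1;

	return 0;
}
```

With this sketch, "55 0 200" is accepted, while "55 0" and "55 0 0" are both rejected before any shrinker callback would be invoked.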