On Tue, Mar 1, 2022 at 11:46 AM Dave Chinner <david@xxxxxxxxxxxxx> wrote:
>
> On Mon, Feb 28, 2022 at 11:57:19PM +0200, Amir Goldstein wrote:
> > On Mon, Feb 28, 2022 at 11:11 PM Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> > >
> > > On Mon, Feb 28, 2022 at 01:39:08PM +0200, Amir Goldstein wrote:
> > > > Show optional collected per-mount io stats in /proc/<pid>/mountstats
> > > > for filesystems that do not implement their own show_stats() method
> > > > and opted-in to generic per-mount stats with FS_MOUNT_STATS flag.
> > > >
> > > > Signed-off-by: Amir Goldstein <amir73il@xxxxxxxxx>
> > > > ---
> > > >  fs/mount.h          |  1 +
> > > >  fs/namespace.c      |  2 ++
> > > >  fs/proc_namespace.c | 13 +++++++++++++
> > > >  3 files changed, 16 insertions(+)
> > > >
> > > > diff --git a/fs/mount.h b/fs/mount.h
> > > > index f98bf4cd5b1a..2ab6308af78b 100644
> > > > --- a/fs/mount.h
> > > > +++ b/fs/mount.h
> > > > @@ -91,6 +91,7 @@ struct mount {
> > > >       int mnt_id;                     /* mount identifier */
> > > >       int mnt_group_id;               /* peer group identifier */
> > > >       int mnt_expiry_mark;            /* true if marked for expiry */
> > > > +     time64_t mnt_time;              /* time of mount */
> > > >       struct hlist_head mnt_pins;
> > > >       struct hlist_head mnt_stuck_children;
> > > >  } __randomize_layout;
> > > > diff --git a/fs/namespace.c b/fs/namespace.c
> > > > index 3fb8f11a42a1..546f07ed44c5 100644
> > > > --- a/fs/namespace.c
> > > > +++ b/fs/namespace.c
> > > > @@ -220,6 +220,8 @@ static struct mount *alloc_vfsmnt(const char *name)
> > > >               mnt->mnt_count = 1;
> > > >               mnt->mnt_writers = 0;
> > > >  #endif
> > > > +             /* For proc/<pid>/mountstats */
> > > > +             mnt->mnt_time = ktime_get_seconds();
> > > >
> > > >               INIT_HLIST_NODE(&mnt->mnt_hash);
> > > >               INIT_LIST_HEAD(&mnt->mnt_child);
> > > > diff --git a/fs/proc_namespace.c b/fs/proc_namespace.c
> > > > index 49650e54d2f8..d744fb8543f5 100644
> > > > --- a/fs/proc_namespace.c
> > > > +++ b/fs/proc_namespace.c
> > > > @@ -232,6 +232,19 @@ static int show_vfsstat(struct seq_file *m, struct vfsmount *mnt)
> > > >       if (sb->s_op->show_stats) {
> > > >               seq_putc(m, ' ');
> > > >               err = sb->s_op->show_stats(m, mnt_path.dentry);
> > > > +     } else if (mnt_has_stats(mnt)) {
> > > > +             /* Similar to /proc/<pid>/io */
> > > > +             seq_printf(m, "\n"
> > > > +                        "\ttimes: %lld %lld\n"
> > > > +                        "\trchar: %lld\n"
> > > > +                        "\twchar: %lld\n"
> > > > +                        "\tsyscr: %lld\n"
> > > > +                        "\tsyscw: %lld\n",
> > > > +                        r->mnt_time, ktime_get_seconds(),
> > > > +                        mnt_iostats_counter_read(r, MNTIOS_CHARS_RD),
> > > > +                        mnt_iostats_counter_read(r, MNTIOS_CHARS_WR),
> > > > +                        mnt_iostats_counter_read(r, MNTIOS_SYSCALLS_RD),
> > > > +                        mnt_iostats_counter_read(r, MNTIOS_SYSCALLS_WR));
> > >
> > > This doesn't scale as {cpus, mounts, counters, read frequency}
> > > matrix explodes. Please iterate the per-mount per cpu counters
> > > once, adding up all counters in one pass to an array on stack, then
> > > print them all from the array.
> >
> > I am planning to move to per-sb iostats and was thinking of using
> > an array of 4 struct percpu_counter. That will make this sort of iteration
> > more challenging.
>
> No, it would get rid of it entirely. percpu_counter_read_positive()
> does not require any summing at all - that's a much better solution
> than a hand rolled set of percpu counters. Please do this.
>
> > Do you really think the read frequency of /proc/self/mountstats
> > warrants such performance optimization?
>
> We get bug reports every so often about the overhead of frequently
> summing per-cpu stats on large systems. Nothing ratelimits or
> restricts access to /proc/self/mountstats, so when you have a
> thousand CPUs and a million monkeys...
>
> Rule of thumb: don't do computationally expensive things to generate
> data for globally accessible sysfs files.
>
> > It's not like the case of the mighty struct xfsstats.
> > It is only going to fold 4 per cpu iterations into 1.
> > This doesn't look like a game changer to me.
> > Am I missing something?
>
> I'm just pointing out something we've had problems with in
> the past and are trying to help you avoid making the same mistakes.
> That's what reviewers are supposed to do, yes?

Yes, thank you :)

Amir.
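
For reference, the "one pass into an array on stack" idea from the first review comment can be sketched in plain userspace C. This is only an illustration under stated assumptions, not code from the patch: the per-CPU storage is simulated with an ordinary array, and every name here (NR_CPUS_SIM, struct iostats_pcpu, iostats_read_one(), iostats_sum_all()) is hypothetical. The per-counter helper walks all CPUs once per counter, so printing four counters costs four walks; the one-pass helper walks the CPUs once and fills all totals.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for kernel per-CPU data; not taken from the patch. */
#define NR_CPUS_SIM 4

enum {
	IOS_CHARS_RD,
	IOS_CHARS_WR,
	IOS_SYSCALLS_RD,
	IOS_SYSCALLS_WR,
	IOS_NR
};

/* One slot of counters per simulated CPU. */
struct iostats_pcpu {
	int64_t count[IOS_NR];
};

/* Per-counter read: a full walk over all CPUs for a single counter. */
static int64_t iostats_read_one(const struct iostats_pcpu *pcpu,
				size_t ncpus, int id)
{
	int64_t sum = 0;

	for (size_t cpu = 0; cpu < ncpus; cpu++)
		sum += pcpu[cpu].count[id];
	return sum;
}

/*
 * One-pass read: walk the CPUs once and accumulate every counter into
 * the caller's array (typically on the stack), then print from it.
 */
static void iostats_sum_all(const struct iostats_pcpu *pcpu, size_t ncpus,
			    int64_t totals[IOS_NR])
{
	for (int id = 0; id < IOS_NR; id++)
		totals[id] = 0;

	for (size_t cpu = 0; cpu < ncpus; cpu++)
		for (int id = 0; id < IOS_NR; id++)
			totals[id] += pcpu[cpu].count[id];
}
```

A caller would declare int64_t totals[IOS_NR] on its stack, call iostats_sum_all() once, and format all four values from the array. The percpu_counter approach suggested later in the thread avoids even this single walk on the read side, since percpu_counter_read_positive() returns the already-folded approximate count.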