On Wed, 2020-06-17 at 02:14 +0800, Xiubo Li wrote:
> On 2020/6/16 21:40, Jeff Layton wrote:
> > On Tue, 2020-06-16 at 08:52 -0400, xiubli@xxxxxxxxxx wrote:
> > > From: Xiubo Li <xiubli@xxxxxxxxxx>
> > >
> > > This will send the caps/read/write/metadata metrics to any available
> > > MDS only once per second as default, which will be the same as the
> > > userland client, or every metric_send_interval seconds, which is a
> > > module parameter.
> > >
> > > URL: https://tracker.ceph.com/issues/43215
> > > Signed-off-by: Xiubo Li <xiubli@xxxxxxxxxx>
> > > ---
> > >  fs/ceph/mds_client.c         |  46 +++++++++------
> > >  fs/ceph/mds_client.h         |   4 ++
> > >  fs/ceph/metric.c             | 133 +++++++++++++++++++++++++++++++++++++++++++
> > >  fs/ceph/metric.h             |  78 +++++++++++++++++++++++++
> > >  fs/ceph/super.c              |  29 ++++++++++
> > >  include/linux/ceph/ceph_fs.h |   1 +
> > >  6 files changed, 274 insertions(+), 17 deletions(-)
> > >
> > Long patch! Might not hurt to break out some of the cleanups into
> > separate patches, but they're fairly straightforward so I won't require
> > it.
>
> Sure, I will split the patch into smaller ones if possible.
>
> > [...]
> >
> > >  /* This is the global metrics */
> > >  struct ceph_client_metric {
> > >  	atomic64_t	total_dentries;
> > > @@ -35,8 +100,21 @@ struct ceph_client_metric {
> > >  	ktime_t metadata_latency_sq_sum;
> > >  	ktime_t metadata_latency_min;
> > >  	ktime_t metadata_latency_max;
> > > +
> > > +	struct delayed_work delayed_work;  /* delayed work */
> > >  };
> > >
> > > +static inline void metric_schedule_delayed(struct ceph_client_metric *m)
> > > +{
> > > +	/* per second as default */
> > > +	unsigned int hz = round_jiffies_relative(HZ * metric_send_interval);
> > > +
> > > +	if (!metric_send_interval)
> > > +		return;
> > > +
> > > +	schedule_delayed_work(&m->delayed_work, hz);
> > > +}
> > > +
> > Should this also be gated on an MDS feature bit or anything? What happens
> > if we're running against a really old MDS that doesn't support these
> > stats? Does it just drop them on the floor? Should we disable sending
> > them in that case?
>
> I tested against a ceph cluster without the metric code, and when ceph
> sees an unknown message type it closes the socket connection directly.
>
Ouch, that sounds bad. How does the userland client handle this? This
seems like the kind of thing we usually add a feature bit for
somewhere...
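
Something along these lines is what I'm imagining. This is just an
untested sketch: CEPHFS_FEATURE_METRIC_COLLECT is a made-up name for
whatever bit the MDS would end up advertising, and I'm assuming it
lands in session->s_features the way the other feature bits do:

	static inline void metric_schedule_delayed(struct ceph_client_metric *m,
						   struct ceph_mds_session *session)
	{
		if (!metric_send_interval)
			return;

		/*
		 * Don't send metrics to an MDS that won't understand them.
		 * CEPHFS_FEATURE_METRIC_COLLECT is hypothetical here --
		 * whatever bit the MDS ends up advertising.
		 */
		if (!test_bit(CEPHFS_FEATURE_METRIC_COLLECT, &session->s_features))
			return;

		schedule_delayed_work(&m->delayed_work,
				      round_jiffies_relative(HZ * metric_send_interval));
	}

That way an old MDS that doesn't advertise the bit never sees a
CEPH_MSG_CLIENT_METRICS message at all, and we avoid the connection
getting dropped.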
> > >
> > >  extern int ceph_metric_init(struct ceph_client_metric *m);
> > >  extern void ceph_metric_destroy(struct ceph_client_metric *m);
> > >
> > > diff --git a/fs/ceph/super.c b/fs/ceph/super.c
> > > index c9784eb1..66a940c 100644
> > > --- a/fs/ceph/super.c
> > > +++ b/fs/ceph/super.c
> > > @@ -1282,6 +1282,35 @@ static void __exit exit_ceph(void)
> > >  	destroy_caches();
> > >  }
> > >
> > > +static int param_set_metric_interval(const char *val, const struct kernel_param *kp)
> > > +{
> > > +	int ret;
> > > +	unsigned int interval;
> > > +
> > > +	ret = kstrtouint(val, 0, &interval);
> > > +	if (ret < 0) {
> > > +		pr_err("Failed to parse metric interval '%s'\n", val);
> > > +		return ret;
> > > +	}
> > > +
> > > +	if (interval > 5 || interval < 1) {
> > > +		pr_err("Invalid metric interval %u\n", interval);
> > > +		return -EINVAL;
> > > +	}
> > > +
> > > +	metric_send_interval = interval;
> > > +	return 0;
> > > +}
> > > +
> > > +static const struct kernel_param_ops param_ops_metric_interval = {
> > > +	.set = param_set_metric_interval,
> > > +	.get = param_get_uint,
> > > +};
> > > +
> > > +unsigned int metric_send_interval = 1;
> > > +module_param_cb(metric_send_interval, &param_ops_metric_interval, &metric_send_interval, 0644);
> > > +MODULE_PARM_DESC(metric_send_interval, "Interval (in seconds) of sending perf metric to ceph cluster, valid values are 1~5 (default: 1)");
> > > +
> > Aren't valid values 0-5, with 0 disabling this feature? That's
> > what metric_schedule_delayed() seems to indicate...
>
> What value should it be by default? 0 to disable it, or 1?
>
I don't have a strong preference here. It's probably safe enough to make
it 1. I do like the ability to turn this off though, as that gives us a
workaround in the event that it does cause trouble.

> Maybe in the future we should let the ceph side send a
> notification/request to all the clients if it wants to collect the
> metrics, and then we could enable it here, and just disable it by
> default.
>
That seems like a heavier-weight solution than is called for here. This
seems like something that ought to have a feature bit. Then if that's
turned off, we can just disable this altogether.

> > >  module_init(init_ceph);
> > >  module_exit(exit_ceph);
> > >
> > > diff --git a/include/linux/ceph/ceph_fs.h b/include/linux/ceph/ceph_fs.h
> > > index ebf5ba6..455e9b9 100644
> > > --- a/include/linux/ceph/ceph_fs.h
> > > +++ b/include/linux/ceph/ceph_fs.h
> > > @@ -130,6 +130,7 @@ struct ceph_dir_layout {
> > >  #define CEPH_MSG_CLIENT_REQUEST           24
> > >  #define CEPH_MSG_CLIENT_REQUEST_FORWARD   25
> > >  #define CEPH_MSG_CLIENT_REPLY             26
> > > +#define CEPH_MSG_CLIENT_METRICS           29
> > >  #define CEPH_MSG_CLIENT_CAPS              0x310
> > >  #define CEPH_MSG_CLIENT_LEASE             0x311
> > >  #define CEPH_MSG_CLIENT_SNAP              0x312
-- 
Jeff Layton <jlayton@xxxxxxxxxx>