Re: [PATCH 1/2] ceph: periodically send perf metrics to ceph

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2020/6/16 21:40, Jeff Layton wrote:
On Tue, 2020-06-16 at 08:52 -0400,xiubli@xxxxxxxxxx  wrote:
From: Xiubo Li<xiubli@xxxxxxxxxx>

This will send the caps/read/write/metadata metrics to any available
MDS only once per second as default, which will be the same as the
userland client, or every metric_send_interval seconds, which is a
module parameter.

URL:https://tracker.ceph.com/issues/43215
Signed-off-by: Xiubo Li<xiubli@xxxxxxxxxx>
---
  fs/ceph/mds_client.c         |  46 +++++++++------
  fs/ceph/mds_client.h         |   4 ++
  fs/ceph/metric.c             | 133 +++++++++++++++++++++++++++++++++++++++++++
  fs/ceph/metric.h             |  78 +++++++++++++++++++++++++
  fs/ceph/super.c              |  29 ++++++++++
  include/linux/ceph/ceph_fs.h |   1 +
  6 files changed, 274 insertions(+), 17 deletions(-)

Long patch! Might not hurt to break out some of the cleanups into
separate patches, but they're fairly straighforward so I won't require
it.

Sure, I will split the patch into small ones if possible.


[...]

  /* This is the global metrics */
  struct ceph_client_metric {
  	atomic64_t            total_dentries;
@@ -35,8 +100,21 @@ struct ceph_client_metric {
  	ktime_t metadata_latency_sq_sum;
  	ktime_t metadata_latency_min;
  	ktime_t metadata_latency_max;
+
+	struct delayed_work delayed_work;  /* delayed work */
  };
+static inline void metric_schedule_delayed(struct ceph_client_metric *m)
+{
+	/* per second as default */
+	unsigned int hz = round_jiffies_relative(HZ * metric_send_interval);
+
+	if (!metric_send_interval)
+		return;
+
+	schedule_delayed_work(&m->delayed_work, hz);
+}
+
Should this also be gated on a MDS feature bit or anything? What happens
if we're running against a really old MDS that doesn't support these
stats? Does it just drop them on the floor? Should we disable sending
them in that case?

Tested without metric code in the ceph and when ceph saw a unknown type message, it will close the socket connection directly.


  extern int ceph_metric_init(struct ceph_client_metric *m);
  extern void ceph_metric_destroy(struct ceph_client_metric *m);
diff --git a/fs/ceph/super.c b/fs/ceph/super.c
index c9784eb1..66a940c 100644
--- a/fs/ceph/super.c
+++ b/fs/ceph/super.c
@@ -1282,6 +1282,35 @@ static void __exit exit_ceph(void)
  	destroy_caches();
  }
+static int param_set_metric_interval(const char *val, const struct kernel_param *kp)
+{
+	int ret;
+	unsigned int interval;
+
+	ret = kstrtouint(val, 0, &interval);
+	if (ret < 0) {
+		pr_err("Failed to parse metric interval '%s'\n", val);
+		return ret;
+	}
+
+	if (interval > 5 || interval < 1) {
+		pr_err("Invalid metric interval %u\n", interval);
+		return -EINVAL;
+	}
+
+	metric_send_interval = interval;
+	return 0;
+}
+
+static const struct kernel_param_ops param_ops_metric_interval = {
+	.set = param_set_metric_interval,
+	.get = param_get_uint,
+};
+
+unsigned int metric_send_interval = 1;
+module_param_cb(metric_send_interval, &param_ops_metric_interval, &metric_send_interval, 0644);
+MODULE_PARM_DESC(metric_send_interval, "Interval (in seconds) of sending perf metric to ceph cluster, valid values are 1~5 (default: 1)");
+
Aren't valid values 0-5, with 0 disabling this feature? That's
what metric_schedule_delayed() seems to indicate...

What value should it be as default ? 0 to disable it or 1 ?

Maybe in future we should let the ceph side send a notification/request to all the clients if it wants to collect the metrics and then we could enable it here, and defautly just disable it.


  module_init(init_ceph);
  module_exit(exit_ceph);
diff --git a/include/linux/ceph/ceph_fs.h b/include/linux/ceph/ceph_fs.h
index ebf5ba6..455e9b9 100644
--- a/include/linux/ceph/ceph_fs.h
+++ b/include/linux/ceph/ceph_fs.h
@@ -130,6 +130,7 @@ struct ceph_dir_layout {
  #define CEPH_MSG_CLIENT_REQUEST         24
  #define CEPH_MSG_CLIENT_REQUEST_FORWARD 25
  #define CEPH_MSG_CLIENT_REPLY           26
+#define CEPH_MSG_CLIENT_METRICS         29
  #define CEPH_MSG_CLIENT_CAPS            0x310
  #define CEPH_MSG_CLIENT_LEASE           0x311
  #define CEPH_MSG_CLIENT_SNAP            0x312





[Index of Archives]     [CEPH Users]     [Ceph Large]     [Ceph Dev]     [Information on CEPH]     [Linux BTRFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux