On Thu, 2020-02-06 at 20:26 +0800, Xiubo Li wrote: > On 2020/2/6 19:31, Jeff Layton wrote: > > On Thu, 2020-02-06 at 10:36 +0800, Xiubo Li wrote: > > > On 2020/2/6 5:43, Jeff Layton wrote: > > > > On Wed, 2020-01-29 at 03:27 -0500, xiubli@xxxxxxxxxx wrote: > > > [...] > > > > > + > > > > > +static int sending_metrics_get(void *data, u64 *val) > > > > > +{ > > > > > + struct ceph_fs_client *fsc = (struct ceph_fs_client *)data; > > > > > + struct ceph_mds_client *mdsc = fsc->mdsc; > > > > > + > > > > > + mutex_lock(&mdsc->mutex); > > > > > + *val = (u64)mdsc->sending_metrics; > > > > > + mutex_unlock(&mdsc->mutex); > > > > > + > > > > > + return 0; > > > > > +} > > > > > +DEFINE_SIMPLE_ATTRIBUTE(sending_metrics_fops, sending_metrics_get, > > > > > + sending_metrics_set, "%llu\n"); > > > > > + > > > > I'd like to hear more about how we expect users to use this facility. > > > > This debugfs file doesn't seem consistent with the rest of the UI, and I > > > > imagine if the box reboots you'd have to (manually) re-enable it after > > > > mount, right? Maybe this should be a mount option instead? > > > A mount option means we must do the unmounting to disable it. > > > > > Technically, no. You could wire it up so that you could enable and > > disable it via -o remount. For example: > > > > # mount -o remount,metrics=disabled > > Yeah, this is cool. > > > Another option might be a module parameter if this is something that you > > really want to be global (and not per-mount or per-session). > > > > > I was thinking with the debugfs file we can do the debug or tuning even > > > in the product setups at any time, usually this should be disabled since > > > it will send it per second. > > > > > Meh, one frame per second doesn't seem like it'll add much overhead. > Okay. > > Also, why one update per second? Should that interval be tunable? > > Per second just keep it the same with the fuse client. > Ok. > > > > Or we could merge the "sending_metric" to "metrics" UI, just writing > > > "enable"/"disable" to enable/disable sending the metrics to ceph, and > > > just like the "reset" does to clean the metrics. > > > > > > Then the "/sys/kernel/debug/ceph/XXX.clientYYY/metrics" could be > > > writable with: > > > > > > "reset" --> to clean and reset the metrics counters > > > > > > "enable" --> enable sending metrics to ceph cluster > > > > > > "disable" --> disable sending metrics to ceph cluster > > > > > > Will this be better ? > > > > > I guess it's not clear to me how you intend for this to be used. > > > > A debugfs switch means that this is being enabled and disabled on a per- > > session basis. Is the user supposed to turn this on for all, or just one > > session? How do they know? > > Not for all, just per-superblock. > If it's per-superblock, then a debugfs-based switch seems particularly ill-suited for this, as that's really a per-session interface. > > Is this something we expect people to just turn on briefly when they are > > experiencing a problem, or is this something that we expect to be turned > > on and left on for long periods of time? > > If this won't add much overhead even per second, let's keep sending the > metrics to ceph always and the mount option for this switch is not > needed any more. > Note that I don't really _know_ that it won't be a problem, just that it doesn't sound too bad. I think we probably will want some mechanism to enable/disable this until we have some experience with it in the field. > And there is already a switch to enable/disable showing the metrics in > the ceph side, if here add another switch per client, it will be also > yucky for admins . > > Let's make the update interval tunable and per second as default. Maybe > we should make this as a global UI for all clients ? > If you want a global setting for the interval that would take effect on all ceph mounts, then maybe a "metric_send_interval" module parameter would be best. Make it an unsigned int, and allow the admin to set it to 0 to turn off stats transmission in the client. We have a well-defined interface for setting module parameters on most distros (via /etc/modprobe.d/), so that would be better than monkeying around with debugfs here, IMO. As to the default, it might be best to have this default to 0 initially. Once we have more experience with it we could make it default to 1 in a later release. -- Jeff Layton <jlayton@xxxxxxxxxx>