On Wed, Sep 27, 2023 at 12:53 PM Ilya Dryomov <idryomov@xxxxxxxxx> wrote:
>
> > This "ceph" tool requires installing 90 MB of additional Debian
> > packages, which I just tried on a test cluster, and "ceph fs top"
> > fails with "Error initializing cluster client: ObjectNotFound('RADOS
> > object not found (error calling conf_read_file)')". Okay, so I have to
> > configure something.... but .... I don't get why I would want to do
> > that, when I can get the same information from the kernel without
> > installing or configuring anything. This sounds like overcomplexifying
> > the thing for no reason.
>
> I have relayed my understanding of this feature (or rather how it was
> presented to me). I see where you are coming from, so adding more
> CephFS folks to chime in.

Let me show these folks how badly "ceph fs perf stats" performs:

# time ceph fs perf stats
{"version": 2, "global_counters": ["cap_hit", "read_latency", "write_latency"[...]

real    0m0.502s
user    0m0.393s
sys     0m0.053s

Now my debugfs-based solution:

# time cat /sys/kernel/debug/ceph/*/metrics/latency
item          total       avg_lat(us)     min_lat(us)     max_lat(us)     stdev(us)
[...]

real    0m0.002s
user    0m0.002s
sys     0m0.001s

debugfs is more than 200 times faster. It is so fast that "time" can
hardly measure it - and most of those 2 ms are the overhead of
executing /bin/cat, not of actually reading the debugfs file.

Our kernel exporter is a daemon process; it needs only a single
pread() system call per iteration, so it has even less overhead.
Integrating the "ceph" tool instead would mean forking a process each
time, starting a new Python VM, and so on...

For obtaining real-time latency statistics, the "ceph" script is the
wrong tool for the job.
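For reference, here is a minimal C sketch of that read loop. The path,
buffer size, and one-second interval are illustrative only; a real
exporter would expand the wildcard to the concrete client instance and
parse the metrics instead of printing them:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* Illustrative path: the wildcard component must be replaced by
     * the actual <fsid>.<clientid> directory on a real system. */
    const char *path =
        "/sys/kernel/debug/ceph/FSID.CLIENTID/metrics/latency";
    char buf[4096];

    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    for (;;) {
        /* One pread() per iteration: re-read the file from offset 0
         * without re-opening it. */
        ssize_t n = pread(fd, buf, sizeof(buf) - 1, 0);
        if (n < 0) {
            perror("pread");
            break;
        }
        buf[n] = '\0';
        fputs(buf, stdout);  /* a real daemon would parse/export here */
        sleep(1);
    }

    close(fd);
    return 0;
}

Max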