Re: [RFC PATCH] ceph: add remote object copy counter to fs client

On Mon, Oct 25, 2021 at 06:20:40AM -0400, Jeff Layton wrote:
> On Mon, 2021-10-25 at 11:12 +0100, Luís Henriques wrote:
> > On Thu, Oct 21, 2021 at 12:35:18PM -0400, Jeff Layton wrote:
> > > On Thu, 2021-10-21 at 12:18 -0400, Patrick Donnelly wrote:
> > > > On Thu, Oct 21, 2021 at 11:44 AM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> > > > > 
> > > > > On Thu, 2021-10-21 at 09:52 -0400, Patrick Donnelly wrote:
> > > > > > On Wed, Oct 20, 2021 at 12:27 PM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> > > > > > > 
> > > > > > > On Wed, 2021-10-20 at 15:37 +0100, Luís Henriques wrote:
> > > > > > > > This counter will keep track of the number of remote object copies done by
> > > > > > > > copy_file_range syscalls.  The counter is kept per filesystem client, and
> > > > > > > > can be read from the client's debugfs directory.
> > > > > > > > 
> > > > > > > > Cc: Patrick Donnelly <pdonnell@xxxxxxxxxx>
> > > > > > > > Signed-off-by: Luís Henriques <lhenriques@xxxxxxx>
> > > > > > > > ---
> > > > > > > > This is an RFC to reply to Patrick's request in [0].  Note that I'm not
> > > > > > > > 100% sure about the usefulness of this patch, or if this is the best way
> > > > > > > > to provide the functionality Patrick requested.  Anyway, this is just to
> > > > > > > > get some feedback, hence the RFC.
> > > > > > > > 
> > > > > > > > Cheers,
> > > > > > > > --
> > > > > > > > Luís
> > > > > > > > 
> > > > > > > > [0] https://github.com/ceph/ceph/pull/42720
> > > > > > > > 
> > > > > > > 
> > > > > > > I think this would be better integrated into the stats infrastructure.
> > > > > > > 
> > > > > > > Maybe you could add a new set of "copy" stats to struct
> > > > > > > ceph_client_metric that tracks the total copy operations done, their
> > > > > > > size and latency (similar to read and write ops)?
> > > > > > 
> > > > > > I think it's a good idea to integrate this into "stats" but I think a
> > > > > > local debugfs file for some counters is still useful. The "stats"
> > > > > > module is immature at this time and I'd rather not build any qa tests
> > > > > > (yet) that rely on it.
> > > > > > 
> > > > > > Can we generalize this patch-set to a file named "op_counters" or
> > > > > > similar and additionally add other OSD ops performed by the kclient?
> > > > > > 
> > > > > 
> > > > > 
> > > > > Tracking this sort of thing is the main purpose of the stats code. I'm
> > > > > really not keen on adding a whole separate set of files for reporting
> > > > > this.
> > > > 
> > > > Maybe I'm confused. Is there some "file" which is already used for
> > > > this type of debugging information? Or do you mean the code for
> > > > sending stats to the MDS to support cephfs-top?
> > > > 
> > > > > What's the specific problem with relying on the data in debugfs
> > > > > "metrics" file?
> > > > 
> > > > Maybe no problem? I wasn't aware of a "metrics" file.
> > > > 
> > > 
> > > Yes. For instance:
> > > 
> > > # cat /sys/kernel/debug/ceph/*/metrics
> > > item                               total
> > > ------------------------------------------
> > > opened files  / total inodes       0 / 4
> > > pinned i_caps / total inodes       5 / 4
> > > opened inodes / total inodes       0 / 4
> > > 
> > > item          total       avg_lat(us)     min_lat(us)     max_lat(us)     stdev(us)
> > > -----------------------------------------------------------------------------------
> > > read          0           0               0               0               0
> > > write         5           914013          824797          1092343         103476
> > > metadata      79          12856           1572            114572          13262
> > > 
> > > item          total       avg_sz(bytes)   min_sz(bytes)   max_sz(bytes)  total_sz(bytes)
> > > ----------------------------------------------------------------------------------------
> > > read          0           0               0               0               0
> > > write         5           4194304         4194304         4194304         20971520
> > > 
> > > item          total           miss            hit
> > > -------------------------------------------------
> > > d_lease       11              0               29
> > > caps          5               68              10702
> > > 
> > > 
> > > I'm proposing that Luis add new lines for "copy" to go along with the
> > > "read" and "write" ones. The "total" counter should give you a count of
> > > the number of operations.
> > 
> > The problem with this is that it will require quite some work on the
> > MDS-side because, AFAIU, the MDS will need to handle different versions of
> > the CEPH_MSG_CLIENT_METRICS message (with and without the new copy-from
> > metrics).
> > 
> > Will this extra metric ever be useful on the MDS side?  From what I
> > understood Patrick's initial request was to have a way to find out, on the
> > client, if remote copies are really happening.  (*sigh* for not having
> > tracepoints.)
> > 
> > Anyway, I can look into adding this to the metrics infrastructure, but
> > it'll likely take me some more time to get to it and to figure out (once
> > again) how the message versioning works.
> > 
> 
> I think it is useful info to report to the MDS, but it's not required to
> send these to the MDS to solve the current problem. My suggestion would
> be to add what's needed to track these stats in the kclient and report
> them via debugfs, but don't send the info to the MDS just yet.
> 
> Later, we could extend the protocol with COPY stats, and add the
> necessary infrastructure to the MDS to deal with it. Once that's in
> place, we can then extend the kclient to start sending this info along
> when it reports the stats.

Awesome, that sounds good to me.  I'll look into re-writing this patch
following your suggestion.  Thanks!
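
For discussion, here is a rough user-space sketch of the bookkeeping this
implies: a "copy" counter that tracks the total number of operations plus
size and latency, and formats a row similar to the "read" and "write" rows
in the debugfs "metrics" output quoted above.  All names here (struct
copy_metric, copy_metric_update(), copy_metric_format_latency()) are
hypothetical and only illustrate the idea; this is not the actual layout of
struct ceph_client_metric, and the stdev column is left out for brevity.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical per-client "copy" metric (illustration only). */
struct copy_metric {
	uint64_t total;       /* number of remote object copies */
	uint64_t size_sum;    /* total bytes copied */
	uint64_t size_min;    /* smallest copy, in bytes */
	uint64_t size_max;    /* largest copy, in bytes */
	uint64_t lat_sum_us;  /* summed latency, in microseconds */
	uint64_t lat_min_us;  /* lowest latency seen */
	uint64_t lat_max_us;  /* highest latency seen */
};

static void copy_metric_init(struct copy_metric *m)
{
	m->total = 0;
	m->size_sum = 0;
	m->size_min = UINT64_MAX;
	m->size_max = 0;
	m->lat_sum_us = 0;
	m->lat_min_us = UINT64_MAX;
	m->lat_max_us = 0;
}

/* Account for one completed remote copy of 'size' bytes. */
static void copy_metric_update(struct copy_metric *m, uint64_t size,
			       uint64_t lat_us)
{
	m->total++;
	m->size_sum += size;
	if (size < m->size_min)
		m->size_min = size;
	if (size > m->size_max)
		m->size_max = size;
	m->lat_sum_us += lat_us;
	if (lat_us < m->lat_min_us)
		m->lat_min_us = lat_us;
	if (lat_us > m->lat_max_us)
		m->lat_max_us = lat_us;
}

/*
 * Format one latency row: item, total, avg, min, max.
 * Returns the snprintf() result; min is reported as 0 when no
 * copies have been recorded yet.
 */
static int copy_metric_format_latency(const struct copy_metric *m,
				      char *buf, size_t len)
{
	uint64_t avg = m->total ? m->lat_sum_us / m->total : 0;
	uint64_t min = m->total ? m->lat_min_us : 0;

	return snprintf(buf, len,
			"%-14s%-12" PRIu64 "%-16" PRIu64 "%-16" PRIu64
			"%-16" PRIu64 "\n",
			"copy", m->total, avg, min, m->lat_max_us);
}
```

In the kernel the updates would of course need proper locking or per-CPU
counters, and nothing here is sent to the MDS; the "total" column alone
would be enough to tell whether remote copies are happening.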

Cheers,
--
Luís


