On Thu, 2021-10-28 at 15:25 +0100, Luís Henriques wrote: > On Thu, Oct 28, 2021 at 08:41:52AM -0400, Jeff Layton wrote: > > On Thu, 2021-10-28 at 12:48 +0100, Luís Henriques wrote: > > > This patch adds latency and size metrics for remote object copies > > > operations ("copyfrom"). For now, these metrics will be available on the > > > client only, they won't be sent to the MDS. > > > > > > Cc: Patrick Donnelly <pdonnell@xxxxxxxxxx> > > > Signed-off-by: Luís Henriques <lhenriques@xxxxxxx> > > > --- > > > This patch is still an RFC because it is... ugly. Although it now > > > provides nice values (latency and size) using the metrics infrastructure, > > > it actually needs to extend the ceph_osdc_copy_from() function to add 2 > > > extra args! That's because we need to get the timestamps stored in > > > ceph_osd_request, which is handled within that function. > > > > > > The alternative is to ignore those timestamps and collect new ones in > > > ceph_do_objects_copy(): > > > > > > start_req = ktime_get(); > > > ceph_osdc_copy_from(...); > > > end_req = ktime_get(); > > > > > > These would be more coarse-grained, of course. Any other suggestions? > > > > > > > Not really. It is definitely ugly, I'll grant you that though... > > > > The cleaner method might be to just inline ceph_osdc_copy_from in > > ceph_do_objects_copy so that you deal with the req in there. > > Yeah, but the reason for having these 2 functions was to keep net/ceph/ > code free from cephfs-specific code. Inlining ceph_osdc_copy_from would > need to bring some extra FS knowledge into libceph.ko. Right now the > funcion in osd_client receives only the required args for doing a copyfrom > operation. (But TBH it's possible that libceph already contains several > bits that are cephfs or rbd specific.) > Oh, I was more just suggesting that you just copy the guts out of ceph_osdc_copy_from() and paste them into the only caller (ceph_do_objects_copy). That would give you access to the OSD req field in ceph_do_objects_copy and you could just copy the appropriate fields there. > However, I just realized that I do have some code here that changes > ceph_osdc_copy_from() to return the OSD req struct. The caller would then > be responsible for doing the ceph_osdc_wait_request(). This code was from > my copy_file_range parallelization patch (which I should revisit one of > these days), but could be reused here. Do you think it would be > acceptable? > Yeah, that would work too. The problem you have is that the OSD request is driven by ceph_osdc_copy_from, and you probably want to do that in ceph_do_objects_copy instead so you can get to the timestamp fields. > <...> > > > + spinlock_t copyfrom_metric_lock; > > > + u64 total_copyfrom; > > > + u64 copyfrom_size_sum; > > > + u64 copyfrom_size_min; > > > + u64 copyfrom_size_max; > > > + ktime_t copyfrom_latency_sum; > > > + ktime_t copyfrom_latency_sq_sum; > > > + ktime_t copyfrom_latency_min; > > > + ktime_t copyfrom_latency_max; > > > + > > > > Not a comment about your patch, specifically, but we have a lot of > > copy/pasted code to deal with different parts of ceph_client_metric. > > > > It might be nice to eventually turn each of the read/write/copy metric > > blocks in this struct into an array, and collapse a lot of the other > > helper functions together. > > > > If you feel like doing that cleanup, I'd be happy to review. Otherwise, > > I'll plan to look at it in the near future. > > Yeah, sure. I can have a look at that too. > > Cheers, > -- > Luís -- Jeff Layton <jlayton@xxxxxxxxxx>