You might want to take a look at the Zipkin tracing hooks that are
(semi)integrated into Ceph [1]. The hooks are disabled by default in
release builds so you would need to rebuild Ceph yourself and then
enable tracing via the 'rbd_blkin_trace_all = true' configuration
option [2].

[1] http://victoraraujo.me/babeltrace-zipkin/
[2] https://github.com/ceph/ceph/blob/master/src/common/options.cc#L6275

On Tue, Apr 3, 2018 at 1:19 PM, Alex Gorbachev <ag@xxxxxxxxxxxxxxxxxxx> wrote:
> I was wondering if there is a mechanism to instrument an RBD workload to
> elucidate what takes place on OSDs to troubleshoot performance issues
> better.
>
> Currently, we can issue the RBD IO, such as via fio, and observe just the
> overall performance. One needs to guess what OSDs that hits and try to find
> from dump historic ops what is the bottleneck.
>
> It seems that integrating the timings into some sort of a debug flag for rbd
> bench or fio would help a lot of us locate bottlenecks faster.
>
> Thanks,
> Alex
>
>
> --
> --
> Alex Gorbachev
> Storcium
>
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>

--
Jason
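
For reference, a minimal sketch of how the option mentioned above might be
set on the client side. The option name is the one given in the reply; placing
it in the [client] section of ceph.conf on the host running fio or rbd bench
is an assumption, and it should only have an effect on a build compiled with
the tracing hooks.

```ini
# ceph.conf on the client host (section placement is an assumption)
[client]
# Option name as given in the reply above; has no effect on stock
# release builds, where the tracing hooks are compiled out.
rbd_blkin_trace_all = true
```

The client process (fio, rbd bench, etc.) would need to be restarted after
the change so that librbd picks up the new setting.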