The following patches are a cleaned-up version of the work Marios Kogias
first posted in August:

  http://www.spinics.net/lists/ceph-devel/msg19890.html

The changes have been made against Ceph 0.80.1 and will be moved forward
soon. With them, Ceph can use Blkin, a library created by Marios Kogias and
others, which makes it possible to track a specific request from the time it
enters the system at the higher layers until it is finally served by RADOS.

In general, Blkin implements the tracing semantics described in the Dapper
paper

  http://static.googleusercontent.com/media/research.google.com/el/pubs/archive/36356.pdf

in order to trace the causal relationships between the different processing
phases that an IO request may trigger (a short sketch of these parent/child
spans is at the end of this mail). The goal is an end-to-end visualisation of
the request's route through the system, accompanied by information about the
latency of each processing phase. Thanks to LTTng this can happen with
minimal overhead and in real time. To visualize the results, Blkin was
integrated with Twitter's Zipkin

  http://twitter.github.io/zipkin/

which is a tracing system based entirely on Dapper.

These patches can also be found in

  https://github.com/agshew/ceph/tree/wip-blkin

In addition to the cleanup, I've written a short document describing how to
test Blkin tracing in Ceph (without Zipkin). See doc/dev/trace.rst.

Note that I have a question in to Marios concerning a compiler warning for
ignoring the return value of write() in Message::init_trace_info(). The same
calls also use a hardcoded file descriptor 3. I'm guessing this code was only
used by him for debugging Blkin and can be removed, but I've left it in for
the moment.

In the immediate future I plan to:

- push a wip-blkin branch to github.com/ceph and take advantage of
  gitbuilder and the test/qa infrastructure
- move the changes forward to ceph:master
- add Andreas' tracepoints (https://github.com/ceph/ceph/pull/2877) using
  Blkin and investigate how easy it is to select the level of tracing detail

Questions:

1. Did I split the patches into sensible groups?

2. How low is LTTng's overhead? Is it entirely eliminated when tracing is
   not enabled? Do we need to take advantage of something like the Linux
   kernel's CONFIG_DYNAMIC_FTRACE trick, where a special mcount() call site
   is patched back and forth between a NOP and a trace call? See
   http://lwn.net/Articles/365835/ for a little more detail. (See also the
   tracepoint sketch at the end of this mail.)

3. Also on the topic of performance, does the API for adding key-value
   annotations need variants that use tracing functions with vectorized
   arguments? For instance, when many details about an event are required
   (e.g. read vs. write, length, etc.), or when multiple types of events are
   created simultaneously? (One possible shape is sketched at the end of
   this mail.)
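As a concrete illustration of the Dapper-style semantics mentioned above,
here is a minimal sketch of how parent/child spans tie the processing phases
of one request together. The names (Span, new_trace, child_of) are purely
illustrative and are not Blkin's actual API; they only show the
(trace id, span id, parent id) triple that Zipkin later uses to reassemble
the end-to-end picture.

// Illustrative only: Dapper-style spans, not Blkin's real interface.
#include <cstdint>
#include <random>
#include <string>

struct Span {
  uint64_t trace_id;    // shared by every span of one client request
  uint64_t span_id;     // unique per processing phase
  uint64_t parent_id;   // 0 for the root span
  std::string service;  // e.g. "client", "osd", "filestore"
};

static uint64_t random_id() {
  static std::mt19937_64 rng{std::random_device{}()};
  return rng();
}

// Root span: created where the request enters the system.
static Span new_trace(const std::string &service) {
  return Span{random_id(), random_id(), 0, service};
}

// Child span: created by the next processing phase (e.g. when the request
// crosses from the messenger into the OSD).  The trace id is inherited so
// all spans of the request can later be joined into one trace.
static Span child_of(const Span &parent, const std::string &service) {
  return Span{parent.trace_id, random_id(), parent.span_id, service};
}

A request entering through librados would carry the root span, each hop
creates a child, and the visualisation is just the tree implied by the
parent ids plus per-span timestamps.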
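On question 2, my current understanding (please correct me) is that a
userspace LTTng tracepoint which no session has enabled costs roughly a load
of a per-event state flag plus a predicted branch, rather than being patched
to a NOP the way CONFIG_DYNAMIC_FTRACE rewrites mcount() call sites. For
reference, this is the kind of provider definition involved; the provider
and event names here are examples of mine, not the ones the patches define:

/*
 * tp.h - example LTTng-UST provider for a Blkin-style key/value event.
 * Provider/event names are illustrative, not what the patches install.
 */
#undef TRACEPOINT_PROVIDER
#define TRACEPOINT_PROVIDER blkin_demo

#undef TRACEPOINT_INCLUDE
#define TRACEPOINT_INCLUDE "./tp.h"

#if !defined(_TP_H) || defined(TRACEPOINT_HEADER_MULTI_READ)
#define _TP_H

#include <lttng/tracepoint.h>

TRACEPOINT_EVENT(
    blkin_demo,                 /* provider */
    keyval,                     /* event    */
    TP_ARGS(const char *, trace_name,
            const char *, key,
            const char *, val),
    TP_FIELDS(
        ctf_string(trace_name, trace_name)
        ctf_string(key, key)
        ctf_string(val, val)
    )
)

#endif /* _TP_H */

#include <lttng/tracepoint-event.h>

One translation unit defines TRACEPOINT_DEFINE and TRACEPOINT_CREATE_PROBES
before including this header, the binary links against lttng-ust, and
instrumented code calls tracepoint(blkin_demo, keyval, "osd_op", "op",
"read"). Whether that per-call check is cheap enough for the hot paths is
exactly what I'd like to measure.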
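On question 3, to make the "vectorized arguments" idea concrete, here is one
possible shape, again purely hypothetical and not taken from Blkin: a helper
that records several key/value annotations against one span in a single
call, with record_keyval() standing in for whatever single-pair primitive
the tracing layer exposes (here it just prints):

#include <cstdio>
#include <initializer_list>
#include <string>
#include <utility>

// Stripped-down span, as in the earlier sketch.
struct Span { unsigned long trace_id, span_id, parent_id; };

// Single-pair primitive (stand-in: the real thing would emit a trace event).
static void record_keyval(const Span &s, const std::string &key,
                          const std::string &val) {
  std::fprintf(stderr, "trace %lu span %lu: %s=%s\n",
               s.trace_id, s.span_id, key.c_str(), val.c_str());
}

// Batched form: one call site annotates an event with all of its details.
static void record_keyvals(
    const Span &s,
    std::initializer_list<std::pair<std::string, std::string>> kvs) {
  for (const auto &kv : kvs)
    record_keyval(s, kv.first, kv.second);
}

int main() {
  Span osd_op{1, 2, 1};
  // e.g. read vs. write, length and object name recorded in one shot
  record_keyvals(osd_op, {{"op", "write"}, {"len", "4096"}, {"oid", "foo"}});
  return 0;
}

Whether something like this belongs in the annotation API, or whether
callers should simply emit one tracepoint per key, is the performance
question I'm asking.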