<p>Hi Michel,</p><p>Thanks for the detailed answer! DBI tools are really interesting, but I want to do this during normal execution and on multiple programs running simultaneously. In other words, this is not meant to be conventional tracing with multiple re-executions. I want to extract some information about the execution state at runtime and pass it to the lower levels of the software stack so that they can make smarter choices. Fortunately, only a few functions need to be traced. Still, any reduction in wasted cycles is helpful, especially when they are caused by privilege-level transitions.</p>
<p>Regards.</p> <p> </p> <p>On 2020-07-16 05:36, Michel Dagenais wrote:</p><blockquote>
<div class="pre"><br /><blockquote>Without recompiling, how would that be implemented?</blockquote> <br /> As you mentioned, this is possible when "jump patching" 5-byte instructions. Fast tracepoints in GDB and in kprobe do it. Kprobe goes further and patches sequences of instructions (when the target instruction is shorter than 5 bytes), provided there is no incoming branch into the middle of the sequence. You can go even further, for instance by using 3-byte jumps to a trampoline installed in alignment nops. If you combine different strategies like this, you can eventually reach an almost 100% success rate for "jump patching" tracepoints. This gets quite hairy, though. The short story is that, as far as I know, there is currently no tool that does this easily and reliably in user space.<br /><br /><a href="https://onlinelibrary.wiley.com/doi/abs/10.1002/spe.2746" target="_blank" rel="noopener noreferrer">https://onlinelibrary.wiley.com/doi/abs/10.1002/spe.2746</a><br /><a href="https://dl.acm.org/doi/pdf/10.1145/3062341.3062344" target="_blank" rel="noopener noreferrer">https://dl.acm.org/doi/pdf/10.1145/3062341.3062344</a><br /><br /> If you can afford a more invasive tool that requires a lot of memory and stops your application for quite some time, you can look at approaches like dyninst that decompile the binary, insert instrumentation code and reassemble the code.<br /><br /><a href="https://dyninst.org/" target="_blank" rel="noopener noreferrer">https://dyninst.org/</a><br /><br /> <blockquote>You would need to insert a jump on top of code, and still be able to<br /> preserve that code.
What a trap does is insert an int3, which<br /> traps into the kernel; the kernel then emulates the code that the int3 was<br /> placed on, and also calls some code that can trace the current state.<br /><br /> To do it in user land, you would need to find a way to replace the code<br /> at the location you want to trace with a jump to the tracing<br /> infrastructure, which must also be able to emulate the code that the<br /> jump was inserted on top of. On x86, that jump needs to be 5<br /> bytes long (covering 5 bytes of text to emulate), whereas an int3 is a<br /> single byte.<br /><br /> Thus, you either recompile and insert nops where you want to place your<br /> jumps, or you trap using int3, which can do the work from within the<br /> kernel.<br /><br /> -- Steve<br /> _______________________________________________<br /> lttng-dev mailing list<br /><a href="mailto:lttng-dev@xxxxxxxxxxxxxxx">lttng-dev@xxxxxxxxxxxxxxx</a><br /><a href="https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev" target="_blank" rel="noopener noreferrer">https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev</a></blockquote>
</div> </blockquote> <p> </p> <div id="_rc_sig"> </div>
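<p>For reference, the size difference discussed in the quoted thread (a 5-byte jump versus a 1-byte int3) can be illustrated with a minimal sketch. It assumes the standard x86 encodings (opcode 0xE9 followed by a 32-bit little-endian displacement for the near jump, 0xCC for int3). The function name and the addresses are hypothetical and only serve the illustration; this computes the bytes a patcher would write and does not patch any process.</p>

```python
JMP_REL32_OPCODE = 0xE9  # x86 near jump with a 32-bit relative displacement
INT3_OPCODE = 0xCC       # x86 one-byte breakpoint instruction
JMP_REL32_LEN = 5        # 1 opcode byte + 4 displacement bytes


def encode_jmp_rel32(probe_addr: int, trampoline_addr: int) -> bytes:
    """Return the 5 bytes of a `jmp rel32` placed at probe_addr that lands on trampoline_addr.

    Both addresses are hypothetical inputs chosen by the caller.
    """
    # The displacement is relative to the address of the instruction
    # that follows the jump, i.e. probe_addr + 5.
    rel = trampoline_addr - (probe_addr + JMP_REL32_LEN)
    # to_bytes raises OverflowError when the trampoline is outside the
    # signed 32-bit range reachable from the probe site.
    return bytes([JMP_REL32_OPCODE]) + rel.to_bytes(4, "little", signed=True)


if __name__ == "__main__":
    jmp = encode_jmp_rel32(0x401000, 0x402000)
    int3 = bytes([INT3_OPCODE])
    # The jump clobbers 5 bytes of original code, the trap only 1.
    print(jmp.hex(), len(jmp), len(int3))
```

<p>The 5 overwritten bytes are what must be relocated or emulated by the tracing infrastructure, which is why a probe site shorter than 5 bytes needs one of the extra strategies mentioned above.</p>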