[This discussion started on the systemtap@xxxxxxxxxxxxxx mailing list.]

> On Thu, 2007-08-02 at 14:19 -0700, Roland McGrath wrote:
> > > Why not just prevent the handler from returning until you want the
> > > thread to unblock? The handler could sleep on something trigger from
> > > user space.
> >
> > That is a no-no for a utrace callback. It prevents other tracing
> > facilities from doing anything useful.
>
> Yeah, you're pretty adamant about that in Documentation/utrace.txt, but
> I always wondered why. Using gdb, I can stop a traced app and keep it
> stopped all weekend. If nobody else is using the app, why should
> anybody care?

You can certainly keep the application stopped all year. You just have
to do it the right way, by setting QUIESCE.

What you can't do is hijack a user thread's scheduling context for your
own purposes. That is not friendly. It prevents any other tracing
engine (that is following the rules) from examining the thread, because
it is not properly quiescent, just blocked somewhere in the kernel. If
other tracing engines detach while you block, it prevents the data
structures being cleaned up, for the same reason. It prevents anyone
poking around the system with ps or whatnot from seeing a proper
"T (tracing stop)" status to indicate why that process has not
progressed all weekend. It breaks SIGKILL. If you do much work, other
than block, it charges all those resources to the user thread instead of
explicitly to instrumentation code, which may or may not matter, but is
always relevant to the general issue.

One of the key purposes of utrace is to make it easier not to screw up
in all these ways.

Look, bottom line, you are writing kernel module code, in an enforcement
sense you already have carte blanche. But if you want me to change my
definition of "well-behaved", it ain't gonna happen.

> Sure, utrace allows multiple facilities to trace the same app, and we
> may see circumstances where multiple users skew each others' results by
> tracing the same app with blocking handlers. That doesn't seem like a
> very good reason for preventing utrace handlers from doing useful stuff
> that can't be accomplished without blocking.

This misses most of the reasons well-behaved callbacks are important.
(For the executive summary, "breaks SIGKILL" is all anyone really needs
to know. Any debugging facility on Linux is broken by definition if it
can ever prevent the timely completion of death by SIGKILL.)

It also gives the impression that you think this API discipline for
kernel modules in some way constrains what control you can exercise over
user threads. That is not so (except you can't break SIGKILL).

The utrace callbacks are the moral equivalent of interrupt handlers,
ones that run with all other interrupts disabled. The handler utterly
monopolizes the resource in question until it returns, in this case the
thread rather than the CPU. They are not literally interrupt handlers,
and by design run in a context that is as "safe" as it gets in kernel
mode. But by analogy, considering the potential to break the system's
behavior overall, they are in that vein.

Think of writing a tracing engine as like writing a device driver. The
devices you manipulate are user threads. You can put the device into a
"frozen" state so it doesn't run off changing its state, but you don't
do it by blocking in the interrupt handler.

If instead of utrace callbacks, I had given you only a queued-event
interface via something like an in-kernel socket, with "keep the thread
quiescent until I dequeue this event and reply" being the only
"immediate action" option, I suspect you would not be complaining. In
either case, you have to structure your code in the same way.
The main control flow of *your* logic takes place in a context *you*
provide, either the calling thread from the userland debugger doing some
control operation, or a kernel service thread you create. In this case,
you also have to write the utrace callbacks that queue events or post
wakeups for your main control code, as in a device interrupt handler.

utrace is the *lowest layer*, it's not *the* interface. Perhaps most
uses want something somewhere on the spectrum closer to the queued-event
model, i.e. an interface giving a small set of things to do immediately
on an event (that can be implemented in well-behaved ways) and making
higher-level synchronization easy with your main control interface and
with simple things like quiescing multiple threads. (I intend to
produce a higher layer doing those things, but that is not the utrace
layer.)

My analogy to interrupt handlers is slightly dramatic. In fact, you can
freely do all sorts of things in utrace callbacks. It's fine to do
small blocks for memory allocation, even paging to access user memory,
maybe even some i/o if you're sure it's quick. Anything that the thread
could do itself from user mode, or in kernel code that is
"instantaneous" (or at least uninterruptible) from the user-mode
perspective. You can wait for a mutex/sem for some subsystem data or
some lock of your own. If you have a locking bug and block forever
there, it's not the end of the world. (It will break SIGKILL, and make
the system take forever trying to do a graceful shutdown when you try to
reboot.) Just no blocking until Tuesday, nor anything that is not
close-by in your direct control with straightforward reasons to believe
it won't ever block arbitrarily. Any kind of blocking as an explicit
synchronization mechanism for the thread itself does not qualify.

These are called rules for well-behaved tracing engines. They are there
to help you mitigate the impact of bugs you can write, be kind to the
environment, play well with others, etc.
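
As a rough illustration of the structure described above, here is a
user-space sketch in Python. It is not utrace code and uses no kernel
API: TracedThread, Engine, QUIESCE and run_demo are hypothetical names
made up for this example. The callback only queues the event and asks
for a stop; the waiting happens in a control thread the engine owns, and
the simulated core (not the callback) parks the traced thread.

```python
import queue
import threading

QUIESCE = "quiesce"   # hypothetical action code, not the real utrace constant


class TracedThread:
    """Stand-in for a traced user thread (hypothetical, user-space only)."""

    def __init__(self, name):
        self.name = name
        self.state = "running"
        self._resume = threading.Event()

    def report(self, event, callback):
        # The simulated core calls the engine's callback, which must
        # return promptly.
        action = callback(self, event)
        if action == QUIESCE:
            # The core, not the callback, parks the thread in a visible stop.
            self.state = "stopped"
            self._resume.wait()
            self._resume.clear()
            self.state = "running"

    def resume(self):
        self._resume.set()


class Engine:
    """Tracing engine: a short callback plus a control loop it owns."""

    def __init__(self):
        self.events = queue.Queue()
        self.log = []

    def callback(self, thread, event):
        # Like an interrupt handler: queue the event, request a stop, return.
        self.events.put((thread, event))
        return QUIESCE

    def control_loop(self):
        # Runs in the engine's own thread, so it may wait as long as it likes.
        while True:
            item = self.events.get()
            if item is None:          # sentinel: shut down
                return
            thread, event = item
            self.log.append((thread.name, event))
            thread.resume()           # let the traced thread continue


def run_demo():
    engine = Engine()
    controller = threading.Thread(target=engine.control_loop)
    controller.start()

    target = TracedThread("worker")

    def body():
        for event in ("exec", "syscall", "exit"):
            target.report(event, engine.callback)

    worker = threading.Thread(target=body)
    worker.start()
    worker.join()

    engine.events.put(None)
    controller.join()
    return engine.log, target.state
```

Calling run_demo() returns the list of (thread name, event) pairs the
control loop handled and the traced thread's final state. The control
loop could just as well hold the thread stopped all weekend before
calling resume(); the callback would still have returned at once.
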
If you break the rules in a prototype, no one will send you to the
stockade.

Please help me with suggestions on how Documentation/utrace.txt can
present this issue better. It seems it's managing to communicate
"Roland doesn't want you to do it", but not why it's good for you, and
for motherhood and apple pie, national security, and global child
welfare, to refrain.

Thanks,
Roland