On Sat, 3 Feb 2018 17:04:14 +0000 (UTC)
Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx> wrote:

> The approach proposed here will introduce an expectation that internal
> function signatures never change in the kernel, else it would break user-space
> tools hooking on those functions.

I had this exact discussion with Linus. Linus, please correct me if I'm
wrong. This is a case where he said that if someone expected a function
to be there, then too bad. Functions can come and go depending on
whether gcc inlines them or not.

We already have this interface today. It's the function tracer. One
could argue that a tool requires a function to exist because it depends
on that function being accessible to the function tracer.

>
> The instrumentation infrastructure provided by this patchset might be useful
> for "one off" scripts, but it does not address the "stable instrumentation"
> expectations issue.

Actually, it could work for adding a tracepoint.

>
> The problem today is caused by widely used trace analysis tools that cannot
> cope with changes to the kernel instrumentation, do not report the
> instrumentation mismatch compared to their expectations, and we generally
> don't expect users to ever update those tools to deal with newer kernels. Having
> those tools hook on function names/arguments will not make this magically go
> away. As soon as kernel code changes, widely used trace analysis tools will
> start breaking left and right, and we will be back to square one. Only this time,
> it's the internal function signature which will have become an ABI.

Those who were asking about having "trace markers" (i.e. Facebook) told
us they can cope with kernel changes. If a user can't cope with the
changes, then they need to have their own custom kernels.

>
> A possible solution to this problem appears if we start considering trace
> analysis tools as just that: "tooling", with the following properties:
>
> 1) Tools need to validate that the instrumentation provided matches their
>    expectations. This can be done by checking event/field names and/or version.
>    Tools that fail to do that should be fixed.
>
> 2) Tools need to report to the user when the instrumentation does not match
>    their expectations, and hint users to upgrade in order to deal with change.
>
> 3) Tools need to be backward compatible with respect to instrumentation: a
>    user switching between older and newer kernels should not have to keep
>    various copies of their tooling stack (graphical UI, analysis scripts,
>    and so on).
>
> If our goal is really to address this "stable instrumentation" issue, I don't
> think hooking on functions helps in any way. I hope we can work on defining
> instrumentation interface rules in order to deal with the fundamental problem
> of requiring tooling to adapt to kernel changes.

I think you may have mistaken my goal. It was not to establish stable
instrumentation. In fact, it was the exact opposite. It was a way to
avoid stable instrumentation but still be able to add trace events.

The issue is that people are afraid to add tracepoints to their
subsystem because they fear the tracepoints will become stable and
limit their own development. This hurts those who want to trace these
subsystems and are perfectly fine with the tracepoints going away, even
though they would then need to change their tools.

This change set was to help those who can customize their tools for new
kernels. It was not to help those who just want their tools to work
with all kernels.

With that said, this actually can help those who want stable
infrastructure as well.
If there happens to be a function that is constantly used to create a
dynamic function based event, that usage can be shown to the subsystem
maintainer when asking for a static tracepoint to be added there, as
evidence that it is very useful to have.

One problem we are having today is that too many trace events are being
created, and a lot of them have been used once and never used again.
People don't care about them. I want to slow down the addition of trace
events if these function events can be used instead. And when they are
not good enough, or we see that one is constantly being added, then we
will know that we can add a trace event that would be useful in the
future.

-- Steve