On Tue, 22 Mar 2022 15:35:54 +0100 Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote: > On Tue, Mar 22, 2022 at 09:12:42AM -0400, Steven Rostedt wrote: > > > > Suppose: > > > > > > notrace func_B() > > > { > > > ... > > > } > > > > > > func_A() > > > { > > > ... > > > return func_B(); > > > } > > > > > > then inhibiting tail calls would end up looking like: > > > > If we inhibit tail calls, then we do not need to make func_B notrace. > > Dude, you're arguing in circles :-( the notrace was a given. Why is it notrace? Sorry to be dense, but it's not a given to me. > > > > func_A: > > > call __fentry__ > > > ... > > > call func_B > > > call __fexit__ > > > ret > > > > > > Then A is fully traced, B is invisible, as per spec. What is the > > > problem? > > > > The above is fine, but then func_B is not a tail call and can also be > > traced. > > Again, B is notrace as a given. This was all about how to deal with > notrace functions. No, it was about how to deal with notrace functions that were tail calls. That was what I had an issue with. Because the solution you proposed had func_A depend on func_B executing its call __fexit__ which it would not, if it was not traced. > > I suggested inhibiting tail-call to notrace, you said no. You now seem to > agree that solves it. I said inhibiting tail-calls was a solution, but only inhibiting it to notrace would probably have a significant performance impact. I thought you were talking about adding notrace to tail calls, not the other way around. Maybe that is our confusion in this conversation. > > > > The problem you initially had, of doing a tail-call into a notrace, was > > > that the __fexit__ call went missing, because notrace will obviously not > > > have that. But that's avoided by inhibiting all tail-calls between > > > notrace and !notrace functions (note that notrace must also not > > > tail-call !notrace). > > > > I'm confused by the above. Why can't a notrace tail call a !notrace? > > If we tail call to a > > > > func_B: > > call __fentry__ > > ... > > call __fexit__ > > ret > > > > then the fentry and fexit show a perfectly valid trace of func_B. > > Bah; I thought I had a case this morning, but now I can't seem to recall > :/ That happens to all of us ;-) > > > > Your worry seems to stem about loosing visiblilty of !notrace functions, > > > but AFAICT that doesn't happen. > > > > My worry is: > > > > func_A: > > call __fentry__ > > ... > > jmp func_B > > > > Where do we do the call __fexit__ ? > > In B (or wherever if B again does a tail-call). But there's no guarantee that the call __fexit__ will not be a nop in func_B. Remember, these are all turned on and off. If we just trace func_A and not func_B, we will never have a call __fexit__ for func_A. > > > That was the original concern, and I think the proposed solutions have > > convoluted our thoughts about what we are trying to fix. So let's go back > > to the beginning, and see how to deal with it. > > > > That is, we have: > > > > func_C: > > call __fenty__ > > ... > > call func_A: > > ... > > call func_B: > > ... > > call __fexit__ > > ret > > > > func_A: > > call __fentry__ > > ... > call __ftail__ > > jmp func_B > > > > func_B: > > call __fentry__ > > ... > > call __fexit__ > > ret > > > > Where the above is C calling A and B as normal functions, A calling B as a > > tail call and B just being a normal function called by both A and C (and > > many other functions). > > We need the __ftail__ thing to mark the trace-stack entry of func_A as > complete, then any future __fexit__ will be able to pop all completed > entries. > > In recap: > > __fentry__ -- push on trace-stack > __ftail__ -- mark top-most entry complete > __fexit__ -- mark top-most entry complete; > pop all completed entries Again, this would require that the tail-calls are also being traced. > > inhibit tail-calls to notrace. Just inhibiting tail-calls to notrace would work without any of the above. But my fear is that will cause a noticeable performance impact. > > > And note, I do not want to limit function tracing (which does not rely on > > __fexit__) just because we can't figure out how to handle __fexit__. > > I'm not following. Regular function tracing needs none of this. The regular function tracing does not need this. Only function graph tracing. I was thinking you were *adding* notrace to tail calls and such, which is what I was against. But apparently that is not what you were saying. > > It's function graph tracing, kretprobes and whatever else this rethook > stuff is about that needs this because return trampolines will stop > working somewhere in the not too distant future. Another crazy solution is to have: func_A: call __fentry__ ... tail: jmp 1f call 1f call __fexit__ ret 1: jmp func_B where the compiler tells us about "tail:" and that we know that func_A ends with a tail call, and if we want to trace the end of func_A we convert that jmp 1f into a nop. And then we call the func_B and it's return comes back to where we call __fexit__ and then return normally. -- Steve