On Wed, 23 Mar 2022 11:23:23 +0900
Masami Hiramatsu <mhiramat@xxxxxxxxxx> wrote:

> I see the __fexit__ is needed, but why __ftail__ is needed? I guess because
> func_B is notrace, in that case the __fexit__ will not be in the func_B.
> Am I correct?

I believe Peter and I agreed that the "best" solution so far, that has the
least amount of regressions (doesn't remove anything currently being
function graph traced, nor removes current tail calls) is:

> At that point giving us something like:
>
> 1:
> 	pushsection __ftail_loc
> 	.long	1b - .
> 	popsection
>
> 	jmp.d32	func_B
> 	call	__fexit__
> 	ret

Functions with a tail call will not have a __fexit__ and we can not rely on
the function that is the tail call to do the __fexit__ for the parent
function. Thus, the compromise is to add a label where the jmp to the
tail-call function is, and when we want to trace the return of that
function, we first have to patch the jmp into a call, which will then return
back to the call to __fexit__.

-- Steve
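
To illustrate the jmp-to-call patching described above, here is a minimal
user-space sketch in C. It is only a model of the control flow, not kernel
code: the names tail_op, record, fexit_hook and run_site are hypothetical,
and the "patching" is modeled as a parameter rather than a rewrite of the
instruction at the __ftail_loc site.

```c
#include <string.h>

/* Hypothetical model; none of these names exist in the kernel. */
enum tail_op { TAIL_JMP, TAIL_CALL };

static char events[64];

/* Append one event name to the log, truncating if the buffer is full. */
static void record(const char *ev)
{
	strncat(events, ev, sizeof(events) - strlen(events) - 1);
}

static void func_B(void) { record("B "); }

/* Stands in for __fexit__ being run on behalf of func_A. */
static void fexit_hook(void) { record("fexit "); }

/* Models the tail of func_A: "jmp func_B; call __fexit__; ret". */
static void func_A(enum tail_op op)
{
	record("A ");
	func_B();
	if (op == TAIL_JMP)
		return;		/* jmp: B returns to A's caller, hook is skipped */
	fexit_hook();		/* call: B returns here, so the hook runs */
}

/* Run func_A with the given opcode at the site and return the event log. */
static const char *run_site(enum tail_op op)
{
	events[0] = '\0';
	func_A(op);
	return events;
}
```

With TAIL_JMP the hook never runs, because func_B returns straight to
func_A's caller. With TAIL_CALL it runs after func_B returns, which is the
effect the patching is meant to have.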